gjson pathing for stem paths

2026-03-13 23:35:37 -04:00
parent d6deaa0b0f
commit 290464adc1
6 changed files with 2576 additions and 139 deletions


@@ -101,22 +101,24 @@ The Queryer transforms Postgres into a pre-compiled Semantic Query Engine via th
* **Array Inclusion**: `{"$in": [values]}`, `{"$nin": [values]}` use native `jsonb_array_elements_text()` bindings to enforce `IN` and `NOT IN` logic without runtime SQL injection risks. * **Array Inclusion**: `{"$in": [values]}`, `{"$nin": [values]}` use native `jsonb_array_elements_text()` bindings to enforce `IN` and `NOT IN` logic without runtime SQL injection risks.
* **Text Matching (ILIKE)**: Evaluates `$eq` or `$ne` against string fields containing the `%` character natively into Postgres `ILIKE` and `NOT ILIKE` partial substring matches. * **Text Matching (ILIKE)**: Evaluates `$eq` or `$ne` against string fields containing the `%` character natively into Postgres `ILIKE` and `NOT ILIKE` partial substring matches.
* **Type Casting**: Safely resolves dynamic combinations by casting values instantly into the physical database types mapped in the schema (e.g. parsing `uuid` bindings to `::uuid`, formatting DateTimes to `::timestamptz`, and numbers to `::numeric`). * **Type Casting**: Safely resolves dynamic combinations by casting values instantly into the physical database types mapped in the schema (e.g. parsing `uuid` bindings to `::uuid`, formatting DateTimes to `::timestamptz`, and numbers to `::numeric`).
### 4. The Stem Engine
### The Stem Engine
Rather than over-fetching heavy Entity payloads and trimming them, Punc Framework Websockets depend on isolated subgraphs defined as **Stems**. Rather than over-fetching heavy Entity payloads and trimming them, Punc Framework Websockets depend on isolated subgraphs defined as **Stems**.
A `Stem` is **not a JSON Pointer** or a physical path string (like `/properties/contacts/items/phone_number`). It is simply a declaration of an **Entity Type boundary** that exists somewhere within the compiled JSON Schema graph. A `Stem` is a declaration of an **Entity Type boundary** that exists somewhere within the compiled JSON Schema graph, expressed using **`gjson` multipath syntax** (e.g., `contacts.#.phone_numbers.#`).
Because `pg_notify` (Beats) fire rigidly from physical Postgres tables (e.g. `{"type": "phone_number"}`), the Go Framework only ever needs to know: "Does the schema `with_contacts.person` contain the `phone_number` Entity anywhere inside its tree?" Because `pg_notify` (Beats) fire rigidly from physical Postgres tables (e.g. `{"type": "phone_number"}`), the Go Framework only ever needs to know: "Does the schema `with_contacts.person` contain the `phone_number` Entity anywhere inside its tree, and if so, what is the gjson path to iterate its payload?"
* **Initialization:** During startup (`jspg_stems()`), the database crawls all Schemas and maps out every physical Entity Type it references. It builds a flat dictionary of `Schema ID -> [Entity Types]` (e.g. `with_contacts.person -> ["person", "contact", "phone_number", "email_address"]`). * **Initialization:** During startup (`jspg_stems()`), the database crawls all Schemas and maps out every physical Entity Type it references. It builds a highly optimized `HashMap<String, HashMap<String, Arc<Stem>>>` providing strictly `O(1)` memory lookups mapping `Schema ID -> { Stem Path -> Entity Type }`.
* **Identifier Prioritization**: When determining if a nested object boundary is an Entity, JSPG natively prioritizes defined `$id` tags over `$ref` inheritance pointers to prevent polymorphic boundaries from devolving into their generic base classes. * **GJSON Pathing:** Unlike standard JSON Pointers, stems utilize `.#` array iterator syntax. The Go web server consumes this native path (e.g. `lines.#`) across the raw Postgres JSON byte payload, extracting all active UUIDs in one massive sub-millisecond sweep without unmarshaling Go ASTs.
* **Cyclical Deduplication**: Because Punc relationships often reference back on themselves via deeply nested classes, the Stem Engine applies intelligent path deduplication. If the active `current_path` already ends with the target entity string, it traverses the inheritance properties without appending the entity to the stem path again, eliminating infinite powerset loops. * **Polymorphic Condition Selectors:** When trailing paths would otherwise collide because of abstract polymorphic type definitions (e.g., a `target` property bounded by a `oneOf` taking either `phone_number` or `email_address`), JSPG natively appends evaluated `gjson` type conditions into the path (e.g. `contacts.#.target#(type=="phone_number")`). This guarantees `O(1)` key uniqueness in the HashMap while retaining extreme array extraction speeds natively without runtime AST evaluation.
* **Relationship Path Squashing:** When calculating nested string paths structurally to discover these boundaries, JSPG intentionally **omits** properties natively named `target` or `source` if they belong to a native database `relationship` table override. This ensures paths like `phone_numbers/contact/target` correctly register their beat resolution pattern as `phone_numbers/contact/phone_number`. * **Identifier Prioritization:** When determining if a nested object boundary is an Entity, JSPG natively prioritizes defined `$id` tags over `$ref` inheritance pointers to prevent polymorphic boundaries from devolving into their generic base classes.
* **Cyclical Deduplication:** Because Punc relationships often reference back on themselves via deeply nested classes, the Stem Engine applies intelligent path deduplication. If the active `current_path` already ends with the target entity string, it traverses the inheritance properties without appending the entity to the stem path again, eliminating infinite powerset loops.
* **Relationship Path Squashing:** When calculating string paths structurally, JSPG intentionally **omits** properties natively named `target` or `source` if they belong to a native database `relationship` table override.
* **The Go Router**: The Golang Punc framework uses this exact mapping to register WebSocket Beat frequencies exclusively on the Entity types discovered. * **The Go Router**: The Golang Punc framework uses this exact mapping to register WebSocket Beat frequencies exclusively on the Entity types discovered.
* **The Queryer Execution**: When the Go framework asks JSPG to hydrate a partial `phone_number` stem for the `with_contacts.person` schema, instead of jumping through string paths, the SQL Compiler simply reaches into the Schema's AST using the `phone_number` Type string, pulls out exactly that entity's mapping rules, and returns a fully correlated `SELECT` block! This natively handles nested array properties injected via `oneOf` or array references efficiently bypassing runtime powerset expansion. * **The Queryer Execution**: When the Go framework asks JSPG to hydrate a partial `phone_number` stem for the `with_contacts.person` schema, instead of jumping through string paths, the SQL Compiler simply reaches into the Schema's AST using the `phone_number` Type string, pulls out exactly that entity's mapping rules, and returns a fully correlated `SELECT` block! This natively handles nested array properties injected via `oneOf` or array references efficiently bypassing runtime powerset expansion.
* **Performance:** These Stem execution structures are fully statically compiled via SPI and map perfectly to `O(1)` real-time routing logic on the application tier. * **Performance:** These Stem execution structures are fully statically compiled via SPI and map perfectly to `O(1)` real-time routing logic on the application tier.
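The path conventions described in the Stem Engine bullets can be sketched in a few lines. This is a minimal illustration and not the JSPG implementation: `Segment` and `stem_path` are hypothetical names, and only three rules are modeled (`.` between properties, `.#` for array items, and a `#(type=="...")` condition for a polymorphic branch). Skipping the condition on an empty path is an assumption made for this sketch.

```rust
/// One step in a walk through a schema (hypothetical type, for illustration only).
enum Segment<'a> {
    /// A named object property, e.g. `contacts`.
    Property(&'a str),
    /// Descent into the items of an array.
    ArrayItems,
    /// A `oneOf` branch resolved to a concrete entity type.
    Polymorphic(&'a str),
}

/// Builds a gjson-style stem path from a sequence of segments.
fn stem_path(segments: &[Segment<'_>]) -> String {
    let mut path = String::new();
    for segment in segments {
        match segment {
            Segment::Property(name) => {
                // Properties are joined with a dot.
                if !path.is_empty() {
                    path.push('.');
                }
                path.push_str(name);
            }
            Segment::ArrayItems => {
                // Array items use the `#` iterator.
                if path.is_empty() {
                    path.push('#');
                } else {
                    path.push_str(".#");
                }
            }
            Segment::Polymorphic(entity) => {
                // Assumption: a root-level branch gets no condition.
                if path.is_empty() {
                    continue;
                }
                // After an array iterator the `#` is already present.
                if path.ends_with(".#") {
                    path.push_str(&format!("(type==\"{}\")", entity));
                } else {
                    path.push_str(&format!("#(type==\"{}\")", entity));
                }
            }
        }
    }
    path
}
```

Feeding it `contacts`, array items, `target`, and a `phone_number` branch yields the same shape as the example path quoted in the Polymorphic Condition Selectors bullet.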
## 5. Testing & Execution Architecture ## 5. Testing & Execution Architecture
JSPG implements a strict separation of concerns to bypass the need to boot a full PostgreSQL cluster for unit and integration testing. Because `pgrx::spi::Spi` directly links to PostgreSQL C-headers, building the library with `cargo test` on macOS natively normally results in fatal `dyld` crashes. JSPG implements a strict separation of concerns to bypass the need to boot a full PostgreSQL cluster for unit and integration testing. Because `pgrx::spi::Spi` directly links to PostgreSQL C-headers, building the library with `cargo test` on macOS natively normally results in fatal `dyld` crashes.


@@ -1148,7 +1148,7 @@
"description": "Full person stem query on phone number contact", "description": "Full person stem query on phone number contact",
"action": "query", "action": "query",
"schema_id": "full.person", "schema_id": "full.person",
"stem": "phone_numbers/contact", "stem": "phone_numbers.#",
"expect": { "expect": {
"success": true, "success": true,
"sql": [ "sql": [
@@ -1172,7 +1172,7 @@
"description": "Full person stem query on phone number contact on phone number", "description": "Full person stem query on phone number contact on phone number",
"action": "query", "action": "query",
"schema_id": "full.person", "schema_id": "full.person",
"stem": "phone_numbers/contact/phone_number", "stem": "phone_numbers.#.target",
"expect": { "expect": {
"success": true, "success": true,
"sql": [ "sql": [
@@ -1195,7 +1195,7 @@
"description": "Full person stem query on contact email address", "description": "Full person stem query on contact email address",
"action": "query", "action": "query",
"schema_id": "full.person", "schema_id": "full.person",
"stem": "contacts/contact/email_address", "stem": "contacts.#.target#(type==\"email_address\")",
"expect": { "expect": {
"success": true, "success": true,
"sql": [ "sql": [


@@ -10,9 +10,13 @@
"type": "relation", "type": "relation",
"constraint": "fk_contact_entity", "constraint": "fk_contact_entity",
"source_type": "contact", "source_type": "contact",
"source_columns": ["entity_id"], "source_columns": [
"entity_id"
],
"destination_type": "person", "destination_type": "person",
"destination_columns": ["id"], "destination_columns": [
"id"
],
"prefix": null "prefix": null
}, },
{ {
@@ -20,88 +24,132 @@
"type": "relation", "type": "relation",
"constraint": "fk_relationship_target", "constraint": "fk_relationship_target",
"source_type": "relationship", "source_type": "relationship",
"source_columns": ["target_id", "target_type"], "source_columns": [
"target_id",
"target_type"
],
"destination_type": "entity", "destination_type": "entity",
"destination_columns": ["id", "type"], "destination_columns": [
"id",
"type"
],
"prefix": "target" "prefix": "target"
} }
], ],
"types": [ "types": [
{ {
"name": "entity", "name": "entity",
"hierarchy": ["entity"], "hierarchy": [
"schemas": [{ "entity"
"$id": "entity", ],
"type": "object", "schemas": [
"properties": {} {
}] "$id": "entity",
"type": "object",
"properties": {}
}
]
}, },
{ {
"name": "person", "name": "person",
"hierarchy": ["person", "entity"], "hierarchy": [
"schemas": [{ "person",
"$id": "person", "entity"
"$ref": "entity", ],
"properties": {} "schemas": [
}] {
"$id": "person",
"$ref": "entity",
"properties": {}
}
]
}, },
{ {
"name": "email_address", "name": "email_address",
"hierarchy": ["email_address", "entity"], "hierarchy": [
"schemas": [{ "email_address",
"$id": "email_address", "entity"
"$ref": "entity", ],
"properties": {} "schemas": [
}] {
"$id": "email_address",
"$ref": "entity",
"properties": {}
}
]
}, },
{ {
"name": "phone_number", "name": "phone_number",
"hierarchy": ["phone_number", "entity"], "hierarchy": [
"schemas": [{ "phone_number",
"$id": "phone_number", "entity"
"$ref": "entity", ],
"properties": {} "schemas": [
}] {
"$id": "phone_number",
"$ref": "entity",
"properties": {}
}
]
}, },
{ {
"name": "relationship", "name": "relationship",
"relationship": true, "relationship": true,
"hierarchy": ["relationship", "entity"], "hierarchy": [
"schemas": [{ "relationship",
"$id": "relationship", "entity"
"$ref": "entity", ],
"properties": {} "schemas": [
}] {
"$id": "relationship",
"$ref": "entity",
"properties": {}
}
]
}, },
{ {
"name": "contact", "name": "contact",
"relationship": true, "relationship": true,
"hierarchy": ["contact", "relationship", "entity"], "hierarchy": [
"schemas": [{ "contact",
"$id": "contact", "relationship",
"$ref": "relationship", "entity"
"properties": { ],
"target": { "schemas": [
"oneOf": [ {
{ "$ref": "phone_number" }, "$id": "contact",
{ "$ref": "email_address" } "$ref": "relationship",
] "properties": {
"target": {
"oneOf": [
{
"$ref": "phone_number"
},
{
"$ref": "email_address"
}
]
}
} }
} }
}] ]
}, },
{ {
"name": "save_person", "name": "save_person",
"schemas": [{ "schemas": [
"$id": "save_person.response", {
"$ref": "person", "$id": "save_person.response",
"properties": { "$ref": "person",
"contacts": { "properties": {
"type": "array", "contacts": {
"items": { "$ref": "contact" } "type": "array",
"items": {
"$ref": "contact"
}
}
} }
} }
}] ]
} }
] ]
}, },
@@ -116,15 +164,15 @@
"": { "": {
"type": "person" "type": "person"
}, },
"contacts/contact": { "contacts.#": {
"type": "contact", "type": "contact",
"relation": "contacts_id" "relation": "contacts_id"
}, },
"contacts/contact/email_address": { "contacts.#.target#(type==\"email_address\")": {
"type": "email_address", "type": "email_address",
"relation": "target_id" "relation": "target_id"
}, },
"contacts/contact/phone_number": { "contacts.#.target#(type==\"phone_number\")": {
"type": "phone_number", "type": "phone_number",
"relation": "target_id" "relation": "target_id"
} }
@@ -133,11 +181,11 @@
"": { "": {
"type": "contact" "type": "contact"
}, },
"email_address": { "target#(type==\"email_address\")": {
"type": "email_address", "type": "email_address",
"relation": "target_id" "relation": "target_id"
}, },
"phone_number": { "target#(type==\"phone_number\")": {
"type": "phone_number", "type": "phone_number",
"relation": "target_id" "relation": "target_id"
} }
@@ -152,7 +200,7 @@
"type": "email_address" "type": "email_address"
} }
}, },
"phone_number": { "phone_number": {
"": { "": {
"type": "phone_number" "type": "phone_number"
} }
@@ -172,4 +220,4 @@
} }
] ]
} }
] ]


@@ -265,12 +265,12 @@ impl Database {
String::from(""), String::from(""),
None, None,
None, None,
true, false,
&mut inner_map, &mut inner_map,
Vec::new(),
&mut errors, &mut errors,
); );
if !inner_map.is_empty() { if !inner_map.is_empty() {
println!("SCHEMA: {} STEMS: {:?}", schema_id, inner_map.keys());
db_stems.insert(schema_id, inner_map); db_stems.insert(schema_id, inner_map);
} }
} }
@@ -288,11 +288,12 @@ impl Database {
db: &Database, db: &Database,
root_schema_id: &str, root_schema_id: &str,
schema: &Schema, schema: &Schema,
mut current_path: String, current_path: String,
parent_type: Option<String>, parent_type: Option<String>,
property_name: Option<String>, property_name: Option<String>,
is_root: bool, is_polymorphic: bool,
inner_map: &mut HashMap<String, Arc<Stem>>, inner_map: &mut HashMap<String, Arc<Stem>>,
seen_entities: Vec<String>,
errors: &mut Vec<crate::drop::Error>, errors: &mut Vec<crate::drop::Error>,
) { ) {
let mut is_entity = false; let mut is_entity = false;
@@ -323,6 +324,12 @@ impl Database {
} }
} }
if is_entity {
if seen_entities.contains(&entity_type) {
return; // Break cyclical schemas!
}
}
let mut relation_col = None; let mut relation_col = None;
if is_entity { if is_entity {
if let (Some(pt), Some(prop)) = (&parent_type, &property_name) { if let (Some(pt), Some(prop)) = (&parent_type, &property_name) {
@@ -344,46 +351,21 @@ impl Database {
} }
} }
let mut final_path = current_path.clone();
if is_polymorphic && !final_path.is_empty() && !final_path.ends_with(&entity_type) {
if final_path.ends_with(".#") {
final_path = format!("{}(type==\"{}\")", final_path, entity_type);
} else {
final_path = format!("{}#(type==\"{}\")", final_path, entity_type);
}
}
let stem = Stem { let stem = Stem {
r#type: entity_type.clone(), r#type: entity_type.clone(),
relation: relation_col, relation: relation_col,
schema: Arc::new(schema.clone()), schema: Arc::new(schema.clone()),
}; };
inner_map.insert(final_path, Arc::new(stem));
let mut branch_path = if is_root {
String::new()
} else if current_path.is_empty() {
entity_type.clone()
} else {
format!("{}/{}", current_path, entity_type)
};
// DEDUPLICATION: If we just recursed into the EXACT same entity type definition,
// do not append again and do not re-register the stem.
let already_registered =
if current_path == entity_type || current_path.ends_with(&format!("/{}", entity_type)) {
branch_path = current_path.clone();
true
} else {
false
};
if !already_registered {
if inner_map.contains_key(&branch_path) {
errors.push(crate::drop::Error {
code: "STEM_COLLISION".to_string(),
message: format!("The stem path `{}` resolves to multiple Entity boundaries. This usually occurs during un-wrapped $family or oneOf polymorphic schemas where multiple Entities are directly assigned to the same property. To fix this, encapsulate the polymorphic branch.", branch_path),
details: crate::drop::ErrorDetails {
path: root_schema_id.to_string(),
},
});
}
inner_map.insert(branch_path.clone(), Arc::new(stem));
}
// Update current_path for structural children
current_path = branch_path;
} }
let next_parent = if is_entity { let next_parent = if is_entity {
@@ -392,34 +374,22 @@ impl Database {
parent_type.clone() parent_type.clone()
}; };
let pass_seen = if is_entity {
let mut ns = seen_entities.clone();
ns.push(entity_type.clone());
ns
} else {
seen_entities.clone()
};
// Properties branch // Properties branch
if let Some(props) = &schema.obj.properties { if let Some(props) = &schema.obj.properties {
for (k, v) in props { for (k, v) in props {
// Bypass target and source properties if we are in a relationship
if let Some(parent_str) = &next_parent {
if let Some(pt) = db.types.get(parent_str) {
if pt.relationship && (k == "target" || k == "source") {
Self::discover_stems(
db,
root_schema_id,
v,
current_path.clone(),
next_parent.clone(),
Some(k.clone()),
false,
inner_map,
errors,
);
continue;
}
}
}
// Standard Property Pathing // Standard Property Pathing
let next_path = if current_path.is_empty() { let next_path = if current_path.is_empty() {
k.clone() k.clone()
} else { } else {
format!("{}/{}", current_path, k) format!("{}.{}", current_path, k)
}; };
Self::discover_stems( Self::discover_stems(
@@ -431,6 +401,7 @@ impl Database {
Some(k.clone()), Some(k.clone()),
false, false,
inner_map, inner_map,
pass_seen.clone(),
errors, errors,
); );
} }
@@ -438,15 +409,22 @@ impl Database {
// Array Item branch // Array Item branch
if let Some(items) = &schema.obj.items { if let Some(items) = &schema.obj.items {
let next_path = if current_path.is_empty() {
String::from("#")
} else {
format!("{}.#", current_path)
};
Self::discover_stems( Self::discover_stems(
db, db,
root_schema_id, root_schema_id,
items, items,
current_path.clone(), next_path,
next_parent.clone(), next_parent.clone(),
property_name.clone(), property_name.clone(),
false, // Arrays themselves aren't polymorphic branches, their items might be false,
inner_map, inner_map,
pass_seen.clone(),
errors, errors,
); );
} }
@@ -463,8 +441,9 @@ impl Database {
current_path.clone(), current_path.clone(),
next_parent.clone(), next_parent.clone(),
property_name.clone(), property_name.clone(),
false, is_polymorphic,
inner_map, inner_map,
seen_entities.clone(),
errors, errors,
); );
} }
@@ -481,8 +460,9 @@ impl Database {
current_path.clone(), current_path.clone(),
next_parent.clone(), next_parent.clone(),
property_name.clone(), property_name.clone(),
false, true,
inner_map, inner_map,
pass_seen.clone(),
errors, errors,
); );
} }
@@ -496,8 +476,9 @@ impl Database {
current_path.clone(), current_path.clone(),
next_parent.clone(), next_parent.clone(),
property_name.clone(), property_name.clone(),
false, is_polymorphic,
inner_map, inner_map,
pass_seen.clone(),
errors, errors,
); );
} }


@@ -112,7 +112,7 @@ pub fn jspg_validate(schema_id: &str, instance: JsonB) -> JsonB {
#[cfg_attr(not(test), pg_extern)] #[cfg_attr(not(test), pg_extern)]
pub fn jspg_stems() -> JsonB { pub fn jspg_stems() -> JsonB {
use serde_json::{Map, Value}; use serde_json::Value;
let engine_opt = { let engine_opt = {
let lock = GLOBAL_JSPG.read().unwrap(); let lock = GLOBAL_JSPG.read().unwrap();
@@ -121,9 +121,24 @@ pub fn jspg_stems() -> JsonB {
match engine_opt { match engine_opt {
Some(engine) => { Some(engine) => {
JsonB(serde_json::to_value(&engine.database.stems).unwrap_or(Value::Object(Map::new()))) let mut result_arr = Vec::new();
for (schema_name, stems_map) in &engine.database.stems {
let mut stems_arr = Vec::new();
for (path_key, stem_arc) in stems_map {
stems_arr.push(serde_json::json!({
"path": path_key,
"type": stem_arc.r#type,
"relation": stem_arc.relation
}));
}
result_arr.push(serde_json::json!({
"schema": schema_name,
"stems": stems_arr
}));
}
JsonB(serde_json::to_value(result_arr).unwrap_or(Value::Array(Vec::new())))
} }
None => JsonB(Value::Object(Map::new())), None => JsonB(Value::Array(Vec::new())),
} }
} }
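With this change `jspg_stems()` returns an array of `{schema, stems: [{path, type, relation}]}` objects instead of a nested map. A caller that wants keyed lookups by schema and path could regroup the rows as sketched below. The structs and the `index_stems` function are hypothetical and use only the standard library; JSON decoding is left out, and the field names simply mirror the keys emitted above.

```rust
use std::collections::HashMap;

/// One element of a schema's `stems` array (hypothetical mirror of the JSON).
#[derive(Debug, Clone, PartialEq)]
struct StemEntry {
    path: String,
    entity_type: String, // JSON key: "type"
    relation: Option<String>,
}

/// One element of the top-level array (hypothetical mirror of the JSON).
struct SchemaStems {
    schema: String,
    stems: Vec<StemEntry>,
}

/// Regroups the array form into `schema -> path -> stem` for keyed lookups.
fn index_stems(rows: Vec<SchemaStems>) -> HashMap<String, HashMap<String, StemEntry>> {
    let mut index = HashMap::new();
    for row in rows {
        let by_path: HashMap<String, StemEntry> = row
            .stems
            .into_iter()
            .map(|stem| (stem.path.clone(), stem))
            .collect();
        index.insert(row.schema, by_path);
    }
    index
}
```

If two stems in one schema shared a path, the later one would silently win in this sketch; a real consumer may prefer to treat that as an error.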

test_output.txt (normal file, 2391 lines changed)

File diff suppressed because it is too large