gjson pathing for stem paths
This commit is contained in:
18
GEMINI.md
18
GEMINI.md
@ -101,22 +101,24 @@ The Queryer transforms Postgres into a pre-compiled Semantic Query Engine via th
|
||||
* **Array Inclusion**: `{"$in": [values]}`, `{"$nin": [values]}` use native `jsonb_array_elements_text()` bindings to enforce `IN` and `NOT IN` logic without runtime SQL injection risks.
|
||||
* **Text Matching (ILIKE)**: Evaluates `$eq` or `$ne` against string fields containing the `%` character natively into Postgres `ILIKE` and `NOT ILIKE` partial substring matches.
|
||||
* **Type Casting**: Safely resolves dynamic combinations by casting values instantly into the physical database types mapped in the schema (e.g. parsing `uuid` bindings to `::uuid`, formatting DateTimes to `::timestamptz`, and numbers to `::numeric`).
|
||||
### 4. The Stem Engine
|
||||
|
||||
### The Stem Engine
|
||||
|
||||
Rather than over-fetching heavy Entity payloads and trimming them, Punc Framework Websockets depend on isolated subgraphs defined as **Stems**.
|
||||
A `Stem` is **not a JSON Pointer** or a physical path string (like `/properties/contacts/items/phone_number`). It is simply a declaration of an **Entity Type boundary** that exists somewhere within the compiled JSON Schema graph.
|
||||
A `Stem` is a declaration of an **Entity Type boundary** that exists somewhere within the compiled JSON Schema graph, expressed using **`gjson` multipath syntax** (e.g., `contacts.#.phone_numbers.#`).
|
||||
|
||||
Because `pg_notify` (Beats) fire rigidly from physical Postgres tables (e.g. `{"type": "phone_number"}`), the Go Framework only ever needs to know: "Does the schema `with_contacts.person` contain the `phone_number` Entity anywhere inside its tree?"
|
||||
Because `pg_notify` (Beats) fire rigidly from physical Postgres tables (e.g. `{"type": "phone_number"}`), the Go Framework only ever needs to know: "Does the schema `with_contacts.person` contain the `phone_number` Entity anywhere inside its tree, and if so, what is the gjson path to iterate its payload?"
|
||||
|
||||
* **Initialization:** During startup (`jspg_stems()`), the database crawls all Schemas and maps out every physical Entity Type it references. It builds a flat dictionary of `Schema ID -> [Entity Types]` (e.g. `with_contacts.person -> ["person", "contact", "phone_number", "email_address"]`).
|
||||
* **Identifier Prioritization**: When determining if a nested object boundary is an Entity, JSPG natively prioritizes defined `$id` tags over `$ref` inheritance pointers to prevent polymorphic boundaries from devolving into their generic base classes.
|
||||
* **Cyclical Deduplication**: Because Punc relationships often reference back on themselves via deeply nested classes, the Stem Engine applies intelligent path deduplication. If the active `current_path` already ends with the target entity string, it traverses the inheritance properties without appending the entity to the stem path again, eliminating infinite powerset loops.
|
||||
* **Relationship Path Squashing:** When calculating nested string paths structurally to discover these boundaries, JSPG intentionally **omits** properties natively named `target` or `source` if they belong to a native database `relationship` table override. This ensures paths like `phone_numbers/contact/target` correctly register their beat resolution pattern as `phone_numbers/contact/phone_number`.
|
||||
* **Initialization:** During startup (`jspg_stems()`), the database crawls all Schemas and maps out every physical Entity Type it references. It builds a highly optimized `HashMap<String, HashMap<String, Arc<Stem>>>` providing strictly `O(1)` memory lookups mapping `Schema ID -> { Stem Path -> Entity Type }`.
|
||||
* **GJSON Pathing:** Unlike standard JSON Pointers, stems utilize `.#` array iterator syntax. The Go web server consumes this native path (e.g. `lines.#`) across the raw Postgres JSON byte payload, extracting all active UUIDs in one massive sub-millisecond sweep without unmarshaling Go ASTs.
|
||||
* **Polymorphic Condition Selectors:** When trailing paths would otherwise collide because of abstract polymorphic type definitions (e.g., a `target` property bounded by a `oneOf` taking either `phone_number` or `email_address`), JSPG natively appends evaluated `gjson` type conditions into the path (e.g. `contacts.#.target#(type=="phone_number")`). This guarantees `O(1)` key uniqueness in the HashMap while retaining extreme array extraction speeds natively without runtime AST evaluation.
|
||||
* **Identifier Prioritization:** When determining if a nested object boundary is an Entity, JSPG natively prioritizes defined `$id` tags over `$ref` inheritance pointers to prevent polymorphic boundaries from devolving into their generic base classes.
|
||||
* **Cyclical Deduplication:** Because Punc relationships often reference back on themselves via deeply nested classes, the Stem Engine applies intelligent path deduplication. If the active `current_path` already ends with the target entity string, it traverses the inheritance properties without appending the entity to the stem path again, eliminating infinite powerset loops.
|
||||
* **Relationship Path Squashing:** When calculating string paths structurally, JSPG intentionally **omits** properties natively named `target` or `source` if they belong to a native database `relationship` table override.
|
||||
* **The Go Router**: The Golang Punc framework uses this exact mapping to register WebSocket Beat frequencies exclusively on the Entity types discovered.
|
||||
* **The Queryer Execution**: When the Go framework asks JSPG to hydrate a partial `phone_number` stem for the `with_contacts.person` schema, instead of jumping through string paths, the SQL Compiler simply reaches into the Schema's AST using the `phone_number` Type string, pulls out exactly that entity's mapping rules, and returns a fully correlated `SELECT` block! This natively handles nested array properties injected via `oneOf` or array references efficiently bypassing runtime powerset expansion.
|
||||
* **Performance:** These Stem execution structures are fully statically compiled via SPI and map perfectly to `O(1)` real-time routing logic on the application tier.
|
||||
|
||||
|
||||
## 5. Testing & Execution Architecture
|
||||
|
||||
JSPG implements a strict separation of concerns to bypass the need to boot a full PostgreSQL cluster for unit and integration testing. Because `pgrx::spi::Spi` directly links to PostgreSQL C-headers, building the library with `cargo test` on macOS natively normally results in fatal `dyld` crashes.
|
||||
|
||||
@ -1148,7 +1148,7 @@
|
||||
"description": "Full person stem query on phone number contact",
|
||||
"action": "query",
|
||||
"schema_id": "full.person",
|
||||
"stem": "phone_numbers/contact",
|
||||
"stem": "phone_numbers.#",
|
||||
"expect": {
|
||||
"success": true,
|
||||
"sql": [
|
||||
@ -1172,7 +1172,7 @@
|
||||
"description": "Full person stem query on phone number contact on phone number",
|
||||
"action": "query",
|
||||
"schema_id": "full.person",
|
||||
"stem": "phone_numbers/contact/phone_number",
|
||||
"stem": "phone_numbers.#.target",
|
||||
"expect": {
|
||||
"success": true,
|
||||
"sql": [
|
||||
@ -1195,7 +1195,7 @@
|
||||
"description": "Full person stem query on contact email address",
|
||||
"action": "query",
|
||||
"schema_id": "full.person",
|
||||
"stem": "contacts/contact/email_address",
|
||||
"stem": "contacts.#.target#(type==\"email_address\")",
|
||||
"expect": {
|
||||
"success": true,
|
||||
"sql": [
|
||||
|
||||
@ -10,9 +10,13 @@
|
||||
"type": "relation",
|
||||
"constraint": "fk_contact_entity",
|
||||
"source_type": "contact",
|
||||
"source_columns": ["entity_id"],
|
||||
"source_columns": [
|
||||
"entity_id"
|
||||
],
|
||||
"destination_type": "person",
|
||||
"destination_columns": ["id"],
|
||||
"destination_columns": [
|
||||
"id"
|
||||
],
|
||||
"prefix": null
|
||||
},
|
||||
{
|
||||
@ -20,88 +24,132 @@
|
||||
"type": "relation",
|
||||
"constraint": "fk_relationship_target",
|
||||
"source_type": "relationship",
|
||||
"source_columns": ["target_id", "target_type"],
|
||||
"source_columns": [
|
||||
"target_id",
|
||||
"target_type"
|
||||
],
|
||||
"destination_type": "entity",
|
||||
"destination_columns": ["id", "type"],
|
||||
"destination_columns": [
|
||||
"id",
|
||||
"type"
|
||||
],
|
||||
"prefix": "target"
|
||||
}
|
||||
],
|
||||
"types": [
|
||||
{
|
||||
"name": "entity",
|
||||
"hierarchy": ["entity"],
|
||||
"schemas": [{
|
||||
"$id": "entity",
|
||||
"type": "object",
|
||||
"properties": {}
|
||||
}]
|
||||
"hierarchy": [
|
||||
"entity"
|
||||
],
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "entity",
|
||||
"type": "object",
|
||||
"properties": {}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "person",
|
||||
"hierarchy": ["person", "entity"],
|
||||
"schemas": [{
|
||||
"$id": "person",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}]
|
||||
"hierarchy": [
|
||||
"person",
|
||||
"entity"
|
||||
],
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "person",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "email_address",
|
||||
"hierarchy": ["email_address", "entity"],
|
||||
"schemas": [{
|
||||
"$id": "email_address",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}]
|
||||
"hierarchy": [
|
||||
"email_address",
|
||||
"entity"
|
||||
],
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "email_address",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "phone_number",
|
||||
"hierarchy": ["phone_number", "entity"],
|
||||
"schemas": [{
|
||||
"$id": "phone_number",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}]
|
||||
"hierarchy": [
|
||||
"phone_number",
|
||||
"entity"
|
||||
],
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "phone_number",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "relationship",
|
||||
"relationship": true,
|
||||
"hierarchy": ["relationship", "entity"],
|
||||
"schemas": [{
|
||||
"$id": "relationship",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}]
|
||||
"hierarchy": [
|
||||
"relationship",
|
||||
"entity"
|
||||
],
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "relationship",
|
||||
"$ref": "entity",
|
||||
"properties": {}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "contact",
|
||||
"relationship": true,
|
||||
"hierarchy": ["contact", "relationship", "entity"],
|
||||
"schemas": [{
|
||||
"$id": "contact",
|
||||
"$ref": "relationship",
|
||||
"properties": {
|
||||
"target": {
|
||||
"oneOf": [
|
||||
{ "$ref": "phone_number" },
|
||||
{ "$ref": "email_address" }
|
||||
]
|
||||
"hierarchy": [
|
||||
"contact",
|
||||
"relationship",
|
||||
"entity"
|
||||
],
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "contact",
|
||||
"$ref": "relationship",
|
||||
"properties": {
|
||||
"target": {
|
||||
"oneOf": [
|
||||
{
|
||||
"$ref": "phone_number"
|
||||
},
|
||||
{
|
||||
"$ref": "email_address"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}]
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "save_person",
|
||||
"schemas": [{
|
||||
"$id": "save_person.response",
|
||||
"$ref": "person",
|
||||
"properties": {
|
||||
"contacts": {
|
||||
"type": "array",
|
||||
"items": { "$ref": "contact" }
|
||||
"schemas": [
|
||||
{
|
||||
"$id": "save_person.response",
|
||||
"$ref": "person",
|
||||
"properties": {
|
||||
"contacts": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"$ref": "contact"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}]
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -116,15 +164,15 @@
|
||||
"": {
|
||||
"type": "person"
|
||||
},
|
||||
"contacts/contact": {
|
||||
"contacts.#": {
|
||||
"type": "contact",
|
||||
"relation": "contacts_id"
|
||||
},
|
||||
"contacts/contact/email_address": {
|
||||
"contacts.#.target#(type==\"email_address\")": {
|
||||
"type": "email_address",
|
||||
"relation": "target_id"
|
||||
},
|
||||
"contacts/contact/phone_number": {
|
||||
"contacts.#.target#(type==\"phone_number\")": {
|
||||
"type": "phone_number",
|
||||
"relation": "target_id"
|
||||
}
|
||||
@ -133,11 +181,11 @@
|
||||
"": {
|
||||
"type": "contact"
|
||||
},
|
||||
"email_address": {
|
||||
"target#(type==\"email_address\")": {
|
||||
"type": "email_address",
|
||||
"relation": "target_id"
|
||||
},
|
||||
"phone_number": {
|
||||
"target#(type==\"phone_number\")": {
|
||||
"type": "phone_number",
|
||||
"relation": "target_id"
|
||||
}
|
||||
@ -152,7 +200,7 @@
|
||||
"type": "email_address"
|
||||
}
|
||||
},
|
||||
"phone_number": {
|
||||
"phone_number": {
|
||||
"": {
|
||||
"type": "phone_number"
|
||||
}
|
||||
|
||||
@ -265,12 +265,12 @@ impl Database {
|
||||
String::from(""),
|
||||
None,
|
||||
None,
|
||||
true,
|
||||
false,
|
||||
&mut inner_map,
|
||||
Vec::new(),
|
||||
&mut errors,
|
||||
);
|
||||
if !inner_map.is_empty() {
|
||||
println!("SCHEMA: {} STEMS: {:?}", schema_id, inner_map.keys());
|
||||
db_stems.insert(schema_id, inner_map);
|
||||
}
|
||||
}
|
||||
@ -288,11 +288,12 @@ impl Database {
|
||||
db: &Database,
|
||||
root_schema_id: &str,
|
||||
schema: &Schema,
|
||||
mut current_path: String,
|
||||
current_path: String,
|
||||
parent_type: Option<String>,
|
||||
property_name: Option<String>,
|
||||
is_root: bool,
|
||||
is_polymorphic: bool,
|
||||
inner_map: &mut HashMap<String, Arc<Stem>>,
|
||||
seen_entities: Vec<String>,
|
||||
errors: &mut Vec<crate::drop::Error>,
|
||||
) {
|
||||
let mut is_entity = false;
|
||||
@ -323,6 +324,12 @@ impl Database {
|
||||
}
|
||||
}
|
||||
|
||||
if is_entity {
|
||||
if seen_entities.contains(&entity_type) {
|
||||
return; // Break cyclical schemas!
|
||||
}
|
||||
}
|
||||
|
||||
let mut relation_col = None;
|
||||
if is_entity {
|
||||
if let (Some(pt), Some(prop)) = (&parent_type, &property_name) {
|
||||
@ -344,46 +351,21 @@ impl Database {
|
||||
}
|
||||
}
|
||||
|
||||
let mut final_path = current_path.clone();
|
||||
if is_polymorphic && !final_path.is_empty() && !final_path.ends_with(&entity_type) {
|
||||
if final_path.ends_with(".#") {
|
||||
final_path = format!("{}(type==\"{}\")", final_path, entity_type);
|
||||
} else {
|
||||
final_path = format!("{}#(type==\"{}\")", final_path, entity_type);
|
||||
}
|
||||
}
|
||||
|
||||
let stem = Stem {
|
||||
r#type: entity_type.clone(),
|
||||
relation: relation_col,
|
||||
schema: Arc::new(schema.clone()),
|
||||
};
|
||||
|
||||
let mut branch_path = if is_root {
|
||||
String::new()
|
||||
} else if current_path.is_empty() {
|
||||
entity_type.clone()
|
||||
} else {
|
||||
format!("{}/{}", current_path, entity_type)
|
||||
};
|
||||
|
||||
// DEDUPLICATION: If we just recursed into the EXACT same entity type definition,
|
||||
// do not append again and do not re-register the stem.
|
||||
let already_registered =
|
||||
if current_path == entity_type || current_path.ends_with(&format!("/{}", entity_type)) {
|
||||
branch_path = current_path.clone();
|
||||
true
|
||||
} else {
|
||||
false
|
||||
};
|
||||
|
||||
if !already_registered {
|
||||
if inner_map.contains_key(&branch_path) {
|
||||
errors.push(crate::drop::Error {
|
||||
code: "STEM_COLLISION".to_string(),
|
||||
message: format!("The stem path `{}` resolves to multiple Entity boundaries. This usually occurs during un-wrapped $family or oneOf polymorphic schemas where multiple Entities are directly assigned to the same property. To fix this, encapsulate the polymorphic branch.", branch_path),
|
||||
details: crate::drop::ErrorDetails {
|
||||
path: root_schema_id.to_string(),
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
inner_map.insert(branch_path.clone(), Arc::new(stem));
|
||||
}
|
||||
|
||||
// Update current_path for structural children
|
||||
current_path = branch_path;
|
||||
inner_map.insert(final_path, Arc::new(stem));
|
||||
}
|
||||
|
||||
let next_parent = if is_entity {
|
||||
@ -392,34 +374,22 @@ impl Database {
|
||||
parent_type.clone()
|
||||
};
|
||||
|
||||
let pass_seen = if is_entity {
|
||||
let mut ns = seen_entities.clone();
|
||||
ns.push(entity_type.clone());
|
||||
ns
|
||||
} else {
|
||||
seen_entities.clone()
|
||||
};
|
||||
|
||||
// Properties branch
|
||||
if let Some(props) = &schema.obj.properties {
|
||||
for (k, v) in props {
|
||||
// Bypass target and source properties if we are in a relationship
|
||||
if let Some(parent_str) = &next_parent {
|
||||
if let Some(pt) = db.types.get(parent_str) {
|
||||
if pt.relationship && (k == "target" || k == "source") {
|
||||
Self::discover_stems(
|
||||
db,
|
||||
root_schema_id,
|
||||
v,
|
||||
current_path.clone(),
|
||||
next_parent.clone(),
|
||||
Some(k.clone()),
|
||||
false,
|
||||
inner_map,
|
||||
errors,
|
||||
);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Standard Property Pathing
|
||||
let next_path = if current_path.is_empty() {
|
||||
k.clone()
|
||||
} else {
|
||||
format!("{}/{}", current_path, k)
|
||||
format!("{}.{}", current_path, k)
|
||||
};
|
||||
|
||||
Self::discover_stems(
|
||||
@ -431,6 +401,7 @@ impl Database {
|
||||
Some(k.clone()),
|
||||
false,
|
||||
inner_map,
|
||||
pass_seen.clone(),
|
||||
errors,
|
||||
);
|
||||
}
|
||||
@ -438,15 +409,22 @@ impl Database {
|
||||
|
||||
// Array Item branch
|
||||
if let Some(items) = &schema.obj.items {
|
||||
let next_path = if current_path.is_empty() {
|
||||
String::from("#")
|
||||
} else {
|
||||
format!("{}.#", current_path)
|
||||
};
|
||||
|
||||
Self::discover_stems(
|
||||
db,
|
||||
root_schema_id,
|
||||
items,
|
||||
current_path.clone(),
|
||||
next_path,
|
||||
next_parent.clone(),
|
||||
property_name.clone(),
|
||||
false, // Arrays themselves aren't polymorphic branches, their items might be
|
||||
false,
|
||||
inner_map,
|
||||
pass_seen.clone(),
|
||||
errors,
|
||||
);
|
||||
}
|
||||
@ -463,8 +441,9 @@ impl Database {
|
||||
current_path.clone(),
|
||||
next_parent.clone(),
|
||||
property_name.clone(),
|
||||
false,
|
||||
is_polymorphic,
|
||||
inner_map,
|
||||
seen_entities.clone(),
|
||||
errors,
|
||||
);
|
||||
}
|
||||
@ -481,8 +460,9 @@ impl Database {
|
||||
current_path.clone(),
|
||||
next_parent.clone(),
|
||||
property_name.clone(),
|
||||
false,
|
||||
true,
|
||||
inner_map,
|
||||
pass_seen.clone(),
|
||||
errors,
|
||||
);
|
||||
}
|
||||
@ -496,8 +476,9 @@ impl Database {
|
||||
current_path.clone(),
|
||||
next_parent.clone(),
|
||||
property_name.clone(),
|
||||
false,
|
||||
is_polymorphic,
|
||||
inner_map,
|
||||
pass_seen.clone(),
|
||||
errors,
|
||||
);
|
||||
}
|
||||
|
||||
21
src/lib.rs
21
src/lib.rs
@ -112,7 +112,7 @@ pub fn jspg_validate(schema_id: &str, instance: JsonB) -> JsonB {
|
||||
|
||||
#[cfg_attr(not(test), pg_extern)]
|
||||
pub fn jspg_stems() -> JsonB {
|
||||
use serde_json::{Map, Value};
|
||||
use serde_json::Value;
|
||||
|
||||
let engine_opt = {
|
||||
let lock = GLOBAL_JSPG.read().unwrap();
|
||||
@ -121,9 +121,24 @@ pub fn jspg_stems() -> JsonB {
|
||||
|
||||
match engine_opt {
|
||||
Some(engine) => {
|
||||
JsonB(serde_json::to_value(&engine.database.stems).unwrap_or(Value::Object(Map::new())))
|
||||
let mut result_arr = Vec::new();
|
||||
for (schema_name, stems_map) in &engine.database.stems {
|
||||
let mut stems_arr = Vec::new();
|
||||
for (path_key, stem_arc) in stems_map {
|
||||
stems_arr.push(serde_json::json!({
|
||||
"path": path_key,
|
||||
"type": stem_arc.r#type,
|
||||
"relation": stem_arc.relation
|
||||
}));
|
||||
}
|
||||
result_arr.push(serde_json::json!({
|
||||
"schema": schema_name,
|
||||
"stems": stems_arr
|
||||
}));
|
||||
}
|
||||
JsonB(serde_json::to_value(result_arr).unwrap_or(Value::Array(Vec::new())))
|
||||
}
|
||||
None => JsonB(Value::Object(Map::new())),
|
||||
None => JsonB(Value::Array(Vec::new())),
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
2391
test_output.txt
Normal file
2391
test_output.txt
Normal file
File diff suppressed because it is too large
Load Diff
Reference in New Issue
Block a user