Compare commits

..

7 Commits

9 changed files with 214 additions and 18 deletions

View File

@ -23,6 +23,12 @@ To support high-throughput operations while allowing for runtime updates (e.g.,
3. **Immutable AST Caching**: The `Validator` struct immutably owns the `Database` registry. Schemas themselves are frozen structurally, but utilize `OnceLock` interior mutability during the Compilation Phase to permanently cache resolved `$ref` inheritances, properties, and `compiled_edges` directly onto their AST nodes. This guarantees strict `O(1)` relationship and property validation execution at runtime without locking or recursive DB polling.
4. **Lock-Free Reads**: Incoming operations acquire a read lock just long enough to clone the `Arc` inside an `RwLock<Option<Arc<Validator>>>`, ensuring zero blocking during schema updates.
### Relational Edge Resolution
When compiling nested object graphs or arrays, the JSPG engine must dynamically infer which Postgres Foreign Key constraint correctly bridges the parent to the nested schema. It utilizes a strict 3-step hierarchical resolution:
1. **Direct Prefix Match**: If an explicitly prefixed Foreign Key (e.g. `fk_invoice_counterparty_entity` -> `prefix: "counterparty"`) matches the exact name of the requested schema property (e.g. `{"counterparty": {...}}`), it is instantly selected.
2. **Base Edge Fallback (1:M)**: If no explicit prefix directly matches the property name, the compiler filters for explicitly one remaining relation with a `null` prefix (e.g. `fk_invoice_line_invoice` -> `prefix: null`). A `null` prefix mathematically denotes the standard structural parent-child ownership edge (bypassing any M:M ambiguity) and is safely picked over explicit (but unmatched) property edges.
3. **Ambiguity Elimination (M:M)**: If multiple explicitly prefixed relations remain (which happens by design in Many-to-Many junction tables like `contact` utilizing `fk_relationship_source` and `fk_relationship_target`), the compiler uses a process of elimination. It checks which of the prefix names the child schema *natively consumes* as an outbound property (e.g. `contact` defines `{ "target": ... }`). It considers that prefix "used up" and mathematically deduces the *remaining* explicitly prefixed relation (`"source"`) must be the inbound link from the parent.
### Global API Reference
These functions operate on the global `GLOBAL_JSPG` engine instance and provide administrative boundaries:

58
LOOKUP_VERIFICATION.md Normal file
View File

@ -0,0 +1,58 @@
# The Postgres Partial Index Claiming Pattern
This document outlines the architectural strategy for securely handling the deduplication, claiming, and verification of sensitive unique identifiers (like email addresses or phone numbers) strictly through PostgreSQL without requiring "magical" logic in the JSPG `Merger`.
## The Denial of Service (DoS) Squatter Problem
If you enforce a standard `UNIQUE` constraint on an email address table:
1. Malicious User A signs up and adds `jeff.bezos@amazon.com` to their account but never verifies it.
2. The real Jeff Bezos signs up.
3. The Database blocks Jeff because the unique string already exists.
The squatter has effectively locked the legitimate owner out of the system.
## The Anti-Patterns
1. **Global Entity Flags**: Adding a global `verified` boolean to the root `entity` table forces unrelated objects (like Widgets, Invoices, Orders) to carry verification logic that doesn't belong to them.
2. **Magical Merger Logic**: Making JSPG's `Merger` aware of a specific `verified` field breaks its pure structural translation model. The Merger shouldn't need hardcoded conditional logic to know if it's allowed to update an unverified row.
## The Solution: Postgres Partial Unique Indexes
The holy grail is to defer all claiming logic natively to the database engine using a **Partial Unique Index**.
```sql
-- Remove any existing global unique constraint on address first
CREATE UNIQUE INDEX lk_email_address_verified
ON email_address (address)
WHERE verified_at IS NOT NULL;
```
### How the Lifecycle Works Natively
1. **Unverified Squatters (Isolated Rows):**
A hundred different users can send `{ "address": "jeff.bezos@amazon.com" }` through the `save_person` Punc. Because the Punc isolates them and doesn't allow setting the `verified_at` property natively (enforced by the JSON schema), the JSPG Merger inserts `NULL`.
Postgres permits all 100 `INSERT` commands to succeed because the Partial Index **ignores** rows where `verified_at IS NULL`. Every user gets their own isolated, unverified row acting as a placeholder on their contact edge.
2. **The Verification Race (The Claim):**
The real Jeff clicks his magic verification link. The backend securely executes a specific verification Punc that runs:
`UPDATE email_address SET verified_at = now() WHERE id = <jeff's-real-uuid>`
3. **The Lockout:**
Because Jeff's row now strictly satisfies `verified_at IS NOT NULL`, that exact row enters the Partial Unique Index.
If any of the other 99 squatters *ever* click their fake verification links (or if a new user tries to verify that same email), PostgreSQL hits the index and violently throws a **Unique Constraint Violation**, flawlessly blocking them. The winner has permanently claimed the slot across the entire environment!
### Periodic Cleanup
Since unverified rows are allowed to accumulate without colliding, a simple Postgres `pg_cron` job or backend worker can sweep the table nightly to prune abandoned claims and preserve storage:
```sql
DELETE FROM email_address
WHERE verified_at IS NULL
AND created_at < NOW() - INTERVAL '24 hours';
```
### Why this is the Ultimate Architecture
* The **JSPG Merger** remains mathematically pure. It doesn't know what `verified_at` is; it simply respects the database's structural limits (`O(1)` pure translation).
* **Row-Level Security (RLS)** naturally blocks users from seeing or claiming each other's unverified rows.
* You offload complex race-condition tracking entirely to the C-level PostgreSQL B-Tree indexing engine, guaranteeing absolute cluster-wide atomicity.

0
agreego.sql Normal file
View File

View File

@ -1145,7 +1145,71 @@
" },",
" \"old\":{",
" \"contact_id\":\"old-contact\"",
" }",
" },",
" \"replaces\":\"33333333-3333-3333-3333-333333333333\"",
" }')"
]
]
}
},
{
"description": "Replace existing person with id and no changes (lookup)",
"action": "merge",
"data": {
"id": "33333333-3333-3333-3333-333333333333",
"type": "person",
"first_name": "LookupFirst",
"last_name": "LookupLast",
"date_of_birth": "1990-01-01T00:00:00Z",
"pronouns": "they/them"
},
"mocks": [
{
"id": "22222222-2222-2222-2222-222222222222",
"type": "person",
"first_name": "LookupFirst",
"last_name": "LookupLast",
"date_of_birth": "1990-01-01T00:00:00Z",
"pronouns": "they/them",
"contact_id": "old-contact"
}
],
"schema_id": "person",
"expect": {
"success": true,
"sql": [
[
"SELECT to_jsonb(t1.*) || to_jsonb(t2.*) || to_jsonb(t3.*) || to_jsonb(t4.*)",
"FROM agreego.\"person\" t1",
"LEFT JOIN agreego.\"user\" t2 ON t2.id = t1.id",
"LEFT JOIN agreego.\"organization\" t3 ON t3.id = t1.id",
"LEFT JOIN agreego.\"entity\" t4 ON t4.id = t1.id",
"WHERE",
" t1.id = '33333333-3333-3333-3333-333333333333'",
" OR (",
" \"first_name\" = 'LookupFirst'",
" AND \"last_name\" = 'LookupLast'",
" AND \"date_of_birth\" = '1990-01-01T00:00:00Z'",
" AND \"pronouns\" = 'they/them'",
" )"
],
[
"SELECT pg_notify('entity', '{",
" \"complete\":{",
" \"contact_id\":\"old-contact\",",
" \"date_of_birth\":\"1990-01-01T00:00:00Z\",",
" \"first_name\":\"LookupFirst\",",
" \"id\":\"22222222-2222-2222-2222-222222222222\",",
" \"last_name\":\"LookupLast\",",
" \"modified_at\":\"2026-03-10T00:00:00Z\",",
" \"modified_by\":\"00000000-0000-0000-0000-000000000000\",",
" \"pronouns\":\"they/them\",",
" \"type\":\"person\"",
" },",
" \"new\":{",
" \"type\":\"person\"",
" },",
" \"replaces\":\"33333333-3333-3333-3333-333333333333\"",
" }')"
]
]

View File

@ -27,7 +27,9 @@
{
"$id": "get_orders.response",
"type": "array",
"items": { "$ref": "light.order" }
"items": {
"$ref": "light.order"
}
}
]
}
@ -77,8 +79,23 @@
"destination_type": "person",
"destination_columns": [
"id"
]
},
{
"id": "22222222-2222-2222-2222-222222222227",
"type": "relation",
"constraint": "fk_order_counterparty_entity",
"source_type": "order",
"source_columns": [
"counterparty_id",
"counterparty_type"
],
"prefix": "customer"
"destination_type": "entity",
"destination_columns": [
"id",
"type"
],
"prefix": "counterparty"
},
{
"id": "33333333-3333-3333-3333-333333333333",
@ -91,8 +108,7 @@
"destination_type": "order",
"destination_columns": [
"id"
],
"prefix": "lines"
]
}
],
"types": [
@ -713,14 +729,18 @@
"created_by",
"modified_at",
"modified_by",
"archived"
"archived",
"counterparty_id",
"counterparty_type"
],
"grouped_fields": {
"order": [
"id",
"type",
"total",
"customer_id"
"customer_id",
"counterparty_id",
"counterparty_type"
],
"entity": [
"id",
@ -748,7 +768,9 @@
"created_at": "timestamptz",
"created_by": "uuid",
"modified_at": "timestamptz",
"modified_by": "uuid"
"modified_by": "uuid",
"counterparty_id": "uuid",
"counterparty_type": "text"
},
"variations": [
"order"

View File

@ -622,7 +622,23 @@ pub(crate) fn resolve_relation<'a>(
}
}
if !resolved {
// 1. If there's EXACTLY ONE relation with a null prefix, it's the base structural edge. Pick it.
let mut null_prefix_ids = Vec::new();
for (i, rel) in matching_rels.iter().enumerate() {
if rel.prefix.is_none() {
null_prefix_ids.push(i);
}
}
if null_prefix_ids.len() == 1 {
chosen_idx = null_prefix_ids[0];
resolved = true;
}
}
if !resolved && relative_keys.is_some() {
// 2. M:M Disambiguation: The child schema will explicitly define an outbound property
// matching one of the relational prefixes (e.g. "target"). We use the OTHER one (e.g. "source").
let keys = relative_keys.unwrap();
let mut missing_prefix_ids = Vec::new();
for (i, rel) in matching_rels.iter().enumerate() {
@ -634,6 +650,7 @@ pub(crate) fn resolve_relation<'a>(
}
if missing_prefix_ids.len() == 1 {
chosen_idx = missing_prefix_ids[0];
// resolved = true;
}
}

View File

@ -228,13 +228,15 @@ impl Merger {
let mut entity_change_kind = None;
let mut entity_fetched = None;
let mut entity_replaces = None;
if !type_def.relationship {
let (fields, kind, fetched) =
let (fields, kind, fetched, replaces) =
self.stage_entity(entity_fields.clone(), type_def, &user_id, &timestamp)?;
entity_fields = fields;
entity_change_kind = kind;
entity_fetched = fetched;
entity_replaces = replaces;
}
let mut entity_response = serde_json::Map::new();
@ -308,11 +310,12 @@ impl Merger {
}
if type_def.relationship {
let (fields, kind, fetched) =
let (fields, kind, fetched, replaces) =
self.stage_entity(entity_fields.clone(), type_def, &user_id, &timestamp)?;
entity_fields = fields;
entity_change_kind = kind;
entity_fetched = fetched;
entity_replaces = replaces;
}
self.merge_entity_fields(
@ -388,6 +391,7 @@ impl Merger {
entity_change_kind.as_deref(),
&user_id,
&timestamp,
entity_replaces.as_deref(),
)?;
if let Some(sql) = notify_sql {
@ -419,6 +423,7 @@ impl Merger {
serde_json::Map<String, Value>,
Option<String>,
Option<serde_json::Map<String, Value>>,
Option<String>,
),
String,
> {
@ -438,11 +443,22 @@ impl Merger {
.map_or(false, |s| !s.is_empty());
if is_anchor && has_valid_id {
return Ok((entity_fields, None, None));
return Ok((entity_fields, None, None, None));
}
let entity_fetched = self.fetch_entity(&entity_fields, type_def)?;
let mut replaces_id = None;
if let Some(ref fetched_row) = entity_fetched {
let provided_id = entity_fields.get("id").and_then(|v| v.as_str());
let fetched_id = fetched_row.get("id").and_then(|v| v.as_str());
if let (Some(pid), Some(fid)) = (provided_id, fetched_id) {
if !pid.is_empty() && pid != fid {
replaces_id = Some(pid.to_string());
}
}
}
let system_keys = vec![
"id".to_string(),
"type".to_string(),
@ -492,7 +508,7 @@ impl Merger {
);
entity_fields = new_fields;
} else if changes.is_empty() {
} else if changes.is_empty() && replaces_id.is_none() {
let mut new_fields = serde_json::Map::new();
new_fields.insert(
"id".to_string(),
@ -508,6 +524,8 @@ impl Merger {
.unwrap_or(false);
entity_change_kind = if is_archived {
Some("delete".to_string())
} else if changes.is_empty() && replaces_id.is_some() {
Some("replace".to_string())
} else {
Some("update".to_string())
};
@ -530,7 +548,7 @@ impl Merger {
entity_fields = new_fields;
}
Ok((entity_fields, entity_change_kind, entity_fetched))
Ok((entity_fields, entity_change_kind, entity_fetched, replaces_id))
}
fn fetch_entity(
@ -768,6 +786,7 @@ impl Merger {
entity_change_kind: Option<&str>,
user_id: &str,
timestamp: &str,
replaces_id: Option<&str>,
) -> Result<Option<String>, String> {
let change_kind = match entity_change_kind {
Some(k) => k,
@ -779,9 +798,9 @@ impl Merger {
let mut old_vals = serde_json::Map::new();
let mut new_vals = serde_json::Map::new();
let is_update = change_kind == "update" || change_kind == "delete";
let exists = change_kind == "update" || change_kind == "delete" || change_kind == "replace";
if !is_update {
if !exists {
let system_keys = vec![
"id".to_string(),
"created_by".to_string(),
@ -818,7 +837,7 @@ impl Merger {
}
let mut complete = entity_fields.clone();
if is_update {
if exists {
if let Some(fetched) = entity_fetched {
let mut temp = fetched.clone();
for (k, v) in entity_fields {
@ -842,9 +861,13 @@ impl Merger {
if old_val_obj != Value::Null {
notification.insert("old".to_string(), old_val_obj.clone());
}
if let Some(rep) = replaces_id {
notification.insert("replaces".to_string(), Value::String(rep.to_string()));
}
let mut notify_sql = None;
if type_obj.historical {
if type_obj.historical && change_kind != "replace" {
let change_sql = format!(
"INSERT INTO agreego.change (\"old\", \"new\", entity_id, id, kind, modified_at, modified_by) VALUES ({}, {}, {}, {}, {}, {}, {})",
Self::quote_literal(&old_val_obj),

View File

@ -8602,3 +8602,9 @@ fn test_merger_0_11() {
let path = format!("{}/fixtures/merger.json", env!("CARGO_MANIFEST_DIR"));
crate::tests::runner::run_test_case(&path, 0, 11).unwrap();
}
#[test]
fn test_merger_0_12() {
let path = format!("{}/fixtures/merger.json", env!("CARGO_MANIFEST_DIR"));
crate::tests::runner::run_test_case(&path, 0, 12).unwrap();
}

View File

@ -1 +1 @@
1.0.94
1.0.98