Compare commits

...

6 Commits

6 changed files with 357 additions and 59 deletions

LOOKUP_VERIFICATION.md Normal file

@@ -0,0 +1,58 @@
# The Postgres Partial Index Claiming Pattern
This document outlines the architectural strategy for securely handling the deduplication, claiming, and verification of sensitive unique identifiers (such as email addresses or phone numbers) entirely in PostgreSQL, without requiring special-case logic in the JSPG `Merger`.
## The Denial of Service (DoS) Squatter Problem
If you enforce a standard `UNIQUE` constraint on an email address table:
1. Malicious User A signs up and adds `jeff.bezos@amazon.com` to their account but never verifies it.
2. The real Jeff Bezos signs up.
3. The database rejects Jeff's signup because the unique value already exists.
The squatter has effectively locked the legitimate owner out of the system.
## The Anti-Patterns
1. **Global Entity Flags**: Adding a global `verified` boolean to the root `entity` table forces unrelated objects (like Widgets, Invoices, Orders) to carry verification logic that doesn't belong to them.
2. **Magical Merger Logic**: Making JSPG's `Merger` aware of a specific `verified` field breaks its pure structural translation model. The Merger shouldn't need hardcoded conditional logic to know if it's allowed to update an unverified row.
## The Solution: Postgres Partial Unique Indexes
The cleanest solution is to defer all claiming logic to the database engine itself using a **Partial Unique Index**.
```sql
-- Remove any existing global unique constraint on address first
CREATE UNIQUE INDEX lk_email_address_verified
ON email_address (address)
WHERE verified_at IS NOT NULL;
```
### How the Lifecycle Works Natively
1. **Unverified Squatters (Isolated Rows):**
A hundred different users can send `{ "address": "jeff.bezos@amazon.com" }` through the `save_person` Punc. Because the Punc isolates them and the JSON schema prevents clients from setting the `verified_at` property directly, the JSPG Merger inserts `NULL`.
Postgres permits all 100 `INSERT` commands to succeed because the Partial Index **ignores** rows where `verified_at IS NULL`. Every user gets their own isolated, unverified row acting as a placeholder on their contact edge.
2. **The Verification Race (The Claim):**
The real Jeff clicks his magic verification link. The backend securely executes a specific verification Punc that runs:
`UPDATE email_address SET verified_at = now() WHERE id = <jeff's-real-uuid>`
3. **The Lockout:**
Because Jeff's row now satisfies `verified_at IS NOT NULL`, that exact row enters the Partial Unique Index.
If any of the other 99 squatters ever click their stale verification links (or a new user tries to verify that same email), PostgreSQL hits the index and raises a **unique constraint violation**, blocking the claim. The first successful verifier has permanently claimed the address across the entire environment.
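The full lifecycle can be sketched end-to-end outside Postgres: SQLite also supports partial unique indexes, so a few lines demonstrate the same claim semantics (the table here is a simplified stand-in, not the production schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email_address (id INTEGER PRIMARY KEY, address TEXT, verified_at TEXT)"
)
# Partial unique index: only verified rows compete for uniqueness.
conn.execute(
    "CREATE UNIQUE INDEX lk_email_address_verified "
    "ON email_address (address) WHERE verified_at IS NOT NULL"
)

# 1. Any number of unverified squatters can hold the same address.
for _ in range(100):
    conn.execute("INSERT INTO email_address (address) VALUES ('jeff.bezos@amazon.com')")

# 2. The real owner verifies: their row enters the index.
conn.execute("UPDATE email_address SET verified_at = datetime('now') WHERE id = 1")

# 3. Any later verification of the same address violates the index.
try:
    conn.execute("UPDATE email_address SET verified_at = datetime('now') WHERE id = 2")
    claimed = True
except sqlite3.IntegrityError:
    claimed = False

print(claimed)  # False: the slot is already taken
```

The same behavior carries over to the Postgres index above: uniqueness is enforced only at the moment a row transitions to a non-NULL `verified_at`.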
### Periodic Cleanup
Since unverified rows accumulate without colliding, a simple `pg_cron` job or backend worker can sweep the table nightly to prune abandoned claims and reclaim storage:
```sql
DELETE FROM email_address
WHERE verified_at IS NULL
AND created_at < NOW() - INTERVAL '24 hours';
```
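The sweep is easy to sanity-check in the same sketched form (SQLite stands in for Postgres; the addresses and timestamps are illustrative): a stale unverified claim is pruned while fresh and verified rows survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email_address ("
    "id INTEGER PRIMARY KEY, address TEXT, verified_at TEXT, created_at TEXT)"
)

# One stale unverified claim, one fresh one, and one verified row.
rows = [
    ("a@example.com", None, "2020-01-01 00:00:00"),                   # stale: prune
    ("b@example.com", None, "2999-01-01 00:00:00"),                   # fresh: keep
    ("c@example.com", "2020-01-02 00:00:00", "2020-01-01 00:00:00"),  # verified: keep
]
conn.executemany(
    "INSERT INTO email_address (address, verified_at, created_at) VALUES (?, ?, ?)",
    rows,
)

# Nightly sweep: drop only unverified claims older than 24 hours.
conn.execute(
    "DELETE FROM email_address "
    "WHERE verified_at IS NULL AND created_at < datetime('now', '-24 hours')"
)
remaining = [r[0] for r in conn.execute("SELECT address FROM email_address ORDER BY id")]
print(remaining)  # ['b@example.com', 'c@example.com']
```

The `verified_at IS NULL` guard is what makes the sweep safe: verified claims are never touched, no matter how old.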
### Why This Design Works
* The **JSPG Merger** remains purely structural. It doesn't know what `verified_at` means; it simply respects the constraints the database enforces.
* **Row-Level Security (RLS)** naturally prevents users from seeing or claiming each other's unverified rows.
* Race-condition handling is offloaded entirely to PostgreSQL's B-tree unique index, which enforces the claim atomically for the whole cluster.


@@ -972,7 +972,12 @@
"LEFT JOIN agreego.\"user\" t2 ON t2.id = t1.id",
"LEFT JOIN agreego.\"organization\" t3 ON t3.id = t1.id",
"LEFT JOIN agreego.\"entity\" t4 ON t4.id = t1.id",
"WHERE \"first_name\" = 'LookupFirst' AND \"last_name\" = 'LookupLast' AND \"date_of_birth\" = '1990-01-01T00:00:00Z' AND \"pronouns\" = 'they/them'"
"WHERE (",
" \"first_name\" = 'LookupFirst'",
" AND \"last_name\" = 'LookupLast'",
" AND \"date_of_birth\" = '1990-01-01T00:00:00Z'",
" AND \"pronouns\" = 'they/them'",
")"
],
[
"UPDATE agreego.\"person\"",
@@ -1039,6 +1044,177 @@
]
}
},
{
"description": "Update existing person with id (lookup)",
"action": "merge",
"data": {
"id": "33333333-3333-3333-3333-333333333333",
"type": "person",
"first_name": "LookupFirst",
"last_name": "LookupLast",
"date_of_birth": "1990-01-01T00:00:00Z",
"pronouns": "they/them",
"contact_id": "abc-contact"
},
"mocks": [
{
"id": "22222222-2222-2222-2222-222222222222",
"type": "person",
"first_name": "LookupFirst",
"last_name": "LookupLast",
"date_of_birth": "1990-01-01T00:00:00Z",
"pronouns": "they/them",
"contact_id": "old-contact"
}
],
"schema_id": "person",
"expect": {
"success": true,
"sql": [
[
"SELECT to_jsonb(t1.*) || to_jsonb(t2.*) || to_jsonb(t3.*) || to_jsonb(t4.*)",
"FROM agreego.\"person\" t1",
"LEFT JOIN agreego.\"user\" t2 ON t2.id = t1.id",
"LEFT JOIN agreego.\"organization\" t3 ON t3.id = t1.id",
"LEFT JOIN agreego.\"entity\" t4 ON t4.id = t1.id",
"WHERE",
" t1.id = '33333333-3333-3333-3333-333333333333'",
" OR (",
" \"first_name\" = 'LookupFirst'",
" AND \"last_name\" = 'LookupLast'",
" AND \"date_of_birth\" = '1990-01-01T00:00:00Z'",
" AND \"pronouns\" = 'they/them'",
" )"
],
[
"UPDATE agreego.\"person\"",
"SET",
" \"contact_id\" = 'abc-contact'",
"WHERE",
" id = '22222222-2222-2222-2222-222222222222'"
],
[
"UPDATE agreego.\"entity\"",
"SET",
" \"modified_at\" = '2026-03-10T00:00:00Z',",
" \"modified_by\" = '00000000-0000-0000-0000-000000000000'",
"WHERE",
" id = '22222222-2222-2222-2222-222222222222'"
],
[
"INSERT INTO agreego.change (",
" \"old\",",
" \"new\",",
" entity_id,",
" id,",
" kind,",
" modified_at,",
" modified_by",
")",
"VALUES (",
" '{",
" \"contact_id\":\"old-contact\"",
" }',",
" '{",
" \"contact_id\":\"abc-contact\",",
" \"type\":\"person\"",
" }',",
" '22222222-2222-2222-2222-222222222222',",
" '{{uuid}}',",
" 'update',",
" '{{timestamp}}',",
" '00000000-0000-0000-0000-000000000000'",
")"
],
[
"SELECT pg_notify('entity', '{",
" \"complete\":{",
" \"contact_id\":\"abc-contact\",",
" \"date_of_birth\":\"1990-01-01T00:00:00Z\",",
" \"first_name\":\"LookupFirst\",",
" \"id\":\"22222222-2222-2222-2222-222222222222\",",
" \"last_name\":\"LookupLast\",",
" \"modified_at\":\"2026-03-10T00:00:00Z\",",
" \"modified_by\":\"00000000-0000-0000-0000-000000000000\",",
" \"pronouns\":\"they/them\",",
" \"type\":\"person\"",
" },",
" \"new\":{",
" \"contact_id\":\"abc-contact\",",
" \"type\":\"person\"",
" },",
" \"old\":{",
" \"contact_id\":\"old-contact\"",
" },",
" \"replaces\":\"33333333-3333-3333-3333-333333333333\"",
" }')"
]
]
}
},
{
"description": "Replace existing person with id and no changes (lookup)",
"action": "merge",
"data": {
"id": "33333333-3333-3333-3333-333333333333",
"type": "person",
"first_name": "LookupFirst",
"last_name": "LookupLast",
"date_of_birth": "1990-01-01T00:00:00Z",
"pronouns": "they/them"
},
"mocks": [
{
"id": "22222222-2222-2222-2222-222222222222",
"type": "person",
"first_name": "LookupFirst",
"last_name": "LookupLast",
"date_of_birth": "1990-01-01T00:00:00Z",
"pronouns": "they/them",
"contact_id": "old-contact"
}
],
"schema_id": "person",
"expect": {
"success": true,
"sql": [
[
"SELECT to_jsonb(t1.*) || to_jsonb(t2.*) || to_jsonb(t3.*) || to_jsonb(t4.*)",
"FROM agreego.\"person\" t1",
"LEFT JOIN agreego.\"user\" t2 ON t2.id = t1.id",
"LEFT JOIN agreego.\"organization\" t3 ON t3.id = t1.id",
"LEFT JOIN agreego.\"entity\" t4 ON t4.id = t1.id",
"WHERE",
" t1.id = '33333333-3333-3333-3333-333333333333'",
" OR (",
" \"first_name\" = 'LookupFirst'",
" AND \"last_name\" = 'LookupLast'",
" AND \"date_of_birth\" = '1990-01-01T00:00:00Z'",
" AND \"pronouns\" = 'they/them'",
" )"
],
[
"SELECT pg_notify('entity', '{",
" \"complete\":{",
" \"contact_id\":\"old-contact\",",
" \"date_of_birth\":\"1990-01-01T00:00:00Z\",",
" \"first_name\":\"LookupFirst\",",
" \"id\":\"22222222-2222-2222-2222-222222222222\",",
" \"last_name\":\"LookupLast\",",
" \"modified_at\":\"2026-03-10T00:00:00Z\",",
" \"modified_by\":\"00000000-0000-0000-0000-000000000000\",",
" \"pronouns\":\"they/them\",",
" \"type\":\"person\"",
" },",
" \"new\":{",
" \"type\":\"person\"",
" },",
" \"replaces\":\"33333333-3333-3333-3333-333333333333\"",
" }')"
]
]
}
},
{
"description": "Update existing person with id (no lookup)",
"action": "merge",
@@ -1484,7 +1660,7 @@
"SELECT to_jsonb(t1.*) || to_jsonb(t2.*)",
"FROM agreego.\"order\" t1",
"LEFT JOIN agreego.\"entity\" t2 ON t2.id = t1.id",
"WHERE t1.id = 'abc'"
"WHERE t1.id = 'abc' OR (\"id\" = 'abc')"
],
[
"INSERT INTO agreego.\"entity\" (",


@@ -124,42 +124,23 @@ fn parse_and_match_mocks(sql: &str, mocks: &[Value]) -> Option<Vec<Value>> {
return None;
};
// 2. Extract WHERE conditions
let mut conditions = Vec::new();
// 2. Extract WHERE conditions string
let mut where_clause = String::new();
if let Some(where_idx) = sql_upper.find(" WHERE ") {
let mut where_end = sql_upper.find(" ORDER BY ").unwrap_or(sql.len());
let mut where_end = sql_upper.find(" ORDER BY ").unwrap_or(sql_upper.len());
if let Some(limit_idx) = sql_upper.find(" LIMIT ") {
if limit_idx < where_end {
where_end = limit_idx;
}
}
let where_clause = &sql[where_idx + 7..where_end];
let and_regex = Regex::new(r"(?i)\s+AND\s+").ok()?;
let parts = and_regex.split(where_clause);
for part in parts {
if let Some(eq_idx) = part.find('=') {
let left = part[..eq_idx]
.trim()
.split('.')
.last()
.unwrap_or("")
.trim_matches('"');
let right = part[eq_idx + 1..].trim().trim_matches('\'');
conditions.push((left.to_string(), right.to_string()));
} else if part.to_uppercase().contains(" IS NULL") {
let left = part[..part.to_uppercase().find(" IS NULL").unwrap()]
.trim()
.split('.')
.last()
.unwrap_or("")
.replace('"', ""); // Remove quotes explicitly
conditions.push((left, "null".to_string()));
}
}
where_clause = sql[where_idx + 7..where_end].to_string();
}
// 3. Find matching mocks
let mut matches = Vec::new();
let or_regex = Regex::new(r"(?i)\s+OR\s+").ok()?;
let and_regex = Regex::new(r"(?i)\s+AND\s+").ok()?;
for mock in mocks {
if let Some(mock_obj) = mock.as_object() {
if let Some(t) = mock_obj.get("type") {
@@ -168,25 +149,66 @@ fn parse_and_match_mocks(sql: &str, mocks: &[Value]) -> Option<Vec<Value>> {
}
}
let mut matches_all = true;
for (k, v) in &conditions {
let mock_val_str = match mock_obj.get(k) {
Some(Value::String(s)) => s.clone(),
Some(Value::Number(n)) => n.to_string(),
Some(Value::Bool(b)) => b.to_string(),
Some(Value::Null) => "null".to_string(),
_ => {
matches_all = false;
break;
if where_clause.is_empty() {
matches.push(mock.clone());
continue;
}
let or_parts = or_regex.split(&where_clause);
let mut any_branch_matched = false;
for or_part in or_parts {
let branch_str = or_part.replace('(', "").replace(')', "");
let mut branch_matches = true;
for part in and_regex.split(&branch_str) {
if let Some(eq_idx) = part.find('=') {
let left = part[..eq_idx]
.trim()
.split('.')
.last()
.unwrap_or("")
.trim_matches('"');
let right = part[eq_idx + 1..].trim().trim_matches('\'');
let mock_val_str = match mock_obj.get(left) {
Some(Value::String(s)) => s.clone(),
Some(Value::Number(n)) => n.to_string(),
Some(Value::Bool(b)) => b.to_string(),
Some(Value::Null) => "null".to_string(),
_ => "".to_string(),
};
if mock_val_str != right {
branch_matches = false;
break;
}
} else if part.to_uppercase().contains(" IS NULL") {
let left = part[..part.to_uppercase().find(" IS NULL").unwrap()]
.trim()
.split('.')
.last()
.unwrap_or("")
.trim_matches('"');
let mock_val_str = match mock_obj.get(left) {
Some(Value::Null) => "null".to_string(),
_ => "".to_string(),
};
if mock_val_str != "null" {
branch_matches = false;
break;
}
}
};
if mock_val_str != *v {
matches_all = false;
}
if branch_matches {
any_branch_matched = true;
break;
}
}
if matches_all {
if any_branch_matched {
matches.push(mock.clone());
}
}


@@ -228,13 +228,15 @@ impl Merger {
let mut entity_change_kind = None;
let mut entity_fetched = None;
let mut entity_replaces = None;
if !type_def.relationship {
let (fields, kind, fetched) =
let (fields, kind, fetched, replaces) =
self.stage_entity(entity_fields.clone(), type_def, &user_id, &timestamp)?;
entity_fields = fields;
entity_change_kind = kind;
entity_fetched = fetched;
entity_replaces = replaces;
}
let mut entity_response = serde_json::Map::new();
@@ -308,11 +310,12 @@ impl Merger {
}
if type_def.relationship {
let (fields, kind, fetched) =
let (fields, kind, fetched, replaces) =
self.stage_entity(entity_fields.clone(), type_def, &user_id, &timestamp)?;
entity_fields = fields;
entity_change_kind = kind;
entity_fetched = fetched;
entity_replaces = replaces;
}
self.merge_entity_fields(
@@ -388,6 +391,7 @@ impl Merger {
entity_change_kind.as_deref(),
&user_id,
&timestamp,
entity_replaces.as_deref(),
)?;
if let Some(sql) = notify_sql {
@@ -419,6 +423,7 @@ impl Merger {
serde_json::Map<String, Value>,
Option<String>,
Option<serde_json::Map<String, Value>>,
Option<String>,
),
String,
> {
@@ -438,11 +443,22 @@ impl Merger {
.map_or(false, |s| !s.is_empty());
if is_anchor && has_valid_id {
return Ok((entity_fields, None, None));
return Ok((entity_fields, None, None, None));
}
let entity_fetched = self.fetch_entity(&entity_fields, type_def)?;
let mut replaces_id = None;
if let Some(ref fetched_row) = entity_fetched {
let provided_id = entity_fields.get("id").and_then(|v| v.as_str());
let fetched_id = fetched_row.get("id").and_then(|v| v.as_str());
if let (Some(pid), Some(fid)) = (provided_id, fetched_id) {
if !pid.is_empty() && pid != fid {
replaces_id = Some(pid.to_string());
}
}
}
let system_keys = vec![
"id".to_string(),
"type".to_string(),
@@ -492,7 +508,7 @@ impl Merger {
);
entity_fields = new_fields;
} else if changes.is_empty() {
} else if changes.is_empty() && replaces_id.is_none() {
let mut new_fields = serde_json::Map::new();
new_fields.insert(
"id".to_string(),
@@ -508,6 +524,8 @@ impl Merger {
.unwrap_or(false);
entity_change_kind = if is_archived {
Some("delete".to_string())
} else if changes.is_empty() && replaces_id.is_some() {
Some("replace".to_string())
} else {
Some("update".to_string())
};
@@ -530,7 +548,7 @@ impl Merger {
entity_fields = new_fields;
}
Ok((entity_fields, entity_change_kind, entity_fetched))
Ok((entity_fields, entity_change_kind, entity_fetched, replaces_id))
}
fn fetch_entity(
@@ -585,11 +603,14 @@ impl Merger {
template
};
let where_clause = if let Some(id) = id_val {
format!("WHERE t1.id = {}", Self::quote_literal(id))
} else if lookup_complete {
let mut lookup_predicates = Vec::new();
let mut where_parts = Vec::new();
if let Some(id) = id_val {
where_parts.push(format!("t1.id = {}", Self::quote_literal(id)));
}
if lookup_complete {
let mut lookup_predicates = Vec::new();
for column in &entity_type.lookup_fields {
let val = entity_fields.get(column).unwrap_or(&Value::Null);
if column == "type" {
@@ -598,10 +619,14 @@ impl Merger {
lookup_predicates.push(format!("\"{}\" = {}", column, Self::quote_literal(val)));
}
}
format!("WHERE {}", lookup_predicates.join(" AND "))
} else {
where_parts.push(format!("({})", lookup_predicates.join(" AND ")));
}
if where_parts.is_empty() {
return Ok(None);
};
}
let where_clause = format!("WHERE {}", where_parts.join(" OR "));
let final_sql = format!("{} {}", fetch_sql_template, where_clause);
@@ -761,6 +786,7 @@ impl Merger {
entity_change_kind: Option<&str>,
user_id: &str,
timestamp: &str,
replaces_id: Option<&str>,
) -> Result<Option<String>, String> {
let change_kind = match entity_change_kind {
Some(k) => k,
@@ -772,9 +798,9 @@ impl Merger {
let mut old_vals = serde_json::Map::new();
let mut new_vals = serde_json::Map::new();
let is_update = change_kind == "update" || change_kind == "delete";
let exists = change_kind == "update" || change_kind == "delete" || change_kind == "replace";
if !is_update {
if !exists {
let system_keys = vec![
"id".to_string(),
"created_by".to_string(),
@@ -811,7 +837,7 @@ impl Merger {
}
let mut complete = entity_fields.clone();
if is_update {
if exists {
if let Some(fetched) = entity_fetched {
let mut temp = fetched.clone();
for (k, v) in entity_fields {
@@ -835,9 +861,13 @@ impl Merger {
if old_val_obj != Value::Null {
notification.insert("old".to_string(), old_val_obj.clone());
}
if let Some(rep) = replaces_id {
notification.insert("replaces".to_string(), Value::String(rep.to_string()));
}
let mut notify_sql = None;
if type_obj.historical {
if type_obj.historical && change_kind != "replace" {
let change_sql = format!(
"INSERT INTO agreego.change (\"old\", \"new\", entity_id, id, kind, modified_at, modified_by) VALUES ({}, {}, {}, {}, {}, {}, {})",
Self::quote_literal(&old_val_obj),


@@ -8596,3 +8596,15 @@ fn test_merger_0_10() {
let path = format!("{}/fixtures/merger.json", env!("CARGO_MANIFEST_DIR"));
crate::tests::runner::run_test_case(&path, 0, 10).unwrap();
}
#[test]
fn test_merger_0_11() {
let path = format!("{}/fixtures/merger.json", env!("CARGO_MANIFEST_DIR"));
crate::tests::runner::run_test_case(&path, 0, 11).unwrap();
}
#[test]
fn test_merger_0_12() {
let path = format!("{}/fixtures/merger.json", env!("CARGO_MANIFEST_DIR"));
crate::tests::runner::run_test_case(&path, 0, 12).unwrap();
}


@@ -1 +1 @@
1.0.93
1.0.96