validator reorg

2026-02-26 19:17:13 -05:00
parent 960a99034a
commit e14f53e7d9
16 changed files with 501 additions and 423 deletions
--- a/src/entity/GEMINI.md
+++ b/src/entity/GEMINI.md
@@ -0,0 +1,79 @@
+# Entity Engine (jspg)
+
+## Overview
+
+This document outlines the architecture for moving the complex, CPU-bound row merging (`merge_entity`) and dynamic querying (`query_entity`) functionality out of PL/pgSQL and directly into the Rust-based `jspg` extension. 
+
+By treating the `jspg` schema registry as the single source of truth, we can use Rust and the Postgres query planner (via SPI) to plan each query once and reuse that plan, keeping per-call planning overhead near constant for deeply nested reads, complex relational writes, and partial hydration beats.
+
+## The Problem
+
+Historically, `agreego.merge_entity` (PL/pgSQL) handled nested writes by segmenting JSON, resolving types, searching hierarchies, and dynamically concatenating `INSERT`/`UPDATE` statements. `agreego.query_entity` was conceived to do the same for reads (handling base security, inheritance JOINs, and filtering automatically). 
+
+However, this design hits three major limitations:
+1. **CPU-Bound Operations**: PL/pgSQL is comparatively slow at complex string concatenation and large JSON graph traversals.
+2. **Query Planning Cache Busting**: Generating massive, dynamic SQL strings prevents Postgres from caching query plans. `EXECUTE dynamic_sql` forces the planner to re-evaluate statistics and execution paths on every function call, leading to extreme latency spikes at scale.
+3. **The Hydration Beat Problem**: The Punc framework requires fetching specific UI "fragments" (e.g. just the `target` of a specific `contact` array element) to feed WebSockets. Hand-rolling CTEs for every possible sub-tree permutation to serve beats will quickly become unmaintainable.
+
+## The Solution: Semantic Engine Database
+
+By migrating `merge_entity` and `query_entity` to `jspg`, we turn the database into a pre-compiled Semantic Engine.
+
+1. **Schema-to-SQL Compilation**: When a connection initializes its schema cache (`cache_json_schemas()`), `jspg` statically analyzes the JSON Schema AST. It acts as a compiler, translating the schema layout into static, multi-JOIN SQL query strings for *every* node/fragment in the schema.
+2. **Prepared Statements (SPI)**: `jspg` feeds these computed SQL strings into the Postgres SPI (Server Programming Interface) using `Spi::prepare()`. Postgres calculates the query execution plan *once* and caches it in memory.
+3. **Instant Execution**: When a Punc needs data, `jspg` retrieves the cached prepared statement, securely binds binary parameters, and executes the pre-planned query without rebuilding or re-planning the SQL.
+
+## Architecture
+
+### 1. The `cache_json_schemas()` Expansion
+The initialization function must now ingest `types` and `agreego.relation` data so the internal `Registry` holds the full Relational Graph.
+
+During schema compilation, if a schema is associated with a database Type, it triggers the **SQL Compiler Phase**:
+- It builds a table-resolution AST mapping to `JOIN` clauses based on foreign keys.
+- It translates JSON schema properties to `SELECT jsonb_build_object(...)`.
+- It generates static SQL for `INSERT`, `UPDATE`, and `SELECT` (including path-based fragment SELECTs).
+- It calls `Spi::prepare()` to cache these plans inside the Session Context.
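The property-to-`jsonb_build_object` step above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the `jspg` implementation: `SchemaNode`, `compile_select`, the `t1` alias convention, and the `agreego.invoice` table used in the usage note are hypothetical names, and JOIN generation and identifier quoting are left out.

```rust
/// Hypothetical, simplified view of a compiled schema node: one table plus
/// the scalar properties that map 1:1 onto its columns.
pub struct SchemaNode<'a> {
    pub table: &'a str,
    pub properties: &'a [&'a str],
}

/// Build a static SELECT that returns one JSONB object per row.
/// The row id stays a bind parameter (`$1`) so the same SQL text can be
/// prepared once and reused for every call.
/// Assumption: identifiers come from the trusted schema registry; a real
/// compiler would still validate or quote them.
pub fn compile_select(node: &SchemaNode) -> String {
    let pairs: Vec<String> = node
        .properties
        .iter()
        .map(|p| format!("'{0}', t1.{0}", p))
        .collect();
    format!(
        "SELECT jsonb_build_object({}) FROM {} t1 WHERE t1.id = $1",
        pairs.join(", "),
        node.table
    )
}
```

For a node with table `agreego.invoice` and properties `id` and `name`, this yields `SELECT jsonb_build_object('id', t1.id, 'name', t1.name) FROM agreego.invoice t1 WHERE t1.id = $1`, which is the kind of string that would then be handed to the SPI prepare step.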
+
+### 2. `agreego.query_entity` (Reads)
+* **API**: `agreego.query_entity(schema_id TEXT, fragment_path TEXT, cue JSONB)`
+* **Execution**:
+    * Rust locates the target Schema in memory.
+    * It uses the `fragment_path` (e.g., `/` for a full read, or `/contacts/0/target` for a hydration beat) to fetch the exact PreparedStatement.
+    * It binds variables (Row Level Security IDs, filtering, pagination limit/offset) parsed from the `cue`.
+    * SPI returns the nested, pre-aggregated `JSONB` result directly from the cached plan.
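One way the `fragment_path` lookup could work is to turn a concrete path into a cache key plus bind values. The sketch below assumes that numeric path segments are array positions, that they are replaced by `*` in the key, and that the positions are passed as parameters. The document does not specify how indices are handled, so `normalize_fragment_path` and the `*` convention are hypothetical.

```rust
/// Split a concrete fragment path into a statement-cache key and the array
/// indices it carried.
/// Assumption: any segment that parses as an unsigned integer is an array
/// position; it becomes `*` in the key and is returned as a bind value.
pub fn normalize_fragment_path(path: &str) -> (String, Vec<usize>) {
    let mut indices = Vec::new();
    let mut key = String::new();
    for segment in path.split('/').filter(|s| !s.is_empty()) {
        key.push('/');
        match segment.parse::<usize>() {
            Ok(i) => {
                indices.push(i);
                key.push('*');
            }
            Err(_) => key.push_str(segment),
        }
    }
    // The root path has no segments; keep "/" as the key for a full read.
    if key.is_empty() {
        key.push('/');
    }
    (key, indices)
}
```

Under these assumptions `/contacts/0/target` and `/contacts/7/target` share one prepared statement keyed by `/contacts/*/target`, so the number of cached plans depends on the schema shape rather than on the data.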
+
+### 3. Unified Aggregations & Computeds (Schema `query` objects)
+We replace the concept of a complex string parser (PEL) with native structured JSON objects using the `query` keyword.
+
+A structured `query` block in the schema:
+```json
+"total": {
+  "type": "number",
+  "readOnly": true,
+  "query": {
+    "aggregate": "sum",
+    "source": "lines",
+    "field": "amount"
+  }
+}
+```
+* **Frontend (Dart)**: The Go generator parses the JSON object directly and emits the native UI aggregation code (e.g. `lines.fold(...)`) for instant UI updates before the server responds.
+* **Backend (jspg)**: The Rust SQL compiler natively deserializes the `query` object into an internal struct. It recognizes the `aggregate` instruction and outputs a Postgres native aggregation: `(SELECT SUM(amount) FROM agreego.invoice_line WHERE invoice_id = t1.id)` as a column in the prepared `SELECT` statement. 
+* **Unification**: The database-calculated value acts as the authoritative truth, synchronizing and correcting the client automatically on the resulting `beat`.
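The backend translation described above can be sketched as follows. This is an illustration under assumptions rather than the actual compiler: `AggregateQuery`, `Relation`, and `compile_aggregate` are hypothetical names, the `source` is presumed already resolved to a child table and foreign key through the registry, and every aggregate other than `sum` is an assumed extension.

```rust
/// Hypothetical in-memory form of the schema's `query` object.
pub struct AggregateQuery<'a> {
    pub aggregate: &'a str,
    pub source: &'a str,
    pub field: &'a str,
}

/// Hypothetical result of resolving `source` through the relational registry.
pub struct Relation<'a> {
    pub child_table: &'a str,
    pub foreign_key: &'a str,
}

/// Emit a correlated subquery usable as a column in the prepared SELECT.
/// The aggregate name is matched against a fixed list so that no arbitrary
/// text from the schema ends up as a SQL function name.
pub fn compile_aggregate(
    q: &AggregateQuery,
    rel: &Relation,
    parent_alias: &str,
) -> Result<String, String> {
    let func = match q.aggregate {
        "sum" => "SUM",
        "count" => "COUNT",
        "min" => "MIN",
        "max" => "MAX",
        "avg" => "AVG",
        other => return Err(format!("unsupported aggregate: {}", other)),
    };
    Ok(format!(
        "(SELECT {}({}) FROM {} WHERE {} = {}.id)",
        func, q.field, rel.child_table, rel.foreign_key, parent_alias
    ))
}
```

With the `total` example, a relation of `agreego.invoice_line` joined on `invoice_id`, and the parent alias `t1`, the function returns the subquery quoted in the backend bullet above.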
+
+### 4. `agreego.merge_entity` (Writes)
+* **API**: `agreego.merge_entity(cue JSONB)`
+* **Execution**:
+    * Parses the incoming `cue` JSON natively in Rust via `serde_json`.
+    * Recursively validates and *constructively masks* the tree against the strict schema.
+    * Traverses the relational graph (which is fully loaded in the `jspg` registry).
+    * Binds the new values directly into the cached `INSERT` or `UPDATE` SPI prepared statements for each table in the hierarchy.
+    * Evaluates field differences and natively uses `pg_notify` to fire atomic row-level changes for the Go Beat framework.
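The masking and diffing steps above can be illustrated with a flat, std-only stand-in for a JSON object. This sketch reads "constructive masking" as dropping keys the schema does not declare, which is an assumption; `Row`, `mask`, and `changed_fields` are hypothetical names, and nested objects, SPI binding, and `pg_notify` are out of scope.

```rust
use std::collections::BTreeMap;

/// Flat stand-in for one entity's JSON object (assumption: string values only).
pub type Row = BTreeMap<String, String>;

/// Keep only the keys the schema declares; everything else is dropped
/// rather than rejected.
pub fn mask(input: &Row, allowed: &[&str]) -> Row {
    input
        .iter()
        .filter(|(k, _)| allowed.contains(&k.as_str()))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// Names of fields whose incoming value differs from the stored row,
/// i.e. the candidates for a row-level change notification.
pub fn changed_fields(old: &Row, new: &Row) -> Vec<String> {
    new.iter()
        .filter(|(k, v)| old.get(*k) != Some(*v))
        .map(|(k, _)| k.clone())
        .collect()
}
```

Masking first and diffing second means undeclared keys never appear in the change set, so a notification would only mention fields the schema actually owns.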
+
+## Roadmap
+
+1. **Relational Ingestion**: Update `cache_json_schemas` to pass relational metadata (`agreego.relation` rows) into the `jspg` registry cache.
+2. **The SQL Compiler**: Build the AST-to-String compiler in Rust that reads properties, `$ref`s, and `$family` trees to piece together generic SQL.
+3. **SPI Caching**: Integrate `Spi::prepare` into the `Validator` creation phase.
+4. **Rust `merge_entity`**: Port the constructive structural extraction loop from PL/pgSQL to Rust.
+5. **Rust `query_entity`**: Abstract the query runtime, mapping Punc JSON `filters` arrays to SPI-bound parameters safely.