# JSPG: JSON Schema Postgres

**JSPG** is a high-performance PostgreSQL extension written in Rust (using `pgrx`) that transforms Postgres into a pre-compiled Semantic Engine. It serves as the core engine for the "Punc" architecture, where the database is the single source of truth for all data models, API contracts, validations, and reactive queries.

## 1. Overview & Architecture

JSPG operates by deeply integrating the JSON Schema Draft 2020-12 specification directly into the Postgres session lifecycle. It is built around three core pillars:

* **Validator**: In-memory, near-instant JSON structural validation and type polymorphism routing.
* **Merger**: Automatically traverse and UPSERT deeply nested JSON graphs into normalized relational tables.
* **Queryer**: Compile JSON Schemas into static, cached SQL SPI `SELECT` plans for fetching full entities or isolated ad-hoc object boundaries.

### 🎯 Goals

1. **Draft 2020-12 Based**: Attempt to adhere to the official JSON Schema Draft 2020-12 specification, while heavily augmenting it for strict structural typing.
2. **Ultra-Fast Execution**: Compile schemas into optimized in-memory validation trees and cached SQL SPIs to bypass Postgres Query Builder overheads.
3. **Connection-Bound Caching**: Leverage the PostgreSQL session lifecycle using an **Atomic Swap** pattern. Schemas are 100% frozen, completely eliminating locks during read access.
4. **Structural Inheritance**: Support object-oriented schema design via Implicit Keyword Shadowing and virtual `family` references natively mapped to Postgres table constraints.
5. **Reactive Beats**: Provide ultra-fast natively generated flat payloads mapping directly to the Dart topological state for dynamic websocket reactivity.

### Concurrency & Threading ("Immutable Graphs")

To support high-throughput operations while allowing for runtime updates (e.g., during hot-reloading), JSPG uses an **Atomic Swap** pattern:

1. **Parser Phase**: Schema JSONs are parsed into ordered `Schema` structs.
2. **Compiler Phase**: The database iterates all parsed schemas and pre-computes native optimization maps (Descendants Map, Depths Map, Variations Map).
3. **Immutable AST Caching**: The `Validator` struct immutably owns the `Database` registry. Schemas themselves are frozen structurally, but utilize `OnceLock` interior mutability during the Compilation Phase to permanently cache resolved `type` inheritances, properties, and `compiled_edges` directly onto their AST nodes. This guarantees strict `O(1)` relationship and property validation execution at runtime without locking or recursive DB polling.
4. **Lock-Free Reads**: Incoming operations acquire a read lock just long enough to clone the `Arc` inside an `RwLock>>`, ensuring zero blocking during schema updates.

### Global API Reference

These functions operate on the global `GLOBAL_JSPG` engine instance and provide administrative boundaries:

* `jspg_setup(database jsonb) -> jsonb`: Initializes the engine. Deserializes the full database schema registry (types, enums, puncs, relations) from Postgres and compiles them into memory atomically.
* `jspg_teardown() -> jsonb`: Clears the current session's engine instance from `GLOBAL_JSPG`, resetting the cache.
* `jspg_database() -> jsonb`: Exports the fully compiled snapshot of the database registry (including Types, Puncs, Enums, and Relations) out of `GLOBAL_JSPG` into standard JSON Schema representations.

---

## 2. Schema Modeling (Punc Developer Guide)

JSPG augments standard JSON Schema 2020-12 to provide an opinionated, strict, and highly ergonomic Object-Oriented paradigm. Developers defining Punc Data Models should follow these conventions.
### Realms (Topological Boundaries)

JSPG strictly organizes schemas into three distinct topological boundaries called **Realms** to prevent cross-contamination and ensure secure API generation:

* **Type Realm (`database.types`)**: Represents physical Postgres tables or structural JSONB bubbles. Table-backed entities here are strictly evaluated for their `type` or `kind` discriminators if they possess polymorphic variations.
* **Punc Realm (`database.puncs`)**: Represents API endpoint Contracts (functions). Contains strictly `.request` and `.response` shapes. These cannot be inherited by standard data models.
* **Enum Realm (`database.enums`)**: Represents simple restricted value lists. Handled universally across all lookups.

The core execution engines natively enforce these boundaries:

* **Validator**: Routes dynamically using a single schema key, transparently switching domains to validate Punc requests/responses from the `Punc` realm, or raw instance payloads from the `Type` realm.
* **Merger**: Strictly bounded to the `Type` Realm. It is philosophically impossible and mathematically illegal to attempt to UPSERT an API endpoint.
* **Queryer**: Routes recursively. Safely evaluates API boundary inputs directly from the `Punc` realm, while tracing underlying table targets back through the `Type` realm to physically compile SQL `SELECT` statements.

### Types of Types

* **Table-Backed (Entity Types)**: Primarily defined in root `types` schemas. These represent physical Postgres tables.
  * They are implicitly registered in the Global Registry using their precise key name mapped from the database compilation phase.
  * The schema conceptually requires a `type` discriminator at runtime so the engine knows what physical variation to interact with.
  * Can inherit other entity types to build lineage (e.g. `person` -> `organization` -> `entity`) natively using the `type` property.
* **Field-Backed (JSONB Bubbles)**: These are shapes that live entirely inside a Postgres JSONB column without being tied to a top-level table constraint.
  * **Global Schema Registration**: Roots must be attached to the top-level keys mapped from the `types`, `enums`, or `puncs` database tables.
  * They can re-use the standard `type` discriminator locally for `oneOf` polymorphism without conflicting with global Postgres Table constraints.

### Discriminators & The `.` Convention

In Punc, polymorphic targets like explicit tagged unions or STI (Single Table Inheritance) rely on discriminators. The system heavily leverages a standard `.` dot-notation to enforce topological boundaries deterministically.

**The 2-Tier Paradigm**: The system prevents "God Tables" by restricting routing to exactly two dimensions, guaranteeing absolute $O(1)$ lookups without ambiguity:

1. **Base (Vertical Routing)**: Represents the core physical lineage or foundational structural boundary. For entities, this is the table `type` (e.g. `person` or `widget`). For composed schemas, this is the root structural archetype (e.g., `filter`).
2. **Variant (Horizontal Routing)**: Represents the specific contextual projection or runtime mutation applied to the Base. For STI entities, this is the `kind` (e.g., `light`, `heavy`, `stock`). For composed filters, the variant identifies the entity it targets (e.g., `person`, `invoice`).

When an object is evaluated for STI polymorphism, the runtime natively extracts its `$kind` and `$type` values, dynamically concatenating them as `.` (e.g. `light.person` or `stock.widget`) to yield the namespace-protected schema key.

Therefore, any schema that participates in polymorphic discrimination MUST explicitly define its discriminator properties natively inside its `properties` block. However, to stay DRY and maintain flexible APIs, you **DO NOT** need to hardcode `const` values, nor should you add them to your `required` array.
The Punc engine treats `type` and `kind` as **magic properties**.

**Magic Validation Constraints**:

* **Dynamically Required**: The system inherently drives the need for their requirement. The Validator dynamically expects the discriminators and structurally bubbles `MISSING_TYPE` ultimata ONLY when a polymorphic router (`family` / `oneOf`) dynamically requires them to resolve a path. You never manually put them in the JSON schema `required` block.
* **Implicit Resolution**: When wrapped in `family` or `oneOf`, the polymorphic router can mathematically parse the schema key (e.g. `light.person`) and natively validate that `type` equals `"person"` and `kind` equals `"light"`, bubbling `CONST_VIOLATED` if they mismatch, all without you ever hardcoding `const` limitations.
* **Generator Explicitness**: Because Postgres is the Single Source of Truth, forcing the explicit definition in `properties` initially guarantees the downstream Dart/Go code generators observe the fields and can cleanly serialize them dynamically back to the server.

For example, a schema registered under the exact key `"light.person"` inside the database registry must natively define its own structural boundaries:

```json
{
  "type": "person",
  "properties": {
    "type": { "type": "string" },
    "kind": { "type": "string" }
  }
}
```

* **The Object Contract (Presence)**: The Object enforces its own structural integrity mechanically. Standard JSON Validation natively ensures `type` and `kind` are dynamically present as expected.
* **The Dynamic Values (`db.types`)**: Because the `type` and `kind` properties technically exist, the Punc engine dynamically intercepts them during `validate_object`. It mathematically parses the schema key (e.g. `light.person`) and natively validates that `type` equals `"person"` (or a valid descendant in `db.types`) and `kind` equals `"light"`, bubbling `CONST_VIOLATED` if they mismatch.
* **The Routing Contract**: When wrapped in `family` or `oneOf`, the polymorphic router can execute Lightning Fast $O(1)$ fast-paths by reading the payload's `type`/`kind` identifiers, and gracefully fallback to standard structural failure if omitted.

### Composition & Inheritance (The `type` keyword)

Punc completely abandons the standard JSON Schema `$ref` keyword. Instead, it overloads the exact same `type` keyword used for primitives. A `"type"` in Punc is mathematically evaluated as either a Native Primitive (`"string"`, `"null"`) or a Custom Object Pointer (`"budget"`, `"user"`).

* **Single Inheritance**: Setting `"type": "user"` acts exactly like an `extends` keyword. The schema borrows all fields and constraints from the `user` identity. During `jspg_setup`, the compiler recursively crawls the dependencies to map the physical Postgres table, permanently mapping its type restriction to `"object"` under the hood so JSON standards remain unbroken.
* **Implicit Keyword Shadowing**: Unlike standard JSON Schema inheritance, local property definitions natively override and shadow inherited properties.
* **Primitive Array Shorthand (Optionality)**: The `type` array syntax is heavily optimized for nullable fields. Defining `"type": ["budget", "null"]` natively builds a nullable strict, generating `Budget? budget;` in Dart. You can freely mix primitives like `["string", "number", "null"]`.
* **Strict Array Constraint**: To explicitly prevent mathematically ambiguous Multiple Inheritance, a `type` array is strictly constrained to at most **ONE** Custom Object Pointer. Defining `"type": ["person", "organization"]` will intentionally trigger a fatal database compilation error natively instructing developers to build a proper tagged union (`oneOf`) instead.
* **Dynamic Type Bindings (`"$sibling.[suffix]"`)**: If a `type` string begins with a `$` (e.g., `"type": "$kind.filter"`), the JSPG engine treats it as a Dynamic Pointer.
  During compile time, it safely defers boundary checks. During runtime validation, the engine dynamically reads the literal string value of the referenced sibling property (`kind`) on the *current parent JSON object*, evaluates the substitution (e.g., `"person.filter"`), and instantly routes execution to that schema in $O(1)$ time. This enables incredibly powerful dynamic JSONB shapes (like a generic `filter` column inside a `search` table) without forcing downstream code generators to build unmaintainable unions.

### Polymorphism (`family` and `oneOf`)

Polymorphism is how an object boundary can dynamically take on entirely different shapes based on the payload provided at runtime. Punc utilizes the static database metadata generated from Postgres (`db.types`) to enforce these boundaries deterministically, rather than relying on ambiguous tree-traversals.

* **`family` (Target-Based Polymorphism)**: An explicit Punc compiler macro instructing the engine to resolve dynamic options against the registered database `types` variations or its inner schema registry. It uses the exact physical constraints of the database to build SQL and validation routes.
  * **Scenario A: Global Tables (Vertical Routing)**
    * *Setup*: `{ "family": "organization" }`
    * *Execution*: The engine queries `db.types.get("organization").variations` and finds `["bot", "organization", "person"]`. Because organizations are structurally table-backed, the `family` automatically uses `type` as the discriminator.
    * *Options*: `bot` -> `bot`, `person` -> `person`, `organization` -> `organization`.
  * **Scenario B: Prefixed Tables (Vertical Projection)**
    * *Setup*: `{ "family": "light.organization" }`
    * *Execution*: The engine sees the prefix `light.` and base `organization`. It queries `db.types.get("organization").variations` and dynamically prepends the prefix to discover the relevant UI schemas.
    * *Options*: `person` -> `light.person`, `organization` -> `light.organization`.
      (If a projection like `light.bot` does not exist in the Type Registry, it is safely ignored).
  * **Scenario C: Single Table Inheritance (Horizontal Routing)**
    * *Setup*: `{ "family": "widget" }` (Where `widget` is a table type but has no external variations).
    * *Execution*: The engine queries `db.types.get("widget").variations` and finds only `["widget"]`. Since it lacks table inheritance, it is treated as STI. The engine scans the specific, confined `schemas` array directly under `db.types.get("widget")` for any registered key terminating in the base `.widget` (e.g., `stock.widget`). The `family` automatically uses `kind` as the discriminator.
    * *Options*: `stock` -> `stock.widget`, `tasks` -> `tasks.widget`.
* **`oneOf` (Strict Tagged Unions)**: A hardcoded list of candidate schemas. Unlike `family` which relies on global DB metadata, `oneOf` forces pure mathematical structural evaluation of the provided candidates. It strictly bans typical JSON Schema "Union of Sets" fallback searches. Every candidate MUST possess a mathematically unique discriminator payload to allow $O(1)$ routing.
  * **Disjoint Types**: `oneOf: [{ "type": "person" }, { "type": "widget" }]`. The engine succeeds because the native `type` acts as a unique discriminator (`"person"` vs `"widget"`).
  * **STI Types**: `oneOf: [{ "type": "heavy.person" }, { "type": "light.person" }]`. The engine succeeds. Even though both share `"type": "person"`, their explicit discriminator is `kind` (`"heavy"` vs `"light"`), ensuring unique $O(1)$ fast-paths.
  * **Conflicting Types**: `oneOf: [{ "type": "person" }, { "type": "light.person" }]`. The engine **fails compilation natively**. Both schemas evaluate to `"type": "person"` and neither provides a disjoint `kind` constraint, making them mathematically ambiguous and impossible to route in $O(1)$ time.
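
The three `oneOf` cases above can be illustrated with a small ambiguity check. This is a minimal Rust sketch, not JSPG's actual compiler code: `split_key` and `check_one_of` are hypothetical names, and the rule applied (candidates sharing a base `type` must each carry a distinct `kind` prefix) is an assumption inferred from the three examples rather than a documented algorithm.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper: splits a schema key at its first dot.
/// "light.person" -> (Some("light"), "person"); "person" -> (None, "person").
fn split_key(key: &str) -> (Option<&str>, &str) {
    match key.split_once('.') {
        Some((kind, base)) => (Some(kind), base),
        None => (None, key),
    }
}

/// Hypothetical check: returns Ok(()) when every candidate can be told apart
/// by its `type`/`kind` discriminators alone, otherwise an error message
/// naming the ambiguous base type.
fn check_one_of(candidates: &[&str]) -> Result<(), String> {
    // Group the `kind` of every candidate under its base `type`.
    let mut by_base: HashMap<&str, Vec<Option<&str>>> = HashMap::new();
    for &key in candidates {
        let (kind, base) = split_key(key);
        by_base.entry(base).or_default().push(kind);
    }

    for (base, kinds) in &by_base {
        // A base that appears once is already uniquely routed by `type`.
        if kinds.len() == 1 {
            continue;
        }
        // Assumed rule: a shared base requires every candidate to carry
        // a `kind`, and no two candidates may share the same `kind`.
        let mut seen = HashSet::new();
        for kind in kinds {
            match kind {
                Some(k) if seen.insert(*k) => {}
                _ => return Err(format!("ambiguous oneOf candidates for type `{base}`")),
            }
        }
    }
    Ok(())
}
```

Under these assumptions, `check_one_of(&["person", "widget"])` and `check_one_of(&["heavy.person", "light.person"])` return `Ok(())`, while `check_one_of(&["person", "light.person"])` returns an `Err`, matching the Disjoint, STI, and Conflicting cases respectively.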
### Conditionals (`cases`)

Standard JSON Schema forces developers to write deeply nested `allOf` -> `if` -> `properties` blocks just to execute conditional branching. **JSPG completely abandons `allOf` and this practice.**

For declarative business logic and structural mutations conditionally based upon property bounds, use the top-level `cases` array.

It evaluates as an **Independent Declarative Rules Engine**. Every `Case` block within the array is evaluated independently in parallel. For a given rule, if the `when` condition evaluates to true, its `then` schema is executed. If it evaluates to false, its `else` schema is executed (if present).

To maintain strict standard JSON Schema compatibility internally, the `when` block utilizes pure JSON Schema `properties` definitions (e.g. `enum`, `const`) rather than injecting unstandardized MongoDB operators. Because `when`, `then`, and `else` are themselves standard schemas, they natively support nested `cases` to handle mutually exclusive `else if` architectures.

```json
{
  "cases": [
    {
      "when": {
        "properties": {
          "status": { "const": "unverified" }
        },
        "required": ["status"]
      },
      "then": {
        "required": ["amount_1", "amount_2"]
      }
    },
    {
      "when": {
        "properties": {
          "kind": { "const": "credit" }
        },
        "required": ["kind"]
      },
      "then": {
        "required": ["details"]
      },
      "else": {
        "cases": [
          {
            "when": {
              "properties": {
                "kind": { "const": "checking" }
              },
              "required": ["kind"]
            },
            "then": {
              "required": ["routing_number"]
            }
          }
        ]
      }
    }
  ]
}
```

### Strict by Default & Extensibility

* **Strictness**: By default, any property not explicitly defined in the schema causes a validation error (effectively enforcing `additionalProperties: false` globally).
* **Extensibility (`extensible: true`)**: To allow a free-for-all of undefined properties, schemas must explicitly declare `"extensible": true`.
* **Structured Additional Properties**: If `additionalProperties: {...}` is defined as a schema, arbitrary keys are allowed so long as their values match the defined type constraint.
* **Inheritance Boundaries**: Strictness resets when crossing non-primitive `type` boundaries. A schema extending a strict parent remains strict unless it explicitly overrides with `"extensible": true`.

### Format Leniency for Empty Strings

To simplify frontend form validation, format validators specifically for `uuid`, `date-time`, and `email` explicitly allow empty strings (`""`), treating them as "present but unset".

### Filters & Conditions

In the Punc architecture, filters are automatically synthesized, strongly-typed JSON Schema boundaries that dictate the exact querying capabilities for any given entity or enum. They are completely generated for you; you never write them manually.

* **Conditions**: A condition schema is the contract defining the mathematical operations allowed on a primitive field. For example, a `string.condition` allows `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, `$of` (IN), and `$nof` (NOT IN).
* **Enum Conditions**: When JSPG synthesizes an enum, it dynamically generates an `.condition` (e.g., `address_kind.condition`). This strongly-typed condition perfectly mirrors the operations of a `string.condition`, but strictly limits the arrays and inputs of `$eq`, `$ne`, `$of`, and `$nof` to the exact variations defined by that Enum. This context ensures that UI generators know exactly when to render `