JSPG: JSON Schema Postgres
JSPG is a high-performance PostgreSQL extension written in Rust (using pgrx) that transforms Postgres into a pre-compiled Semantic Engine. It serves as the core engine for the "Punc" architecture, where the database is the single source of truth for all data models, API contracts, validations, and reactive queries.
1. Overview & Architecture
JSPG operates by deeply integrating the JSON Schema Draft 2020-12 specification directly into the Postgres session lifecycle. It is built around three core pillars:
- Validator: In-memory, near-instant JSON structural validation and type polymorphism routing.
- Merger: Automatically traverse and UPSERT deeply nested JSON graphs into normalized relational tables.
- Queryer: Compile JSON Schemas into static, cached SQL SPI SELECT plans for fetching full entities or isolated "Stems".
🎯 Goals
- Draft 2020-12 Compliance: Attempt to adhere to the official JSON Schema Draft 2020-12 specification.
- Ultra-Fast Execution: Compile schemas into optimized in-memory validation trees and cached SQL SPIs to bypass Postgres Query Builder overheads.
- Connection-Bound Caching: Leverage the PostgreSQL session lifecycle using an Atomic Swap pattern. Schemas are 100% frozen, completely eliminating locks during read access.
- Structural Inheritance: Support object-oriented schema design via Implicit Keyword Shadowing and virtual $family references natively mapped to Postgres table constraints.
- Reactive Beats: Provide natively generated "Stems" (isolated payload fragments) for dynamic websocket reactivity.
Concurrency & Threading ("Immutable Graphs")
To support high-throughput operations while allowing for runtime updates (e.g., during hot-reloading), JSPG uses an Atomic Swap pattern:
- Parser Phase: Schema JSONs are parsed into ordered Schema structs.
- Compiler Phase: The database iterates all parsed schemas and pre-computes native optimization maps (Descendants Map, Depths Map, Variations Map).
- Immutable Validator: The Validator struct immutably owns the Database registry and all its global maps. Schemas themselves are completely frozen; $ref strings are resolved dynamically at runtime using pre-computed O(1) maps.
- Lock-Free Reads: Incoming operations acquire a read lock just long enough to clone the Arc inside an RwLock<Option<Arc<Validator>>>, ensuring zero blocking during schema updates.
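The swap described above can be sketched with the standard library alone. This is a minimal illustration, not JSPG's actual code: the Registry type, its method names, and the version field are hypothetical stand-ins, and only the RwLock<Option<Arc<Validator>>> shape comes from the text.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for the compiled, immutable registry (the field is hypothetical).
pub struct Validator {
    pub version: u64,
}

/// Hypothetical holder for the currently active validator.
pub struct Registry {
    slot: RwLock<Option<Arc<Validator>>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { slot: RwLock::new(None) }
    }

    /// The validator is fully built before the write lock is taken,
    /// so the critical section is a single pointer replacement.
    pub fn setup(&self, validator: Validator) {
        let fresh = Arc::new(validator);
        *self.slot.write().unwrap() = Some(fresh);
    }

    /// Readers hold the read lock only long enough to clone the Arc.
    pub fn current(&self) -> Option<Arc<Validator>> {
        self.slot.read().unwrap().clone()
    }

    /// Clears the slot; readers that already cloned the Arc keep it alive.
    pub fn teardown(&self) {
        *self.slot.write().unwrap() = None;
    }
}
```

A reader that cloned the Arc before a swap keeps validating against the old, frozen registry until it drops its clone, which is why an update never invalidates an in-flight operation.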
2. Validator
The Validator provides strict, schema-driven evaluation for the "Punc" architecture.
API Reference
- jspg_setup(database jsonb) -> jsonb: Loads and compiles the entire registry (types, enums, puncs, relations) atomically.
- mask_json_schema(schema_id text, instance jsonb) -> jsonb: Validates and prunes unknown properties dynamically, returning masked data.
- jspg_validate(schema_id text, instance jsonb) -> jsonb: Returns boolean-like success or structured errors.
- jspg_teardown() -> jsonb: Clears the current session's schema cache.
Custom Features & Deviations
JSPG implements specific extensions to the Draft 2020-12 standard to support the Punc architecture's object-oriented needs while heavily optimizing for zero-runtime lookups.
A. Polymorphism & Referencing ($ref, $family, and Native Types)
- Native Type Discrimination (variations): Schemas defined inside a Postgres type are Entities. The validator securely and implicitly manages their "type" property. If an entity inherits from user, incoming JSON can safely define {"type": "person"} without errors, thanks to compiled_variations inheritance.
- Structural Inheritance & Viral Infection ($ref): $ref is used exclusively for structural inheritance, never for union creation. A Punc request schema that $refs an Entity virally inherits all physical database polymorphism rules for that target.
- Shape Polymorphism ($family): Auto-expands polymorphic API lists based on an abstract Descendants Graph. If {"$family": "widget"} is used, JSPG evaluates the JSON against every schema that $refs widget.
- Strict Matches & Depth Heuristic: Polymorphic structures MUST match exactly one schema permutation. If multiple inherited struct permutations pass, JSPG applies the Depth Heuristic Tie-Breaker, selecting the candidate deepest in the inheritance tree.
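The Depth Heuristic Tie-Breaker can be illustrated as follows. This is a sketch under stated assumptions: pick_deepest is a hypothetical helper, the depths argument stands in for the pre-computed Depths Map, and returning None when two candidates share the deepest level is an assumption, since the text does not say how such a tie is handled.

```rust
use std::collections::HashMap;

/// Choose the winning schema among the candidates that passed validation.
/// `depths` is a hypothetical stand-in for the pre-computed Depths Map.
/// Returns None when nothing passed, or (assumption) when two candidates
/// share the deepest level and the match stays ambiguous.
pub fn pick_deepest<'a>(
    passing: &[&'a str],
    depths: &HashMap<String, usize>,
) -> Option<&'a str> {
    // Deepest inheritance level among the passing candidates.
    let max_depth = passing
        .iter()
        .filter_map(|id| depths.get(*id).copied())
        .max()?;

    // Keep only the candidates sitting at that level.
    let mut winners = passing
        .iter()
        .copied()
        .filter(|id| depths.get(*id) == Some(&max_depth));

    let first = winners.next()?;
    if winners.next().is_some() {
        None
    } else {
        Some(first)
    }
}
```

With entity at depth 0, user at depth 2, and person at depth 3, a payload that passes all three resolves to person.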
B. Strict by Default & Extensibility
- Strictness: By default, any property not explicitly defined in the schema causes a validation error (effectively enforcing additionalProperties: false globally).
- Extensibility (extensible: true): To allow a free-for-all of undefined properties, schemas must explicitly declare "extensible": true.
- Structured Additional Properties: If additionalProperties: {...} is defined as a schema, arbitrary keys are allowed so long as their values match the defined type constraint.
- Inheritance Boundaries: Strictness resets when crossing $ref boundaries. A schema extending a strict parent remains strict unless it explicitly overrides with "extensible": true.
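As a rough illustration of strict-by-default behavior, the sketch below reports which instance keys a schema would reject. ObjectSchema and unknown_keys are hypothetical names, and the model is deliberately simplified: it ignores structured additionalProperties and $ref boundaries.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified schema: the declared property names plus the
/// `extensible` flag described above.
pub struct ObjectSchema {
    pub properties: HashSet<String>,
    pub extensible: bool,
}

/// Return the instance keys that the schema would reject.
/// An extensible schema rejects nothing; a strict one rejects every
/// key that is not explicitly declared.
pub fn unknown_keys<'a>(schema: &ObjectSchema, instance_keys: &[&'a str]) -> Vec<&'a str> {
    if schema.extensible {
        return Vec::new();
    }
    instance_keys
        .iter()
        .copied()
        .filter(|key| !schema.properties.contains(*key))
        .collect()
}
```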
C. Implicit Keyword Shadowing
- Inheritance ($ref + properties): Unlike standard JSON Schema, when a schema uses $ref alongside local properties, JSPG implements Smart Merge. Local constraints natively take precedence over (shadow) inherited constraints for the same keyword.
  - Example: If entity has type: {const: "entity"}, but person defines type: {const: "person"}, the local person const cleanly overrides the inherited one.
- Composition (allOf): When evaluating allOf, standard intersection rules apply seamlessly. No shadowing occurs, meaning all constraints from all branches must pass.
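The difference between shadowing and intersection can be sketched with plain maps. The names below (Constraints, smart_merge, all_of_accepts) are hypothetical, and each constraint is reduced to a single const string for brevity; the real engine works on full schema nodes.

```rust
use std::collections::BTreeMap;

/// Property name -> required const value (a deliberately tiny model).
pub type Constraints = BTreeMap<String, String>;

/// `$ref` plus local properties: start from the inherited constraints,
/// then let every local entry shadow the inherited one of the same name.
pub fn smart_merge(inherited: &Constraints, local: &Constraints) -> Constraints {
    let mut merged = inherited.clone();
    for (name, constraint) in local {
        merged.insert(name.clone(), constraint.clone());
    }
    merged
}

/// `allOf`: no shadowing, so the value must satisfy every branch.
pub fn all_of_accepts(branch_consts: &[&str], value: &str) -> bool {
    branch_consts.iter().all(|expected| *expected == value)
}
```

Under this model, merging entity (type = "entity") with person (type = "person") yields type = "person", whereas an allOf over both consts can never accept "person" because the entity branch still has to pass.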
D. Format Leniency for Empty Strings
To simplify frontend form validation, format validators specifically for uuid, date-time, and email explicitly allow empty strings (""), treating them as "present but unset".
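A lenient format check of this kind might look like the sketch below. Both function names are hypothetical, and the UUID test is a simplified shape check (8-4-4-4-12 hex groups) used only to have something concrete to wrap; it is not the validator JSPG ships.

```rust
/// Simplified UUID shape check: five hyphen-separated hex groups of
/// lengths 8-4-4-4-12. Hypothetical helper for illustration only.
pub fn looks_like_uuid(value: &str) -> bool {
    let lengths: [usize; 5] = [8, 4, 4, 4, 12];
    let groups: Vec<&str> = value.split('-').collect();
    groups.len() == lengths.len()
        && groups.iter().zip(lengths.iter()).all(|(group, len)| {
            group.len() == *len && group.chars().all(|c| c.is_ascii_hexdigit())
        })
}

/// Wrap a strict format check so that the empty string is accepted as
/// "present but unset", while any other value must pass the check.
pub fn lenient_format(value: &str, strict_check: fn(&str) -> bool) -> bool {
    value.is_empty() || strict_check(value)
}
```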
3. Merger
The Merger provides an automated, high-performance graph synchronization engine via the jspg_merge(cue JSONB) API. It orchestrates the complex mapping of nested JSON objects into normalized Postgres relational tables, honoring all inheritance and graph constraints.
Core Features
- Deep Graph Merging: The Merger walks arbitrary levels of deeply nested JSON schemas (e.g. tracking an order, its customer, and an array of its lines). It intelligently discovers the correct parent-to-child or child-to-parent Foreign Keys stored in the registry and automatically maps the UUIDs across the relationships during UPSERT.
- Prefix Foreign Key Matching: Handles scenarios where multiple relations point to the same table by using database Foreign Key constraint prefixes (fk_). For example, if a schema has shipping_address and billing_address, the merger resolves against fk_shipping_address_entity vs fk_billing_address_entity automatically to correctly route object properties.
- Dynamic Deduplication & Lookups: If a nested object is provided without an id, the Merger utilizes Postgres lk_ index constraints defined in the schema registry (e.g. lk_person mapped to first_name and last_name). It dynamically queries these unique matching constraints to discover the correct UUID to perform an UPDATE, preventing data duplication.
- Hierarchical Table Inheritance: The Punc system uses distributed table inheritance (e.g. person inherits user inherits organization inherits entity). The Merger splits the incoming JSON payload and performs atomic row updates across all relevant tables in the lineage map.
- The Archive Paradigm: Data is never deleted in the Punc system. The Merger securely enforces referential integrity by toggling the archived Boolean flag on the base entity table rather than issuing SQL DELETE commands.
- Change Tracking & Reactivity: The Merger diffs the incoming JSON against the existing database row (utilizing static, DashMap-cached lk_ SELECT string templates). Every detected change is recorded into the agreego.change audit table, tracking the user mapping. It then natively uses pg_notify to broadcast a completely flat row-level diff out to the Go WebSocket server for O(1) routing.
- Many-to-Many Graph Edge Management: Operates seamlessly with the global agreego.relationship table, allowing the system to represent and merge arbitrary reified M:M relationships directionally between any two entities.
- Sparse Updates: Empty JSON strings "" are directly bound as explicit SQL NULL directives to clear data, whilst omitted (missing) properties skip UPDATE execution entirely, ensuring partial UI submissions do not wipe out sibling fields.
- Unified Return Structure: To eliminate UI hydration race conditions and multi-user duplication, jspg_merge explicitly strips the response graph and returns only the root { "id": "uuid" } (or an array of IDs for list insertions). External APIs can then explicitly call read APIs to fetch the resulting graph, while the UI relies 100% implicitly on the flat pg_notify pipeline for reactive state synchronization.
- Decoupled SQL Generation: Because Writes (INSERT/UPDATE) are inherently highly dynamic based on partial payload structures, the Merger generates raw SQL strings dynamically per execution without caching, guaranteeing a minimal memory footprint while scaling optimally.
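The Sparse Updates rule in the list above can be sketched as a small SET-clause builder. The function name and its signature are hypothetical, and the column names are assumed to come from the trusted schema registry rather than from user input, since they are interpolated into the SQL text; only the values travel as bound parameters.

```rust
use std::collections::HashMap;

/// Build a `SET ...` clause and its bound parameters from a partial payload.
/// - property omitted -> the column is left out of the statement entirely
/// - property is ""   -> the column is bound to NULL (None)
/// - anything else    -> the column is bound to the given value
/// Returns None when no column is touched, so the UPDATE can be skipped.
pub fn build_set_clause(
    columns: &[&str],
    payload: &HashMap<String, String>,
) -> Option<(String, Vec<Option<String>>)> {
    let mut assignments = Vec::new();
    let mut params: Vec<Option<String>> = Vec::new();

    for column in columns {
        if let Some(value) = payload.get(*column) {
            params.push(if value.is_empty() {
                None
            } else {
                Some(value.clone())
            });
            // Placeholders are numbered in binding order: $1, $2, ...
            assignments.push(format!("{} = ${}", column, params.len()));
        }
    }

    if assignments.is_empty() {
        None
    } else {
        Some((format!("SET {}", assignments.join(", ")), params))
    }
}
```

For a payload carrying first_name = "Ada" and nickname = "" against the columns first_name, last_name, and nickname, this produces `SET first_name = $1, nickname = $2` with the second parameter bound to NULL, and last_name is never mentioned.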
4. Queryer
The Queryer transforms Postgres into a pre-compiled Semantic Query Engine via the jspg_query(schema_id text, cue jsonb) API, designed to serve the exact shape of Punc responses directly via SQL.
Core Features
- Schema-to-SQL Compilation: Compiles JSON Schema ASTs spanning deep arrays directly into static, pre-planned SQL multi-JOIN queries.
- DashMap SQL Caching: Executes compiled SQL via Postgres SPI execution, securely caching the static string compilation templates per schema permutation inside the GLOBAL_JSPG application memory, drastically reducing repetitive schema crawling.
- Dynamic Filtering: Binds parameters natively through cue.filters objects. Dynamically handles string formatting (e.g. parsing uuid or formatting date-times) and safely escapes complex combinations utilizing ILIKE operations correctly mapped to the originating structural table.
- The Stem Engine: Rather than over-fetching heavy Entity payloads and trimming them, Punc Framework Websockets depend on isolated subgraphs defined as Stems.
  - During initialization, the generator auto-discovers graph boundaries (Stems) inside the schema tree.
  - The Queryer prepares dedicated SQL execution templates tailored precisely for that exact Stem path (e.g. executing get_dashboard queried specifically for the /owner stem).
  - These Stem outputs instantly hydrate targeted Go Bitsets, providing O(1) real-time routing for fractional data payloads without any application-layer overhead.
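The template caching described in the list above follows a compile-once, reuse-many pattern, sketched below. To stay self-contained the sketch swaps DashMap for a standard-library RwLock<HashMap>; SqlCache, get_or_compile, and the "schema/stem" key format are hypothetical and do not reflect the layout of GLOBAL_JSPG.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical cache of compiled SQL templates, keyed by schema
/// permutation (for example "get_dashboard/owner").
pub struct SqlCache {
    templates: RwLock<HashMap<String, Arc<str>>>,
}

impl SqlCache {
    pub fn new() -> Self {
        SqlCache { templates: RwLock::new(HashMap::new()) }
    }

    /// Return the cached template, compiling it only on the first request.
    pub fn get_or_compile<F>(&self, key: &str, compile: F) -> Arc<str>
    where
        F: FnOnce() -> String,
    {
        // Fast path: shared read lock, released at the end of this scope.
        {
            let templates = self.templates.read().unwrap();
            if let Some(sql) = templates.get(key) {
                return Arc::clone(sql);
            }
        }
        // Slow path: take the write lock; `entry` re-checks the key so a
        // racing writer's template wins and `compile` is not run twice.
        let mut templates = self.templates.write().unwrap();
        let entry: &Arc<str> = templates
            .entry(key.to_string())
            .or_insert_with(|| Arc::from(compile()));
        Arc::clone(entry)
    }
}
```

The second request for the same key returns the same shared string without invoking the compile closure, which is the effect the cache is meant to have on repeated schema crawling.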