Files
jspg/GEMINI.md
2026-02-17 17:41:54 -05:00

99 lines
6.3 KiB
Markdown

# JSPG: JSON Schema Postgres
**JSPG** is a high-performance PostgreSQL extension for in-memory JSON Schema validation, specifically targeting **Draft 2020-12**.
It is designed to serve as the validation engine for the "Punc" architecture, where the database is the single source of truth for all data models and API contracts.
## 🎯 Goals
1. **Draft 2020-12 Compliance**: Attempt to adhere to the official JSON Schema Draft 2020-12 specification.
2. **Ultra-Fast Validation**: Compile schemas into an optimized in-memory representation for near-instant validation during high-throughput workloads.
3. **Connection-Bound Caching**: Leverage the PostgreSQL session lifecycle to maintain a per-connection schema cache, eliminating the need for repetitive parsing.
4. **Structural Inheritance**: Support object-oriented schema design via Implicit Keyword Shadowing and virtual `.family` schemas.
5. **Punc Integration**: validation is aware of the "Punc" context (request/response) and can validate `cue` objects efficiently.
## 🔌 API Reference
The extension exposes the following functions to PostgreSQL:
### `cache_json_schemas(enums jsonb, types jsonb, puncs jsonb) -> jsonb`
Loads and compiles the entire schema registry into the session's memory.
* **Inputs**:
* `enums`: Array of enum definitions.
* `types`: Array of type definitions (core entities).
* `puncs`: Array of punc (function) definitions with request/response schemas.
* **Behavior**:
* Parses all inputs into an internal schema graph.
* Resolves all internal references (`$ref`).
* Generates virtual `.family` schemas for type hierarchies.
* Compiles schemas into validators.
* **Returns**: `{"response": "success"}` or an error object.
### `validate_json_schema(schema_id text, instance jsonb) -> jsonb`
Validates a JSON instance against a pre-compiled schema.
* **Inputs**:
* `schema_id`: The `$id` of the schema to validate against (e.g., `person`, `save_person.request`).
* `instance`: The JSON data to validate.
* **Returns**:
* On success: `{"response": "success"}`
* On failure: A JSON object containing structured errors (e.g., `{"errors": [...]}`).
### `json_schema_cached(schema_id text) -> bool`
Checks if a specific schema ID is currently present in the cache.
### `clear_json_schemas() -> jsonb`
Clears the current session's schema cache, freeing memory.
### `show_json_schemas() -> jsonb`
Returns a debug dump of the currently cached schemas (for development/debugging).
## ✨ Custom Features & Deviations
JSPG implements specific extensions to the Draft 2020-12 standard to support the Punc architecture's object-oriented needs.
### 1. Implicit Keyword Shadowing
Standard JSON Schema composition (`allOf`) is additive (Intersection), meaning constraints can only be tightened, not replaced. However, JSPG treats `$ref` differently when it appears alongside other properties to support object-oriented inheritance.
* **Inheritance (`$ref` + `properties`)**: When a schema uses `$ref` *and* defines its own properties, JSPG implements **Smart Merge** (or Shadowing). If a property is defined in the current schema, its constraints take precedence over the inherited constraints for that specific keyword.
* *Example*: If `Entity` defines `type: { const: "entity" }` and `Person` (which refs Entity) defines `type: { const: "person" }`, validation passes for "person". The local `const` shadows the inherited `const`.
* *Granularity*: Shadowing is per-keyword. If `Entity` defined `type: { const: "entity", minLength: 5 }`, `Person` would shadow `const` but still inherit `minLength: 5`.
* **Composition (`allOf`)**: When using `allOf`, standard intersection rules apply. No shadowing occurs; all constraints from all branches must pass. This is used for mixins or interfaces.
### 2. Virtual Family Schemas (`.family`)
To support polymorphic fields (e.g., a field that accepts any "User" type), JSPG generates virtual schemas representing type hierarchies.
* **Mechanism**: When caching types, if a type defines a `hierarchy` (e.g., `["entity", "organization", "person"]`), JSPG generates a schema like `organization.family` which is a `oneOf` containing refs to all valid descendants.
### 3. Strict by Default & Extensibility
JSPG enforces a "Secure by Default" philosophy. All schemas are treated as if `unevaluatedProperties: false` (and `unevaluatedItems: false`) is set, unless explicitly overridden.
* **Strictness**: By default, any property in the instance data that is not explicitly defined in the schema causes a validation error. This prevents clients from sending undeclared fields.
* **Extensibility (`extensible: true`)**: To allow additional, undefined properties, you must add `"extensible": true` to the schema. This is useful for types that are designed to be open for extension.
* **Ref Boundaries**: Strictness is reset when crossing `$ref` boundaries. The referenced schema's strictness is determined by its own definition (strict by default unless `extensible: true`), ignoring the caller's state.
* **Inheritance**: Strictness is inherited. A schema extending a strict parent will also be strict unless it declares itself `extensible: true`. Conversely, a schema extending a loose parent will also be loose unless it declares itself `extensible: false`.
### 4. Format Leniency for Empty Strings
To simplify frontend form logic, the format validators for `uuid`, `date-time`, and `email` explicitly allow empty strings (`""`). This treats an empty string as "present but unset" rather than "invalid format".
## 🏗️ Architecture
The extension is written in Rust using `pgrx` and structures its schema parser to mirror the Punc Generator's design:
* **Single `Schema` Struct**: A unified struct representing the exact layout of a JSON Schema object, including standard keywords and custom vocabularies (`form`, `display`, etc.).
* **Compiler Phase**: schema JSONs are parsed into this struct, linked (references resolved), and then compiled into an efficient validation tree.
* **Validation Phase**: The compiled validators traverse the JSON instance using `serde_json::Value`.
## 🧪 Testing
Testing is driven by standard Rust unit tests that load JSON fixtures.
The tests are located in `tests/fixtures/*.json` and are executed via `cargo test`.