Compare commits

...

6 Commits

Author SHA1 Message Date
6b6647f2d6 version: 1.0.39 2025-09-30 20:44:35 -04:00
d301d5fab9 types at root not strict 2025-09-30 20:44:17 -04:00
61511b595d added flow commands for testing validator vs jspg 2025-09-30 20:29:13 -04:00
c7ae975275 version: 1.0.38 2025-09-30 20:19:51 -04:00
aa58082cd7 boon test suite itself passing 2025-09-30 20:19:41 -04:00
491fb3a3e3 docs updated 2025-09-30 20:01:49 -04:00
10 changed files with 91 additions and 31 deletions


@@ -1,25 +1,79 @@
 # Gemini Project Overview: `jspg`
-This document outlines the purpose of the `jspg` project and the specific modifications made to the vendored `boon` JSON schema validator crate.
+This document outlines the purpose of the `jspg` project, its architecture, and the specific modifications made to the vendored `boon` JSON schema validator crate.
 ## What is `jspg`?
 `jspg` is a PostgreSQL extension written in Rust using the `pgrx` framework. Its primary function is to provide fast, in-database JSON schema validation against the 2020-12 draft of the JSON Schema specification.
-It works by:
-1. Exposing a SQL function, `cache_json_schemas(...)`, which takes arrays of schema objects, compiles them, and caches them in memory.
-2. Exposing a SQL validation function, `validate_json_schema(schema_id, instance)`, which validates a JSONB instance against one of the pre-cached schemas.
-3. Using a locally modified (vendored) version of the `boon` crate to perform the validation, allowing for custom enhancements to its core logic.
+### How It Works
+The extension is designed for high-performance scenarios where schemas are defined once and used many times for validation. It achieves this through an in-memory cache.
1. **Caching:** A user first calls the `cache_json_schemas(enums, types, puncs)` SQL function. This function takes arrays of JSON objects representing different kinds of schemas within a larger application framework. It uses the vendored `boon` crate to compile all these schemas into an efficient internal format and stores them in a static, in-memory `SCHEMA_CACHE`. This cache is managed by a `RwLock` to allow concurrent reads during validation.
2. **Validation:** The `validate_json_schema(schema_id, instance)` SQL function is then used to validate a JSONB `instance` against a specific, pre-cached schema identified by its `$id`. This function looks up the compiled schema in the cache and runs the validation, returning a success response or a detailed error report.
3. **Custom Logic:** `jspg` uses a locally modified (vendored) version of the `boon` crate. This allows for powerful, application-specific validation logic that goes beyond the standard JSON Schema specification, such as runtime-based strictness.
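The caching and lookup steps above can be sketched as follows. This is a minimal illustration using only the standard library, not the extension's actual code: `CompiledSchema`, `schema_cache`, `cache_schemas`, and `lookup_schema` are hypothetical names, and the real cache stores schemas compiled by the vendored `boon` crate.

```rust
use std::collections::HashMap;
use std::sync::{OnceLock, RwLock};

/// Hypothetical stand-in for a schema compiled by the vendored `boon` crate.
#[derive(Debug, Clone, PartialEq)]
pub struct CompiledSchema {
    pub id: String,
    pub public: bool,
}

/// Process-wide cache keyed by schema `$id` (assumed layout for illustration).
fn schema_cache() -> &'static RwLock<HashMap<String, CompiledSchema>> {
    static SCHEMA_CACHE: OnceLock<RwLock<HashMap<String, CompiledSchema>>> = OnceLock::new();
    SCHEMA_CACHE.get_or_init(|| RwLock::new(HashMap::new()))
}

/// Store schemas under an exclusive write lock.
pub fn cache_schemas(schemas: Vec<CompiledSchema>) {
    let mut cache = schema_cache().write().expect("schema cache lock poisoned");
    for schema in schemas {
        cache.insert(schema.id.clone(), schema);
    }
}

/// Look up a schema under a shared read lock; many readers may hold it at once.
pub fn lookup_schema(id: &str) -> Option<CompiledSchema> {
    let cache = schema_cache().read().expect("schema cache lock poisoned");
    cache.get(id).cloned()
}
```

Writes happen rarely (when schemas are cached) and reads happen on every validation, which is the access pattern a `RwLock` favors over a plain mutex.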
### Error Handling
When validation fails, `jspg` provides a detailed error report in a consistent JSON format, which we refer to as a "DropError". This process involves two main helper functions in `src/lib.rs`:
1. **`collect_errors`**: `boon` returns a nested tree of `ValidationError` objects. This function recursively traverses that tree to find the most specific, underlying causes of the failure. It filters out structural errors (like `allOf` or `anyOf`) to create a flat list of concrete validation failures.
2. **`format_errors`**: This function takes the flat list of errors and transforms each one into the final DropError JSON format. It also de-duplicates errors that occur at the same JSON Pointer path, ensuring a cleaner output if a single value violates multiple constraints.
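The two helpers can be pictured with a simplified error tree. The sketch below is an assumption-laden illustration: `ErrorNode` is a hypothetical type (the real `ValidationError` from `boon` carries more detail), the list of structural keywords is a guess, and the output is plain strings rather than DropError JSON.

```rust
use std::collections::HashSet;

/// Hypothetical simplified error tree, standing in for boon's `ValidationError`.
#[derive(Debug, Clone)]
pub struct ErrorNode {
    pub keyword: String,        // e.g. "allOf", "minLength"
    pub path: String,           // JSON Pointer into the instance
    pub causes: Vec<ErrorNode>, // nested, more specific failures
}

/// Keywords treated as structural wrappers in this sketch (assumed list).
fn is_structural(keyword: &str) -> bool {
    matches!(keyword, "allOf" | "anyOf" | "oneOf" | "$ref")
}

/// Depth-first walk that keeps only leaf errors that are not structural.
pub fn collect_errors<'a>(node: &'a ErrorNode, out: &mut Vec<&'a ErrorNode>) {
    if node.causes.is_empty() {
        if !is_structural(&node.keyword) {
            out.push(node);
        }
        return;
    }
    for cause in &node.causes {
        collect_errors(cause, out);
    }
}

/// Keep the first error seen for each path, then render "path: keyword" lines.
pub fn format_errors(errors: &[&ErrorNode]) -> Vec<String> {
    let mut seen = HashSet::new();
    errors
        .iter()
        .filter(|e| seen.insert(e.path.clone()))
        .map(|e| format!("{}: {}", e.path, e.keyword))
        .collect()
}
```

Keeping the first error per path is one possible de-duplication policy; the source only states that errors at the same path are de-duplicated, not which one survives.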
#### DropError Format
A DropError object provides a clear, structured explanation of a validation failure:
```json
{
"code": "ADDITIONAL_PROPERTIES_NOT_ALLOWED",
"message": "Property 'extra' is not allowed",
"details": {
"path": "/extra",
"context": "not allowed",
"cause": {
"got": [
"extra"
]
},
"schema": "basic_strict_test.request"
}
}
```
- `code` (string): A machine-readable error code (e.g., `ADDITIONAL_PROPERTIES_NOT_ALLOWED`, `MIN_LENGTH_VIOLATED`).
- `message` (string): A human-readable summary of the error.
- `details` (object):
- `path` (string): The JSON Pointer path to the invalid data within the instance.
- `context` (any): The actual value that failed validation.
- `cause` (any): The low-level reason from the validator, often including the expected value (`want`) and the actual value (`got`).
- `schema` (string): The `$id` of the schema that was being validated.
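For readers who think in types, the fields above might map onto Rust structs along these lines. This is only a sketch: the struct names and the `additional_property_error` helper are hypothetical, and the `any`-typed fields are narrowed to simple types for brevity.

```rust
/// Hypothetical Rust-side shape of a DropError; field names follow the JSON format.
#[derive(Debug, Clone, PartialEq)]
pub struct DropError {
    pub code: String,
    pub message: String,
    pub details: DropErrorDetails,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DropErrorDetails {
    pub path: String,
    pub context: String,    // "any" in the format; narrowed to a string here
    pub cause: Vec<String>, // stands in for the `got` list
    pub schema: String,
}

/// Build the error for a property the schema does not allow.
/// Note: JSON Pointer escaping of `~` and `/` in `name` is not handled here.
pub fn additional_property_error(name: &str, value: &str, schema_id: &str) -> DropError {
    DropError {
        code: "ADDITIONAL_PROPERTIES_NOT_ALLOWED".to_string(),
        message: format!("Property '{name}' is not allowed"),
        details: DropErrorDetails {
            path: format!("/{name}"),
            context: value.to_string(),
            cause: vec![name.to_string()],
            schema: schema_id.to_string(),
        },
    }
}
```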
---
 ## `boon` Crate Modifications
-The version of `boon` located in the `validator/` directory has been modified to address specific requirements of the `jspg` project. The key deviations from the upstream `boon` crate are as follows:
+The version of `boon` located in the `validator/` directory has been significantly modified to support runtime-based strict validation. The original `boon` crate only supports compile-time strictness and lacks the necessary mechanisms to propagate validation context correctly for our use case.
 ### 1. Recursive Runtime Strictness Control
-- **Problem:** The `jspg` project requires that certain schemas enforce a strict "no extra properties" policy (specifically, schemas for public `puncs` and global `type`s). This strictness needs to cascade through the entire validation hierarchy, including all nested objects and `$ref` chains. A compile-time flag was unsuitable because it would incorrectly apply strictness to shared, reusable schemas.
+- **Problem:** The `jspg` project requires that certain schemas (specifically those for public `puncs` and global `type`s) enforce a strict "no extra properties" policy. This strictness needs to be decided at runtime and must cascade through the entire validation hierarchy, including all nested objects and `$ref` chains. A compile-time flag was unsuitable because it would incorrectly apply strictness to shared, reusable schemas.
-- **Solution:** A runtime validation option was implemented to enforce strictness recursively.
+- **Solution:** A runtime validation option was implemented to enforce strictness recursively. This required several coordinated changes to the `boon` validator.
-1. A `ValidationOptions { be_strict: bool }` struct was added. The `jspg` code in `src/lib.rs` determines whether a validation run should be strict (based on the `punc`'s `public` flag or if validating a global `type`) and passes the appropriate option to the validator.
-2. The `be_strict` option is propagated through the entire recursive validation process. A bug was fixed in `_validate_self` (which handles `$ref`s) to ensure that the sub-validator is always initialized to track unevaluated properties when `be_strict` is enabled. Previously, tracking was only initiated if the parent was already tracking unevaluated properties, causing strictness to be dropped across certain `$ref` boundaries.
-3. At any time, if `unevaluatedProperties` or `additionalProperties` is found in the schema, it should override the strict (or non-strict) validation at that level.
+#### Key Changes
1. **`ValidationOptions` Struct**: A new `ValidationOptions { be_strict: bool }` struct was added to `validator/src/lib.rs`. The `jspg` code in `src/lib.rs` determines if a validation run should be strict and passes this struct to the validator.
2. **Strictness Check in `uneval_validate`**: The original `boon` only checked for unevaluated properties if the `unevaluatedProperties` keyword was present in the schema. We added an `else if be_strict` block to `uneval_validate` in `validator/src/validator.rs`. This block triggers a check for any leftover unevaluated properties at the end of a validation pass and reports them as errors, effectively enforcing our runtime strictness rule.
3. **Correct Context Propagation**: The most complex part of the fix was ensuring the set of unevaluated properties was correctly maintained across different validation contexts (especially `$ref` and nested property validations). Three critical changes were made:
- **Inheriting Context in `_validate_self`**: When validating keywords that apply to the same instance (like `$ref` or `allOf`), the sub-validator must know what properties the parent has already evaluated. We changed the creation of the `Validator` inside `_validate_self` to pass a clone of the parent's `uneval` state (`uneval: self.uneval.clone()`) instead of creating a new one from scratch. This allows the context to flow downwards.
- **Isolating Context in `validate_val`**: Conversely, when validating a property's value, that value is a *different* part of the JSON instance. The sub-validation should not affect the parent's list of unevaluated properties. We fixed this by commenting out the `self.uneval.merge(...)` call in the `validate_val` function.
- **Simplifying `Uneval::merge`**: The original logic for merging `uneval` state was different for `$ref` keywords. This was incorrect. We simplified the `merge` function to *always* perform an intersection (`retain`), which correctly combines the knowledge of evaluated properties from different schema parts that apply to the same instance.
4. **Removing Incompatible Assertions**: The changes to context propagation broke several `debug_assert!` macros in the `arr_validate` function, which were part of `boon`'s original design. Since our new validation flow is different but correct, these assertions were removed.
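The interplay of the changes above (inherited context, intersection merge, and the strict leftover check) can be illustrated with a toy model. This is a sketch under stated assumptions, not the vendored validator: `Uneval` here is a simplified hypothetical tracker, `leftover_errors` is an invented helper, and each inner slice stands in for the `properties` declared by one subschema applied to the same instance (for example via `allOf` or `$ref`). A schema-level `unevaluatedProperties` keyword, which takes a separate branch in the real code, is not modelled.

```rust
use std::collections::HashSet;

/// Mirrors the option described above; the real struct lives in the vendored crate.
#[derive(Debug, Clone, Copy, Default)]
pub struct ValidationOptions {
    pub be_strict: bool,
}

/// Hypothetical, simplified tracker of object properties not yet evaluated.
#[derive(Debug, Clone)]
pub struct Uneval {
    pub props: HashSet<String>,
}

impl Uneval {
    /// Intersection: a property stays unevaluated only if both sides left it so.
    pub fn merge(&mut self, other: &Uneval) {
        self.props.retain(|p| other.props.contains(p));
    }
}

/// Returns the sorted property names that no subschema evaluated,
/// but only when the run is strict.
pub fn leftover_errors(
    instance_keys: &[&str],
    subschemas: &[&[&str]],
    options: Option<ValidationOptions>,
) -> Vec<String> {
    let be_strict = options.map_or(false, |o| o.be_strict);
    let mut uneval = Uneval {
        props: instance_keys.iter().map(|k| k.to_string()).collect(),
    };
    for declared in subschemas {
        // The sub-validation inherits a clone of the parent's state.
        let mut child = uneval.clone();
        for name in declared.iter() {
            child.props.remove(*name);
        }
        // Combine what the child learned back into the parent.
        uneval.merge(&child);
    }
    if be_strict {
        let mut left: Vec<String> = uneval.props.into_iter().collect();
        left.sort();
        left
    } else {
        Vec::new()
    }
}
```

With instance keys `id`, `name`, `extra` and two subschemas declaring `id` and `name` respectively, a strict run reports `extra`, while a non-strict run reports nothing.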

flow

@@ -97,11 +97,16 @@ install() {
 fi
 }
-test() {
+test-jspg() {
 info "Running jspg tests..."
 cargo pgrx test "pg${POSTGRES_VERSION}" "$@" || return $?
 }
+test-validator() {
+info "Running validator tests..."
+cargo test -p boon --features "pgrx/pg${POSTGRES_VERSION}" "$@" || return $?
+}
 clean() {
 info "Cleaning build artifacts..."
 cargo clean || return $?
@@ -111,7 +116,8 @@ jspg-usage() {
 printf "prepare\tCheck OS, Cargo, and PGRX dependencies.\n"
 printf "install\tBuild and install the extension locally (after prepare).\n"
 printf "reinstall\tClean, build, and install the extension locally (after prepare).\n"
-printf "test\t\tRun pgrx integration tests.\n"
+printf "test-jspg\t\tRun pgrx integration tests.\n"
+printf "test-validator\t\tRun validator integration tests.\n"
 printf "clean\t\tRemove pgrx build artifacts.\n"
 }
@@ -121,7 +127,8 @@ jspg-flow() {
 build) build; return $?;;
 install) install; return $?;;
 reinstall) clean && install; return $?;;
-test) test "${@:2}"; return $?;;
+test-jspg) test-jspg "${@:2}"; return $?;;
+test-validator) test-validator "${@:2}"; return $?;;
 clean) clean; return $?;;
 *) return 1 ;;
 esac


@@ -304,7 +304,7 @@ fn validate_json_schema(schema_id: &str, instance: JsonB) -> JsonB {
 Some(schema) => {
 let instance_value: Value = instance.0;
 let options = match schema.t {
-SchemaType::Type | SchemaType::PublicPunc => Some(ValidationOptions { be_strict: true }),
+SchemaType::PublicPunc => Some(ValidationOptions { be_strict: true }),
 _ => None,
 };


@@ -10,7 +10,7 @@ let mut schemas = Schemas::new(); // container for compiled schemas
 let mut compiler = Compiler::new();
 let sch_index = compiler.compile("schema.json", &mut schemas)?;
 let instance: Value = serde_json::from_reader(File::open("instance.json")?)?;
-let valid = schemas.validate(&instance, sch_index).is_ok();
+let valid = schemas.validate(&instance, sch_index, None).is_ok();
 # Ok(())
 # }
 ```


@@ -849,7 +849,6 @@ impl<'v, 's> Validator<'v, 's, '_, '_> {
 ) -> Result<(), ValidationError<'s, 'v>> {
 let scope = self.scope.child(sch, ref_kw, self.scope.vid);
 let schema = &self.schemas.get(sch);
-let be_strict = self.options.map_or(false, |o| o.be_strict);
 let (result, reply) = Validator {
 v: self.v,
 vloc: self.vloc,


@@ -15,7 +15,7 @@ fn test_debug() -> Result<(), Box<dyn Error>> {
 let url = "http://debug.com/schema.json";
 compiler.add_resource(url, test["schema"].clone())?;
 let sch = compiler.compile(url, &mut schemas)?;
-let result = schemas.validate(&test["data"], sch);
+let result = schemas.validate(&test["data"], sch, None);
 if let Err(e) = &result {
 for line in format!("{e}").lines() {
 println!(" {line}");


@@ -13,7 +13,7 @@ fn example_from_files() -> Result<(), Box<dyn Error>> {
 let mut schemas = Schemas::new();
 let mut compiler = Compiler::new();
 let sch_index = compiler.compile(schema_file, &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_ok());
 Ok(())
@@ -51,7 +51,7 @@ fn example_from_strings() -> Result<(), Box<dyn Error>> {
 compiler.add_resource("tests/examples/pet.json", pet_schema)?;
 compiler.add_resource("tests/examples/cat.json", cat_schema)?;
 let sch_index = compiler.compile("tests/examples/pet.json", &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_ok());
 Ok(())
@@ -79,7 +79,7 @@ fn example_from_https() -> Result<(), Box<dyn Error>> {
 loader.register("https", Box::new(HttpUrlLoader));
 compiler.use_loader(Box::new(loader));
 let sch_index = compiler.compile(schema_url, &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_ok());
 Ok(())
@@ -114,7 +114,7 @@ fn example_from_yaml_files() -> Result<(), Box<dyn Error>> {
 loader.register("file", Box::new(FileUrlLoader));
 compiler.use_loader(Box::new(loader));
 let sch_index = compiler.compile(schema_file, &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_ok());
 Ok(())
@@ -148,7 +148,7 @@ fn example_custom_format() -> Result<(), Box<dyn Error>> {
 });
 compiler.add_resource(schema_url, schema)?;
 let sch_index = compiler.compile(schema_url, &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_ok());
 Ok(())
@@ -193,7 +193,7 @@ fn example_custom_content_encoding() -> Result<(), Box<dyn Error>> {
 });
 compiler.add_resource(schema_url, schema)?;
 let sch_index = compiler.compile(schema_url, &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_err());
 Ok(())
@@ -223,7 +223,7 @@ fn example_custom_content_media_type() -> Result<(), Box<dyn Error>> {
 });
 compiler.add_resource(schema_url, schema)?;
 let sch_index = compiler.compile(schema_url, &mut schemas)?;
-let result = schemas.validate(&instance, sch_index);
+let result = schemas.validate(&instance, sch_index, None);
 assert!(result.is_ok());
 Ok(())


@@ -52,7 +52,7 @@ fn test_folder(suite: &str, folder: &str, draft: Draft) -> Result<(), Box<dyn Er
 let sch = compiler.compile(schema_url, &mut schemas)?;
 for test in group.tests {
 println!(" {}", test.description);
-match schemas.validate(&test.data, sch) {
+match schemas.validate(&test.data, sch, None) {
 Ok(_) => println!(" validation success"),
 Err(e) => {
 if let Some(sch) = test.output.basic {
@@ -64,7 +64,7 @@ fn test_folder(suite: &str, folder: &str, draft: Draft) -> Result<(), Box<dyn Er
 compiler.add_resource(schema_url, sch)?;
 let sch = compiler.compile(schema_url, &mut schemas)?;
 let basic: Value = serde_json::from_str(&e.basic_output().to_string())?;
-let result = schemas.validate(&basic, sch);
+let result = schemas.validate(&basic, sch, None);
 if let Err(e) = result {
 println!("{basic:#}\n");
 for line in format!("{e}").lines() {
@@ -83,7 +83,7 @@ fn test_folder(suite: &str, folder: &str, draft: Draft) -> Result<(), Box<dyn Er
 let sch = compiler.compile(schema_url, &mut schemas)?;
 let detailed: Value =
 serde_json::from_str(&e.detailed_output().to_string())?;
-let result = schemas.validate(&detailed, sch);
+let result = schemas.validate(&detailed, sch, None);
 if let Err(e) = result {
 println!("{detailed:#}\n");
 for line in format!("{e}").lines() {


@@ -90,7 +90,7 @@ fn test_file(suite: &str, path: &str, draft: Draft) -> Result<(), Box<dyn Error>
 let sch_index = compiler.compile(url, &mut schemas)?;
 for test in group.tests {
 println!(" {}", test.description);
-let result = schemas.validate(&test.data, sch_index);
+let result = schemas.validate(&test.data, sch_index, None);
 if let Err(e) = &result {
 for line in format!("{e}").lines() {
 println!(" {line}");


@@ -1 +1 @@
-1.0.37
+1.0.39