Compare commits

...

14 Commits

17 changed files with 970 additions and 581 deletions

3
.geminiignore Normal file

@ -0,0 +1,3 @@
/target/
/package/
.env

667
Cargo.lock generated

File diff suppressed because it is too large


@ -10,14 +10,14 @@ version = "0.1.0"
edition = "2024"
[dependencies]
pgrx = "0.15.0"
pgrx = "0.16.1"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
lazy_static = "1.5.0"
boon = { path = "validator" }
[dev-dependencies]
pgrx-tests = "0.15.0"
pgrx-tests = "0.16.1"
[lib]
crate-type = ["cdylib", "lib"]


@ -10,7 +10,29 @@ This document outlines the purpose of the `jspg` project, its architecture, and
The extension is designed for high-performance scenarios where schemas are defined once and used many times for validation. It achieves this through an in-memory cache.
1. **Caching:** A user first calls the `cache_json_schemas(enums, types, puncs)` SQL function. This function takes arrays of JSON objects representing different kinds of schemas within a larger application framework. It uses the vendored `boon` crate to compile all these schemas into an efficient internal format and stores them in a static, in-memory `SCHEMA_CACHE`. This cache is managed by a `RwLock` to allow concurrent reads during validation.
1. **Caching and Pre-processing:** A user first calls the `cache_json_schemas(enums, types, puncs)` SQL function. This function takes arrays of JSON objects representing different kinds of schemas:
- `enums`: Standalone enum schemas (e.g., for a `task_priority` list).
- `types`: Schemas for core application data models (e.g., `person`, `organization`). These may contain a `hierarchy` array for inheritance information.
- `puncs`: Schemas for API/function-specific requests and responses.
Before compiling, `jspg` performs a crucial **pre-processing step** for type hierarchies. It inspects each definition in the `types` array. If a type includes a `hierarchy` array (e.g., a `person` type with `["entity", "organization", "user", "person"]`), `jspg` uses this to build a map of "type families."
From this map, it generates new, virtual schemas on the fly. For example, for the `organization` type, it will generate a schema with `$id: "organization.family"` that contains an `enum` of all its descendant types, such as `["organization", "user", "person"]`.
This allows developers to write more flexible schemas. Instead of strictly requiring a `const` type, you can validate against an entire inheritance chain:
```json
// In an "organization" schema definition
"properties": {
"type": {
// Allows the 'type' field to be "organization", "user", or "person"
"$ref": "organization.family",
"override": true
}
}
```
Finally, all user-defined schemas and the newly generated `.family` schemas are passed to the vendored `boon` crate, compiled into an efficient internal format, and stored in a static, in-memory `SCHEMA_CACHE`. This cache is managed by a `RwLock` to allow for high-performance, concurrent reads during validation.
2. **Validation:** The `validate_json_schema(schema_id, instance)` SQL function is then used to validate a JSONB `instance` against a specific, pre-cached schema identified by its `$id`. This function looks up the compiled schema in the cache and runs the validation, returning a success response or a detailed error report.
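The hierarchy pre-processing described in step 1 can be illustrated with a minimal, standalone sketch. It is not the extension's code: `build_family_map` and `family_schema_id` are hypothetical names, the input is a plain list of `(name, hierarchy)` pairs rather than JSONB rows, and a `BTreeSet` is assumed in place of a `HashSet` so that the resulting enum order is deterministic.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: maps every ancestor named in a `hierarchy` array
/// to the set of types that list it, i.e. its "type family".
fn build_family_map(types: &[(&str, Vec<&str>)]) -> BTreeMap<String, BTreeSet<String>> {
    let mut families: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for (type_name, hierarchy) in types {
        for ancestor in hierarchy {
            // Each type registers itself under every entry of its own
            // hierarchy, which includes the type itself.
            families
                .entry(ancestor.to_string())
                .or_default()
                .insert(type_name.to_string());
        }
    }
    families
}

/// Hypothetical helper: the `$id` used for a generated family schema.
fn family_schema_id(base_type: &str) -> String {
    format!("{base_type}.family")
}
```

Given the four types from the example (`entity`, `organization`, `user`, `person`), the `organization` entry of the returned map holds `organization`, `person`, and `user`, which is the content of the `enum` in the generated `organization.family` schema.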
@ -57,9 +79,43 @@ A DropError object provides a clear, structured explanation of a validation fail
## `boon` Crate Modifications
The version of `boon` located in the `validator/` directory has been significantly modified to support runtime-based strict validation. The original `boon` crate only supports compile-time strictness and lacks the necessary mechanisms to propagate validation context correctly for our use case.
The version of `boon` located in the `validator/` directory has been significantly modified to support application-specific validation logic that goes beyond the standard JSON Schema specification.
### 1. Recursive Runtime Strictness Control
### 1. Property-Level Overrides for Inheritance
- **Problem:** A primary use case for this project is validating data models that use `$ref` to create inheritance chains (e.g., a `person` schema `$ref`s a `user` schema, which `$ref`s an `entity` schema). A common pattern is to use a `const` keyword on a `type` property to identify the specific model (e.g., `"type": {"const": "person"}`). However, standard JSON Schema applies a `$ref` together with its sibling keywords, so the composition behaves like `allOf`: every definition in the chain is combined with a logical AND. This creates an impossible condition where an instance's `type` property would need to be "person" AND "user" AND "entity" simultaneously.
- **Solution:** We've implemented a custom, explicit override mechanism. A new keyword, `"override": true`, can be added to any property definition within a schema.
```json
// person.json
{
"$id": "person",
"$ref": "user",
"properties": {
"type": { "const": "person", "override": true }
}
}
```
This signals to the validator that this definition of the `type` property should be the *only* one applied, and any definitions for `type` found in base schemas (like `user` or `entity`) should be ignored for the duration of this validation.
#### Key Changes
This was achieved by making the validator stateful, using a pattern already present in `boon` for handling `unevaluatedProperties`.
1. **Meta-Schema Update**: The meta-schema for Draft 2020-12 was modified to recognize `"override": true` as a valid keyword within a schema object, preventing the compiler from rejecting our custom schemas.
2. **Compiler Modification**: The schema compiler in `validator/src/compiler.rs` was updated. It now inspects sub-schemas within a `properties` keyword and, if it finds `"override": true`, it records the name of that property in a new `override_properties` `HashSet` on the compiled `Schema` struct.
3. **Stateful Validator with `Override` Context**: The core `Validator` in `validator/src/validator.rs` was modified to carry an `Override` context (a `HashSet` of property names) throughout the validation process.
- **Initialization**: When validation begins, the `Override` context is created and populated with the names of any properties that the top-level schema has marked with `override`.
- **Propagation**: As the validator descends through a `$ref` or `allOf`, this `Override` context is cloned and passed down. The child schema adds its own override properties to the set, ensuring that higher-level overrides are always maintained.
- **Enforcement**: In `obj_validate`, before a property is validated, the validator first checks if the property's name exists in the `Override` context it has received. If it does, it means a parent schema has already claimed responsibility for validating this property, so the child validator **skips** it entirely. This effectively achieves the "top-level wins" inheritance model.
This approach cleanly integrates our desired inheritance behavior directly into the validator with minimal and explicit deviation from the standard, avoiding the need for a complex, post-processing validation function like the old `walk_and_validate_refs`.
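The "top-level wins" behavior can be shown with a toy model. This is a sketch under stated assumptions, not the modified `boon` code: `ToySchema`, `ConstRule`, `validate`, and `example_schemas` are hypothetical names, instances are flat string maps, and the only supported keyword is a `const`-like equality check.

```rust
use std::collections::{HashMap, HashSet};

/// Toy property rule: the value must equal `expected` (like `const`).
struct ConstRule {
    expected: String,
    is_override: bool,
}

/// Toy schema: an optional base schema (like `$ref`) plus property rules.
struct ToySchema {
    base: Option<String>,
    properties: HashMap<String, ConstRule>,
}

/// Validates `instance` against `schema_id`, skipping every property that a
/// schema higher in the chain has already claimed through `override`.
fn validate(
    schemas: &HashMap<String, ToySchema>,
    schema_id: &str,
    instance: &HashMap<String, String>,
    inherited: &HashSet<String>,
) -> Vec<String> {
    let mut errors = Vec::new();
    let Some(schema) = schemas.get(schema_id) else {
        errors.push(format!("unknown schema '{schema_id}'"));
        return errors;
    };
    // Propagation: clone the received context and add this schema's overrides.
    let mut passed_down = inherited.clone();
    for (name, rule) in &schema.properties {
        if rule.is_override {
            passed_down.insert(name.clone());
        }
        // Enforcement: a parent already owns this property, so skip it here.
        if inherited.contains(name) {
            continue;
        }
        if let Some(actual) = instance.get(name) {
            if *actual != rule.expected {
                errors.push(format!("/{name}: expected '{}', got '{actual}'", rule.expected));
            }
        }
    }
    if let Some(base) = &schema.base {
        errors.extend(validate(schemas, base, instance, &passed_down));
    }
    errors
}

/// Builds the entity <- user <- person chain used as the running example.
fn example_schemas() -> HashMap<String, ToySchema> {
    let rule = |expected: &str, is_override: bool| ConstRule {
        expected: expected.to_string(),
        is_override,
    };
    let mut schemas = HashMap::new();
    for (id, base, is_override) in [
        ("entity", None, false),
        ("user", Some("entity"), true),
        ("person", Some("user"), true),
    ] {
        schemas.insert(
            id.to_string(),
            ToySchema {
                base: base.map(str::to_string),
                properties: HashMap::from([("type".to_string(), rule(id, is_override))]),
            },
        );
    }
    schemas
}
```

In this model an instance with `"type": "person"` passes validation against `person`, because the `const` checks in `user` and `entity` are skipped. An instance with `"type": "user"` validated against `person` yields exactly one error, from the top-level definition.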
### 2. Recursive Runtime Strictness Control
- **Problem:** The `jspg` project requires that certain schemas (specifically those for public `puncs` and global `type`s) enforce a strict "no extra properties" policy. This strictness needs to be decided at runtime and must cascade through the entire validation hierarchy, including all nested objects and `$ref` chains. A compile-time flag was unsuitable because it would incorrectly apply strictness to shared, reusable schemas.

15
flow

@ -11,7 +11,7 @@ source ./flows/rust
POSTGRES_VERSION="17"
POSTGRES_CONFIG_PATH="/opt/homebrew/opt/postgresql@${POSTGRES_VERSION}/bin/pg_config"
DEPENDENCIES+=(icu4c pkg-config "postgresql@${POSTGRES_VERSION}")
CARGO_DEPENDENCIES=(cargo-pgrx==0.15.0)
CARGO_DEPENDENCIES=(cargo-pgrx==0.16.1)
GITEA_ORGANIZATION="cellular"
GITEA_REPOSITORY="jspg"
@ -97,11 +97,16 @@ install() {
fi
}
test() {
test-jspg() {
info "Running jspg tests..."
cargo pgrx test "pg${POSTGRES_VERSION}" "$@" || return $?
}
test-validator() {
info "Running validator tests..."
cargo test -p boon --features "pgrx/pg${POSTGRES_VERSION}" "$@" || return $?
}
clean() {
info "Cleaning build artifacts..."
cargo clean || return $?
@ -111,7 +116,8 @@ jspg-usage() {
printf "prepare\tCheck OS, Cargo, and PGRX dependencies.\n"
printf "install\tBuild and install the extension locally (after prepare).\n"
printf "reinstall\tClean, build, and install the extension locally (after prepare).\n"
printf "test\t\tRun pgrx integration tests.\n"
printf "test-jspg\t\tRun pgrx integration tests.\n"
printf "test-validator\t\tRun validator integration tests.\n"
printf "clean\t\tRemove pgrx build artifacts.\n"
}
@ -121,7 +127,8 @@ jspg-flow() {
build) build; return $?;;
install) install; return $?;;
reinstall) clean && install; return $?;;
test) test "${@:2}"; return $?;;
test-jspg) test-jspg "${@:2}"; return $?;;
test-validator) test-validator "${@:2}"; return $?;;
clean) clean; return $?;;
*) return 1 ;;
esac

44
out.txt

@ -1,44 +0,0 @@
running 23 tests
 Building extension with features pg_test pg17
 Running command "/opt/homebrew/bin/cargo" "build" "--lib" "--features" "pg_test pg17" "--message-format=json-render-diagnostics"
 Installing extension
 Copying control file to /opt/homebrew/share/postgresql@17/extension/jspg.control
 Copying shared library to /opt/homebrew/lib/postgresql@17/jspg.dylib
 Finished installing jspg
test tests::pg_test_cache_invalid ... ok
test tests::pg_test_validate_nested_req_deps ... ok
test tests::pg_test_validate_format_empty_string_with_ref ... ok
test tests::pg_test_validate_format_normal ... ok
test tests::pg_test_validate_format_empty_string ... ok
test tests::pg_test_validate_dependencies ... ok
test tests::pg_test_validate_dependencies_merging ... ok
test tests::pg_test_validate_additional_properties ... ok
test tests::pg_test_validate_enum_schema ... ok
test tests::pg_test_validate_errors ... ok
test tests::pg_test_validate_not_cached ... ok
test tests::pg_test_validate_oneof ... ok
test tests::pg_test_validate_punc_with_refs ... ok
test tests::pg_test_validate_property_merging ... ok
test tests::pg_test_validate_punc_local_refs ... ok
test tests::pg_test_validate_required_merging ... ok
test tests::pg_test_validate_required ... ok
test tests::pg_test_validate_simple ... ok
test tests::pg_test_validate_root_types ... ok
test tests::pg_test_validate_strict ... ok
test tests::pg_test_validate_title_override ... ok
test tests::pg_test_validate_unevaluated_properties ... ok
test tests::pg_test_validate_type_matching ... ok
test result: ok. 23 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.66s
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s


@ -7,12 +7,13 @@ use lazy_static::lazy_static;
use serde_json::{json, Value, Number};
use std::borrow::Cow;
use std::collections::hash_map::Entry;
use std::{collections::HashMap, sync::RwLock};
use std::{collections::{HashMap, HashSet}, sync::RwLock};
#[derive(Clone, Copy, Debug, PartialEq)]
enum SchemaType {
Enum,
Type,
Family, // Added for generated hierarchy schemas
PublicPunc,
PrivatePunc,
}
@ -20,7 +21,6 @@ enum SchemaType {
struct Schema {
index: SchemaIndex,
t: SchemaType,
value: Value,
}
struct Cache {
@ -77,9 +77,11 @@ fn cache_json_schemas(enums: JsonB, types: JsonB, puncs: JsonB) -> JsonB {
}
}
// Phase 2: Types
// Phase 2: Types & Hierarchy Pre-processing
let mut hierarchy_map: HashMap<String, HashSet<String>> = HashMap::new();
if let Some(types_array) = types_value.as_array() {
for type_row in types_array {
// Process main schemas for the type
if let Some(schemas_raw) = type_row.get("schemas") {
if let Some(schemas_array) = schemas_raw.as_array() {
for schema_def in schemas_array {
@ -89,9 +91,37 @@ fn cache_json_schemas(enums: JsonB, types: JsonB, puncs: JsonB) -> JsonB {
}
}
}
// Process hierarchy to build .family enums
if let Some(type_name) = type_row.get("name").and_then(|v| v.as_str()) {
if let Some(hierarchy_raw) = type_row.get("hierarchy") {
if let Some(hierarchy_array) = hierarchy_raw.as_array() {
for ancestor_val in hierarchy_array {
if let Some(ancestor_name) = ancestor_val.as_str() {
hierarchy_map
.entry(ancestor_name.to_string())
.or_default()
.insert(type_name.to_string());
}
}
}
}
}
}
}
// Generate and add the .family schemas
for (base_type, descendant_types) in hierarchy_map {
let family_schema_id = format!("{}.family", base_type);
let enum_values: Vec<String> = descendant_types.into_iter().collect();
let family_schema = json!({
"$id": family_schema_id,
"type": "string",
"enum": enum_values
});
schemas_to_compile.push((family_schema_id, family_schema, SchemaType::Family));
}
// Phase 3: Puncs
if let Some(puncs_array) = puncs_value.as_array() {
for punc_row in puncs_array {
@ -166,7 +196,7 @@ fn compile_all_schemas(
for (id, value, schema_type) in schemas_to_compile {
match compiler.compile(id, &mut cache.schemas) {
Ok(index) => {
cache.map.insert(id.clone(), Schema { index, t: *schema_type, value: value.clone() });
cache.map.insert(id.clone(), Schema { index, t: *schema_type });
}
Err(e) => {
match &e {
@ -189,104 +219,6 @@ fn compile_all_schemas(
}
}
fn walk_and_validate_refs(
instance: &Value,
schema: &Value,
cache: &std::sync::RwLockReadGuard<Cache>,
path_parts: &mut Vec<String>,
type_validated: bool,
top_level_id: Option<&str>,
errors: &mut Vec<Value>,
) {
if let Some(ref_url) = schema.get("$ref").and_then(|v| v.as_str()) {
if let Some(s) = cache.map.get(ref_url) {
let mut new_type_validated = type_validated;
if !type_validated && s.t == SchemaType::Type {
let id_to_use = top_level_id.unwrap_or(ref_url);
let expected_type = id_to_use.split('.').next().unwrap_or(id_to_use);
if let Some(actual_type) = instance.get("type").and_then(|v| v.as_str()) {
if actual_type == expected_type {
new_type_validated = true;
} else {
path_parts.push("type".to_string());
let path = format!("/{}", path_parts.join("/"));
path_parts.pop();
errors.push(json!({
"code": "TYPE_MISMATCH",
"message": format!("Instance type '{}' does not match expected type '{}' derived from schema $ref", actual_type, expected_type),
"details": { "path": path, "context": instance, "cause": { "expected": expected_type, "actual": actual_type }, "schema": ref_url }
}));
}
} else {
if top_level_id.is_some() {
let path = if path_parts.is_empty() { "".to_string() } else { format!("/{}", path_parts.join("/")) };
errors.push(json!({
"code": "TYPE_MISMATCH",
"message": "Instance is missing 'type' property required for schema validation",
"details": { "path": path, "context": instance, "cause": { "expected": expected_type }, "schema": ref_url }
}));
}
}
}
walk_and_validate_refs(instance, &s.value, cache, path_parts, new_type_validated, None, errors);
}
}
if let Some(properties) = schema.get("properties").and_then(|v| v.as_object()) {
for (prop_name, prop_schema) in properties {
if let Some(prop_value) = instance.get(prop_name) {
path_parts.push(prop_name.clone());
walk_and_validate_refs(prop_value, prop_schema, cache, path_parts, type_validated, None, errors);
path_parts.pop();
}
}
}
if let Some(items_schema) = schema.get("items") {
if let Some(instance_array) = instance.as_array() {
for (i, item) in instance_array.iter().enumerate() {
path_parts.push(i.to_string());
walk_and_validate_refs(item, items_schema, cache, path_parts, false, None, errors);
path_parts.pop();
}
}
}
if let Some(all_of_array) = schema.get("allOf").and_then(|v| v.as_array()) {
for sub_schema in all_of_array {
walk_and_validate_refs(instance, sub_schema, cache, path_parts, type_validated, None, errors);
}
}
if let Some(any_of_array) = schema.get("anyOf").and_then(|v| v.as_array()) {
for sub_schema in any_of_array {
walk_and_validate_refs(instance, sub_schema, cache, path_parts, type_validated, None, errors);
}
}
if let Some(one_of_array) = schema.get("oneOf").and_then(|v| v.as_array()) {
for sub_schema in one_of_array {
walk_and_validate_refs(instance, sub_schema, cache, path_parts, type_validated, None, errors);
}
}
if let Some(if_schema) = schema.get("if") {
walk_and_validate_refs(instance, if_schema, cache, path_parts, type_validated, None, errors);
}
if let Some(then_schema) = schema.get("then") {
walk_and_validate_refs(instance, then_schema, cache, path_parts, type_validated, None, errors);
}
if let Some(else_schema) = schema.get("else") {
walk_and_validate_refs(instance, else_schema, cache, path_parts, type_validated, None, errors);
}
if let Some(not_schema) = schema.get("not") {
walk_and_validate_refs(instance, not_schema, cache, path_parts, type_validated, None, errors);
}
}
#[pg_extern(strict, parallel_safe)]
fn validate_json_schema(schema_id: &str, instance: JsonB) -> JsonB {
let cache = SCHEMA_CACHE.read().unwrap();
@ -304,24 +236,13 @@ fn validate_json_schema(schema_id: &str, instance: JsonB) -> JsonB {
Some(schema) => {
let instance_value: Value = instance.0;
let options = match schema.t {
SchemaType::Type | SchemaType::PublicPunc => Some(ValidationOptions { be_strict: true }),
SchemaType::PublicPunc => Some(ValidationOptions { be_strict: true }),
_ => None,
};
match cache.schemas.validate(&instance_value, schema.index, options.as_ref()) {
match cache.schemas.validate(&instance_value, schema.index, options) {
Ok(_) => {
let mut custom_errors = Vec::new();
if schema.t == SchemaType::Type || schema.t == SchemaType::PublicPunc || schema.t == SchemaType::PrivatePunc {
let mut path_parts = vec![];
let top_level_id = if schema.t == SchemaType::Type { Some(schema_id) } else { None };
walk_and_validate_refs(&instance_value, &schema.value, &cache, &mut path_parts, false, top_level_id, &mut custom_errors);
}
if custom_errors.is_empty() {
JsonB(json!({ "response": "success" }))
} else {
JsonB(json!({ "errors": custom_errors }))
}
}
Err(validation_error) => {
let mut error_list = Vec::new();
@ -394,7 +315,7 @@ fn collect_errors(error: &ValidationError, errors_list: &mut Vec<Error>) {
ErrorKind::AnyOf => handle_any_of_error(&base_path),
ErrorKind::OneOf(matched) => handle_one_of_error(&base_path, matched),
};
errors_list.extend(errors_to_add);
}


@ -355,7 +355,16 @@ pub fn additional_properties_schemas() -> JsonB {
pub fn unevaluated_properties_schemas() -> JsonB {
let enums = json!([]);
let types = json!([]);
let types = json!([{
"name": "nested_for_uneval",
"schemas": [{
"$id": "nested_for_uneval",
"type": "object",
"properties": {
"deep_prop": { "type": "string" }
}
}]
}]);
let puncs = json!([
{
"name": "simple_unevaluated_test",
@ -396,6 +405,29 @@ pub fn unevaluated_properties_schemas() -> JsonB {
},
"unevaluatedProperties": false
}]
},
{
"name": "nested_unevaluated_test",
"public": true, // To trigger strict mode
"schemas": [{
"$id": "nested_unevaluated_test.request",
"type": "object",
"properties": {
"non_strict_branch": {
"type": "object",
"unevaluatedProperties": true, // The magic switch
"properties": {
"some_prop": { "$ref": "nested_for_uneval" }
}
},
"strict_branch": {
"type": "object",
"properties": {
"another_prop": { "type": "string" }
}
}
}
}]
}
]);
@ -433,7 +465,7 @@ pub fn property_merging_schemas() -> JsonB {
"properties": {
"id": { "type": "string" },
"name": { "type": "string" },
"type": { "type": "string" }
"type": { "type": "string", "const": "entity" }
},
"required": ["id"]
}]
@ -444,6 +476,7 @@ pub fn property_merging_schemas() -> JsonB {
"$id": "user",
"$ref": "entity",
"properties": {
"type": { "type": "string", "const": "user", "override": true },
"password": { "type": "string", "minLength": 8 }
},
"required": ["password"]
@ -455,6 +488,7 @@ pub fn property_merging_schemas() -> JsonB {
"$id": "person",
"$ref": "user",
"properties": {
"type": { "type": "string", "const": "person", "override": true },
"first_name": { "type": "string", "minLength": 1 },
"last_name": { "type": "string", "minLength": 1 }
},
@ -820,7 +854,10 @@ pub fn type_matching_schemas() -> JsonB {
"schemas": [{
"$id": "entity",
"type": "object",
"properties": { "type": { "type": "string" }, "name": { "type": "string" } },
"properties": {
"type": { "type": "string", "const": "entity" },
"name": { "type": "string" }
},
"required": ["type", "name"]
}]
},
@ -829,7 +866,10 @@ pub fn type_matching_schemas() -> JsonB {
"schemas": [{
"$id": "job",
"$ref": "entity",
"properties": { "job_id": { "type": "string" } },
"properties": {
"type": { "type": "string", "const": "job", "override": true },
"job_id": { "type": "string" }
},
"required": ["job_id"]
}]
},
@ -839,7 +879,10 @@ pub fn type_matching_schemas() -> JsonB {
{
"$id": "super_job",
"$ref": "job",
"properties": { "manager_id": { "type": "string" } },
"properties": {
"type": { "type": "string", "const": "super_job", "override": true },
"manager_id": { "type": "string" }
},
"required": ["manager_id"]
},
{
@ -875,4 +918,211 @@ pub fn type_matching_schemas() -> JsonB {
}]
}]);
cache_json_schemas(jsonb(enums), jsonb(types), jsonb(puncs))
}
}
pub fn union_schemas() -> JsonB {
let enums = json!([]);
let types = json!([
{
"name": "union_base",
"schemas": [{
"$id": "union_base",
"type": "object",
"properties": {
"type": { "type": "string", "const": "union_base" },
"id": { "type": "string" }
},
"required": ["type", "id"]
}]
},
{
"name": "union_a",
"schemas": [{
"$id": "union_a",
"$ref": "union_base",
"properties": {
"type": { "type": "string", "const": "union_a", "override": true },
"prop_a": { "type": "string" }
},
"required": ["prop_a"]
}]
},
{
"name": "union_b",
"schemas": [{
"$id": "union_b",
"$ref": "union_base",
"properties": {
"type": { "type": "string", "const": "union_b", "override": true },
"prop_b": { "type": "number" }
},
"required": ["prop_b"]
}]
},
{
"name": "union_c",
"schemas": [{
"$id": "union_c",
"$ref": "union_base",
"properties": {
"type": { "type": "string", "const": "union_c", "override": true },
"prop_c": { "type": "boolean" }
},
"required": ["prop_c"]
}]
}
]);
let puncs = json!([{
"name": "union_test",
"public": true,
"schemas": [{
"$id": "union_test.request",
"type": "object",
"properties": {
"union_prop": {
"oneOf": [
{ "$ref": "union_a" },
{ "$ref": "union_b" },
{ "$ref": "union_c" }
]
}
},
"required": ["union_prop"]
}]
}]);
cache_json_schemas(jsonb(enums), jsonb(types), jsonb(puncs))
}
pub fn nullable_union_schemas() -> JsonB {
let enums = json!([]);
let types = json!([
{
"name": "thing_base",
"schemas": [{
"$id": "thing_base",
"type": "object",
"properties": {
"type": { "type": "string", "const": "thing_base" },
"id": { "type": "string" }
},
"required": ["type", "id"]
}]
},
{
"name": "thing_a",
"schemas": [{
"$id": "thing_a",
"$ref": "thing_base",
"properties": {
"type": { "type": "string", "const": "thing_a", "override": true },
"prop_a": { "type": "string" }
},
"required": ["prop_a"]
}]
},
{
"name": "thing_b",
"schemas": [{
"$id": "thing_b",
"$ref": "thing_base",
"properties": {
"type": { "type": "string", "const": "thing_b", "override": true },
"prop_b": { "type": "string" }
},
"required": ["prop_b"]
}]
}
]);
let puncs = json!([{
"name": "nullable_union_test",
"public": true,
"schemas": [{
"$id": "nullable_union_test.request",
"type": "object",
"properties": {
"nullable_prop": {
"oneOf": [
{ "$ref": "thing_a" },
{ "$ref": "thing_b" },
{ "type": "null" }
]
}
},
"required": ["nullable_prop"]
}]
}]);
cache_json_schemas(jsonb(enums), jsonb(types), jsonb(puncs))
}
pub fn hierarchy_schemas() -> JsonB {
let enums = json!([]);
let types = json!([
{
"name": "entity",
"hierarchy": ["entity"],
"schemas": [{
"$id": "entity",
"type": "object",
"properties": {
"id": { "type": "string" },
"type": { "$ref": "entity.family", "override": true }
},
"required": ["id", "type"]
}]
},
{
"name": "organization",
"hierarchy": ["entity", "organization"],
"schemas": [{
"$id": "organization",
"$ref": "entity",
"properties": {
"type": { "$ref": "organization.family", "override": true },
"name": { "type": "string" }
},
"required": ["name"]
}]
},
{
"name": "user",
"hierarchy": ["entity", "organization", "user"],
"schemas": [{
"$id": "user",
"$ref": "organization",
"properties": {
"type": { "$ref": "user.family", "override": true },
"password": { "type": "string" }
},
"required": ["password"]
}]
},
{
"name": "person",
"hierarchy": ["entity", "organization", "user", "person"],
"schemas": [{
"$id": "person",
"$ref": "user",
"properties": {
"type": { "$ref": "person.family", "override": true },
"first_name": { "type": "string" }
},
"required": ["first_name"]
}]
}
]);
let puncs = json!([{
"name": "test_org_punc",
"public": false,
"schemas": [{
"$id": "test_org_punc.request",
"$ref": "organization"
}]
}]);
cache_json_schemas(jsonb(enums), jsonb(types), jsonb(puncs))
}


@ -452,6 +452,33 @@ fn test_validate_unevaluated_properties() {
let valid_result = validate_json_schema("simple_unevaluated_test.request", jsonb(valid_instance));
assert_success(&valid_result);
// Test 4: Test that unevaluatedProperties: true cascades down refs
let cascading_instance = json!({
"strict_branch": {
"another_prop": "is_ok"
},
"non_strict_branch": {
"extra_at_toplevel": "is_ok", // Extra property at this level
"some_prop": {
"deep_prop": "is_ok",
"extra_in_ref": "is_also_ok" // Extra property in the $ref'd schema
}
}
});
let cascading_result = validate_json_schema("nested_unevaluated_test.request", jsonb(cascading_instance));
assert_success(&cascading_result);
// Test 5: For good measure, test that the strict branch is still strict
let strict_fail_instance = json!({
"strict_branch": {
"another_prop": "is_ok",
"extra_in_strict": "is_not_ok"
}
});
let strict_fail_result = validate_json_schema("nested_unevaluated_test.request", jsonb(strict_fail_instance));
assert_error_count(&strict_fail_result, 1);
assert_has_error(&strict_fail_result, "ADDITIONAL_PROPERTIES_NOT_ALLOWED", "/strict_branch/extra_in_strict");
}
#[pg_test]
@ -797,8 +824,8 @@ fn test_validate_type_matching() {
"job_id": "job123"
});
let result_invalid_job = validate_json_schema("job", jsonb(invalid_job));
assert_error_count(&result_invalid_job, 1);
assert_has_error(&result_invalid_job, "TYPE_MISMATCH", "/type");
assert_failure(&result_invalid_job);
assert_has_error(&result_invalid_job, "CONST_VIOLATED", "/type");
// 2. Test 'super_job' which extends 'job'
let valid_super_job = json!({
@ -827,9 +854,8 @@ fn test_validate_type_matching() {
"manager_id": "mgr1"
});
let result_invalid_short = validate_json_schema("super_job.short", jsonb(invalid_short_super_job));
assert_error_count(&result_invalid_short, 1);
let error = find_error_with_code_and_path(&result_invalid_short, "TYPE_MISMATCH", "/type");
assert_error_message_contains(error, "Instance type 'job' does not match expected type 'super_job'");
assert_failure(&result_invalid_short);
assert_has_error(&result_invalid_short, "CONST_VIOLATED", "/type");
// 4. Test punc with root, nested, and oneOf type refs
let valid_punc_instance = json!({
@ -863,8 +889,8 @@ fn test_validate_type_matching() {
}
});
let result_invalid_punc_root = validate_json_schema("type_test_punc.request", jsonb(invalid_punc_root));
assert_error_count(&result_invalid_punc_root, 1);
assert_has_error(&result_invalid_punc_root, "TYPE_MISMATCH", "/root_job/type");
assert_failure(&result_invalid_punc_root);
assert_has_error(&result_invalid_punc_root, "CONST_VIOLATED", "/root_job/type");
// 6. Test invalid type at punc nested ref
let invalid_punc_nested = json!({
@ -882,8 +908,8 @@ fn test_validate_type_matching() {
}
});
let result_invalid_punc_nested = validate_json_schema("type_test_punc.request", jsonb(invalid_punc_nested));
assert_error_count(&result_invalid_punc_nested, 1);
assert_has_error(&result_invalid_punc_nested, "TYPE_MISMATCH", "/nested_or_super_job/my_job/type");
assert_failure(&result_invalid_punc_nested);
assert_has_error(&result_invalid_punc_nested, "CONST_VIOLATED", "/nested_or_super_job/my_job/type");
// 7. Test invalid type at punc oneOf ref
let invalid_punc_oneof = json!({
@ -900,6 +926,164 @@ fn test_validate_type_matching() {
}
});
let result_invalid_punc_oneof = validate_json_schema("type_test_punc.request", jsonb(invalid_punc_oneof));
// This will have multiple errors because the invalid oneOf branch will also fail the other branch's validation
assert_has_error(&result_invalid_punc_oneof, "TYPE_MISMATCH", "/nested_or_super_job/type");
}
assert_failure(&result_invalid_punc_oneof);
assert_has_error(&result_invalid_punc_oneof, "CONST_VIOLATED", "/nested_or_super_job/type");
}
#[pg_test]
fn test_validate_union_type_matching() {
let cache_result = union_schemas();
assert_success(&cache_result);
// 1. Test valid instance with type 'union_a'
let valid_instance_a = json!({
"union_prop": {
"id": "123",
"type": "union_a",
"prop_a": "hello"
}
});
let result_a = validate_json_schema("union_test.request", jsonb(valid_instance_a));
assert_success(&result_a);
// 2. Test valid instance with type 'union_b'
let valid_instance_b = json!({
"union_prop": {
"id": "456",
"type": "union_b",
"prop_b": 123
}
});
let result_b = validate_json_schema("union_test.request", jsonb(valid_instance_b));
assert_success(&result_b);
// 3. Test invalid instance - wrong type const in a valid oneOf branch
let invalid_sub_schema = json!({
"union_prop": {
"id": "789",
"type": "union_b", // Should be union_a
"prop_a": "hello"
}
});
let result_invalid_sub = validate_json_schema("union_test.request", jsonb(invalid_sub_schema));
assert_failure(&result_invalid_sub);
// This should fail because the `type` override in `union_a` is `const: "union_a"`
assert_has_error(&result_invalid_sub, "CONST_VIOLATED", "/union_prop/type");
// 4. Test invalid instance - base type, should fail due to override
let invalid_base_type = json!({
"union_prop": {
"id": "101",
"type": "union_base", // This is the base type, but the override should be enforced
"prop_a": "world"
}
});
let result_invalid_base = validate_json_schema("union_test.request", jsonb(invalid_base_type));
assert_failure(&result_invalid_base);
assert_has_error(&result_invalid_base, "CONST_VIOLATED", "/union_prop/type");
}
#[pg_test]
fn test_validate_nullable_union() {
let cache_result = nullable_union_schemas();
assert_success(&cache_result);
// 1. Test valid instance with object type 'thing_a'
let valid_object_a = json!({
"nullable_prop": {
"id": "123",
"type": "thing_a",
"prop_a": "hello"
}
});
let result_obj_a = validate_json_schema("nullable_union_test.request", jsonb(valid_object_a));
assert_success(&result_obj_a);
// 2. Test valid instance with object type 'thing_b'
let valid_object_b = json!({
"nullable_prop": {
"id": "456",
"type": "thing_b",
"prop_b": "goodbye"
}
});
let result_obj_b = validate_json_schema("nullable_union_test.request", jsonb(valid_object_b));
assert_success(&result_obj_b);
// 3. Test valid instance with null
let valid_null = json!({
"nullable_prop": null
});
let result_null = validate_json_schema("nullable_union_test.request", jsonb(valid_null));
assert_success(&result_null);
// 4. Test invalid instance - base type, should fail due to override
let invalid_base_type = json!({
"nullable_prop": {
"id": "789",
"type": "thing_base",
"prop_a": "should fail"
}
});
let result_invalid_base = validate_json_schema("nullable_union_test.request", jsonb(invalid_base_type));
assert_failure(&result_invalid_base);
assert_has_error(&result_invalid_base, "CONST_VIOLATED", "/nullable_prop/type");
// 5. Test invalid instance (e.g., a string)
let invalid_string = json!({
"nullable_prop": "not_an_object_or_null"
});
let result_invalid = validate_json_schema("nullable_union_test.request", jsonb(invalid_string));
assert_failure(&result_invalid);
assert_has_error(&result_invalid, "TYPE_MISMATCH", "/nullable_prop");
}
#[pg_test]
fn test_validate_type_hierarchy() {
clear_json_schemas();
let cache_result = hierarchy_schemas();
assert_success(&cache_result);
// 1. Test success case: validating a derived type (person) against a base schema (organization)
let person_instance = json!({
"id": "person-id",
"type": "person",
"name": "person-name",
"password": "person-password",
"first_name": "person-first-name"
});
let result_success = validate_json_schema("organization", jsonb(person_instance.clone()));
assert_success(&result_success);
// 2. Test success case: validating a base type (organization) against its own schema
let org_instance = json!({
"id": "org-id",
"type": "organization",
"name": "org-name"
});
let result_org_success = validate_json_schema("organization", jsonb(org_instance));
assert_success(&result_org_success);
// 3. Test failure case: validating an ancestor type (entity) against a derived schema (organization)
let entity_instance = json!({
"id": "entity-id",
"type": "entity"
});
let result_fail_ancestor = validate_json_schema("organization", jsonb(entity_instance));
assert_failure(&result_fail_ancestor);
assert_has_error(&result_fail_ancestor, "ENUM_VIOLATED", "/type");
// 4. Test failure case: validating a completely unrelated type
let unrelated_instance = json!({
"id": "job-id",
"type": "job",
"name": "job-name"
});
let result_fail_unrelated = validate_json_schema("organization", jsonb(unrelated_instance));
assert_failure(&result_fail_unrelated);
assert_has_error(&result_fail_unrelated, "ENUM_VIOLATED", "/type");
// 5. Test that the punc using the schema also works
let punc_success = validate_json_schema("test_org_punc.request", jsonb(person_instance.clone()));
assert_success(&punc_success);
}
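
The hierarchy test above relies on "type families": a schema such as `organization` accepts any instance whose `type` is `organization` or one of its descendants, and rejects ancestors and unrelated types. The sketch below shows one way such a map could be derived from `hierarchy` arrays. It is an illustration only, not the code in `jspg`: the names `build_type_families` and `is_allowed` are hypothetical, and the hierarchies used for `entity`, `organization`, and `user` in the usage note are assumed (only the `person` hierarchy is documented).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: maps each type name to the set of concrete types
/// accepted in the `type` field when validating against that type's schema.
/// Each hierarchy is assumed to run from the root ancestor to the type itself.
fn build_type_families(hierarchies: &[Vec<&str>]) -> BTreeMap<String, BTreeSet<String>> {
    let mut families: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for hierarchy in hierarchies {
        // The last entry names the concrete type this hierarchy describes.
        let Some(concrete) = hierarchy.last() else { continue };
        // The concrete type joins the family of every entry, itself included.
        for ancestor in hierarchy {
            families
                .entry(ancestor.to_string())
                .or_default()
                .insert(concrete.to_string());
        }
    }
    families
}

/// Returns true when `instance_type` belongs to the family of `schema`.
fn is_allowed(
    families: &BTreeMap<String, BTreeSet<String>>,
    schema: &str,
    instance_type: &str,
) -> bool {
    families
        .get(schema)
        .map_or(false, |family| family.contains(instance_type))
}
```

Under these assumptions, feeding in the hierarchies `["entity"]`, `["entity", "organization"]`, `["entity", "organization", "user"]`, and `["entity", "organization", "user", "person"]` makes `person` acceptable for the `organization` schema while `entity` and `job` are not, which mirrors the `ENUM_VIOLATED` expectations in the test.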

@ -1,7 +1,7 @@
[package]
name = "boon"
version = "0.6.1"
edition = "2021"
edition = "2024"
description = "JSONSchema (draft 2020-12, draft 2019-09, draft-7, draft-6, draft-4) Validation"
readme = "README.md"
repository = "https://github.com/santhosh-tekuri/boon"
@ -12,27 +12,27 @@ categories = ["web-programming"]
exclude = [ "tests", ".github", ".gitmodules" ]
[dependencies]
pgrx = "0.15.0"
pgrx = "0.16.1"
serde = "1"
serde_json = "1"
regex = "1.10.3"
regex-syntax = "0.8.2"
regex = "1.12.2"
regex-syntax = "0.8.8"
url = "2"
fluent-uri = "0.3.2"
idna = "1.0"
fluent-uri = "0.4.1"
idna = "1.1"
percent-encoding = "2"
once_cell = "1"
base64 = "0.22"
ahash = "0.8.3"
ahash = "0.8.12"
appendlist = "1.4"
[dev-dependencies]
pgrx-tests = "0.15.0"
pgrx-tests = "0.16.1"
serde = { version = "1.0", features = ["derive"] }
serde_yaml = "0.9"
ureq = "2.12"
ureq = "3.1"
rustls = "0.23"
criterion = "0.5"
criterion = "0.7"
[[bench]]
name = "bench"

@ -370,7 +370,21 @@ impl ObjCompiler<'_, '_, '_, '_, '_, '_> {
}
}
s.properties = self.enqueue_map("properties");
if let Some(Value::Object(props_obj)) = self.value("properties") {
let mut properties = AHashMap::with_capacity(props_obj.len());
for (pname, pvalue) in props_obj {
let ptr = self.up.ptr.append2("properties", pname);
let sch_idx = self.enqueue_schema(ptr);
properties.insert(pname.clone(), sch_idx);
if let Some(prop_schema_obj) = pvalue.as_object() {
if let Some(Value::Bool(true)) = prop_schema_obj.get("override") {
s.override_properties.insert(pname.clone());
}
}
}
s.properties = properties;
}
s.pattern_properties = {
let mut v = vec![];
if let Some(Value::Object(obj)) = self.value("patternProperties") {

@ -137,7 +137,7 @@ impl Visitor for Translator<'_> {
Ast::ClassPerl(perl) => {
self.replace_class_class(perl);
}
Ast::Literal(ref literal) => {
Ast::Literal(literal) => {
if let Literal {
kind: LiteralKind::Special(SpecialLiteralKind::Bell),
..

@ -129,7 +129,7 @@ pub use {
use std::{borrow::Cow, collections::HashMap, error::Error, fmt::Display};
use ahash::AHashMap;
use ahash::{AHashMap, AHashSet};
use regex::Regex;
use serde_json::{Number, Value};
use util::*;
@ -194,7 +194,7 @@ impl Schemas {
&'s self,
v: &'v Value,
sch_index: SchemaIndex,
options: Option<&'s ValidationOptions>,
options: Option<ValidationOptions>,
) -> Result<(), ValidationError<'s, 'v>> {
let Some(sch) = self.list.get(sch_index.0) else {
panic!("Schemas::validate: schema index out of bounds");
@ -238,6 +238,7 @@ struct Schema {
max_properties: Option<usize>,
required: Vec<String>,
properties: AHashMap<String, SchemaIndex>,
override_properties: AHashSet<String>,
pattern_properties: Vec<(Regex, SchemaIndex)>,
property_names: Option<SchemaIndex>,
additional_properties: Option<Additional>,

@ -24,6 +24,9 @@
"type": ["object", "boolean"],
"$comment": "This meta-schema also defines keywords that have appeared in previous drafts in order to prevent incompatible extensions as they remain in common use.",
"properties": {
"override": {
"type": "boolean"
},
"definitions": {
"$comment": "\"definitions\" has been replaced by \"$defs\".",
"type": "object",

@ -444,8 +444,8 @@ impl Hash for HashedValue<'_> {
fn hash<H: Hasher>(&self, state: &mut H) {
match self.0 {
Value::Null => state.write_u32(3_221_225_473), // chosen randomly
Value::Bool(ref b) => b.hash(state),
Value::Number(ref num) => {
Value::Bool(b) => b.hash(state),
Value::Number(num) => {
if let Some(num) = num.as_f64() {
num.to_bits().hash(state);
} else if let Some(num) = num.as_u64() {
@ -454,13 +454,13 @@ impl Hash for HashedValue<'_> {
num.hash(state);
}
}
Value::String(ref str) => str.hash(state),
Value::Array(ref arr) => {
Value::String(str) => str.hash(state),
Value::Array(arr) => {
for item in arr {
HashedValue(item).hash(state);
}
}
Value::Object(ref obj) => {
Value::Object(obj) => {
let mut hash = 0;
for (pname, pvalue) in obj {
// We have no way of building a new hasher of type `H`, so we
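
The `ref` removals in this hunk (and in the `Ast::Literal` arm earlier) are most likely tied to the move to `edition = "2024"`: when the matched value is already a reference, bindings are taken by reference automatically, and the 2024 edition rejects a redundant explicit `ref` in that position. The sketch below shows the resulting style on a stand-in enum. `Item` and `describe` are hypothetical names, not part of `boon`.

```rust
/// Stand-in for a JSON-like value; not `serde_json::Value`.
enum Item {
    Flag(bool),
    Text(String),
    List(Vec<Item>),
}

/// `item` is a reference, so `b`, `s`, and `xs` are bound by reference
/// through default binding modes; no `ref` keyword is needed.
fn describe(item: &Item) -> String {
    match item {
        Item::Flag(b) => format!("flag:{b}"),
        Item::Text(s) => format!("text:{s}"),
        Item::List(xs) => format!("list:{}", xs.len()),
    }
}
```

This form compiles on both the 2021 and 2024 editions, so the change is behavior-preserving.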

@ -1,9 +1,13 @@
use std::{borrow::Cow, cmp::min, collections::HashSet, fmt::Write};
use ahash::AHashSet;
use serde_json::{Map, Value};
use crate::{util::*, *};
#[derive(Default, Clone)]
struct Override<'s>(AHashSet<&'s str>);
macro_rules! prop {
($prop:expr) => {
InstanceToken::Prop(Cow::Borrowed($prop))
@ -20,7 +24,7 @@ pub(crate) fn validate<'s, 'v>(
v: &'v Value,
schema: &'s Schema,
schemas: &'s Schemas,
options: Option<&'s ValidationOptions>,
options: Option<ValidationOptions>,
) -> Result<(), ValidationError<'s, 'v>> {
let scope = Scope {
sch: schema.idx,
@ -29,7 +33,7 @@ pub(crate) fn validate<'s, 'v>(
parent: None,
};
let mut vloc = Vec::with_capacity(8);
let be_strict = options.map_or(false, |o| o.be_strict);
let options = options.unwrap_or_default();
let (result, _) = Validator {
v,
vloc: &mut vloc,
@ -37,7 +41,8 @@ pub(crate) fn validate<'s, 'v>(
schemas,
scope,
options,
uneval: Uneval::from(v, schema, be_strict),
overrides: Override::default(), // Start with an empty override context
uneval: Uneval::from(v, schema, options.be_strict),
errors: vec![],
bool_result: false,
}
@ -89,7 +94,8 @@ struct Validator<'v, 's, 'd, 'e> {
schema: &'s Schema,
schemas: &'s Schemas,
scope: Scope<'d>,
options: Option<&'s ValidationOptions>,
options: ValidationOptions,
overrides: Override<'s>,
uneval: Uneval<'v>,
errors: Vec<ValidationError<'s, 'v>>,
bool_result: bool, // is interested to know valid or not (but not actual error)
@ -190,7 +196,7 @@ impl<'v, 's> Validator<'v, 's, '_, '_> {
}
// type specific validations
impl<'v> Validator<'v, '_, '_, '_> {
impl<'v> Validator<'v, '_, '_,'_> {
fn obj_validate(&mut self, obj: &'v Map<String, Value>) {
let s = self.schema;
macro_rules! add_err {
@ -244,6 +250,11 @@ impl<'v> Validator<'v, '_, '_, '_> {
let mut additional_props = vec![];
for (pname, pvalue) in obj {
if self.overrides.0.contains(pname.as_str()) {
self.uneval.props.remove(pname);
continue;
}
if self.bool_result && !self.errors.is_empty() {
return;
}
@ -296,7 +307,7 @@ impl<'v> Validator<'v, '_, '_, '_> {
if let Some(sch) = &s.property_names {
for pname in obj.keys() {
let v = Value::String(pname.to_owned());
if let Err(mut e) = self.schemas.validate(&v, *sch, self.options) {
if let Err(mut e) = self.schemas.validate(&v, *sch, Some(self.options)) {
e.schema_url = &s.loc;
e.kind = ErrorKind::PropertyName {
prop: pname.to_owned(),
@ -510,7 +521,7 @@ impl<'v> Validator<'v, '_, '_, '_> {
// contentSchema --
if let (Some(sch), Some(v)) = (s.content_schema, deserialized) {
if let Err(mut e) = self.schemas.validate(&v, sch, self.options) {
if let Err(mut e) = self.schemas.validate(&v, sch, Some(self.options)) {
e.schema_url = &s.loc;
e.kind = kind!(ContentSchema);
self.errors.push(e.clone_static());
@ -762,8 +773,6 @@ impl Validator<'_, '_, '_, '_> {
};
}
let be_strict = self.options.map_or(false, |o| o.be_strict);
// unevaluatedProperties --
if let Value::Object(obj) = v {
if let Some(sch_idx) = s.unevaluated_properties {
@ -786,7 +795,7 @@ impl Validator<'_, '_, '_, '_> {
}
self.uneval.props.clear();
}
} else if be_strict && !self.bool_result {
} else if self.options.be_strict && !self.bool_result {
// 2. Runtime strictness check
if !self.uneval.props.is_empty() {
let props: Vec<Cow<str>> = self.uneval.props.iter().map(|p| Cow::from((*p).as_str())).collect();
@ -824,15 +833,33 @@ impl<'v, 's> Validator<'v, 's, '_, '_> {
}
let scope = self.scope.child(sch, None, self.scope.vid + 1);
let schema = &self.schemas.get(sch);
let be_strict = self.options.map_or(false, |o| o.be_strict);
// Check if the new schema turns off strictness
let allows_unevaluated = schema.boolean == Some(true) ||
if let Some(idx) = schema.unevaluated_properties {
self.schemas.get(idx).boolean == Some(true)
} else {
false
};
let mut new_options = self.options;
if allows_unevaluated {
new_options.be_strict = false;
}
let mut overrides = Override::default();
for pname in &schema.override_properties {
overrides.0.insert(pname.as_str());
}
let (result, _reply) = Validator {
v,
vloc: self.vloc,
schema,
schemas: self.schemas,
scope,
options: self.options,
uneval: Uneval::from(v, schema, be_strict || !self.uneval.is_empty()),
options: new_options,
overrides,
uneval: Uneval::from(v, schema, new_options.be_strict || !self.uneval.is_empty()),
errors: vec![],
bool_result: self.bool_result,
}
@ -849,14 +876,32 @@ impl<'v, 's> Validator<'v, 's, '_, '_> {
) -> Result<(), ValidationError<'s, 'v>> {
let scope = self.scope.child(sch, ref_kw, self.scope.vid);
let schema = &self.schemas.get(sch);
let be_strict = self.options.map_or(false, |o| o.be_strict);
// Check if the new schema turns off strictness
let allows_unevaluated = schema.boolean == Some(true) ||
if let Some(idx) = schema.unevaluated_properties {
self.schemas.get(idx).boolean == Some(true)
} else {
false
};
let mut new_options = self.options;
if allows_unevaluated {
new_options.be_strict = false;
}
let mut overrides = self.overrides.clone();
for pname in &self.schema.override_properties {
overrides.0.insert(pname.as_str());
}
let (result, reply) = Validator {
v: self.v,
vloc: self.vloc,
schema,
schemas: self.schemas,
scope,
options: self.options,
options: new_options,
overrides,
uneval: self.uneval.clone(),
errors: vec![],
bool_result: self.bool_result || bool_result,
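
The reference-following path in this hunk can be read as: start from the overrides the current validator inherited, add the property names the referring schema flagged with `"override": true`, and let the referenced schema skip those names when it walks the instance. The following is a minimal sketch of that reading using only the standard library. The function names are hypothetical, and `&'static str` keys stand in for the borrowed `&'s str` names held by `Override<'s>`.

```rust
use std::collections::HashSet;

/// Overrides passed to a schema reached through a reference:
/// the inherited set plus the referring schema's own override names.
fn overrides_for_ref(
    inherited: &HashSet<&'static str>,
    referrer_overrides: &[&'static str],
) -> HashSet<&'static str> {
    let mut overrides = inherited.clone();
    overrides.extend(referrer_overrides.iter().copied());
    overrides
}

/// Instance keys the referenced schema still validates itself;
/// overridden keys are skipped, mirroring the `continue` in `obj_validate`.
fn keys_checked_by_target<'a>(
    instance_keys: &[&'a str],
    overrides: &HashSet<&'static str>,
) -> Vec<&'a str> {
    instance_keys
        .iter()
        .copied()
        .filter(|key| !overrides.contains(*key))
        .collect()
}
```

For example, if a derived schema overrides `type`, a referenced base schema would check only `id` and `prop_a` on an instance with keys `id`, `type`, and `prop_a`, leaving `type` to the derived schema's own constraint. That is consistent with the `CONST_VIOLATED` errors expected at `/nullable_prop/type` in the tests.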

@ -1 +1 @@
1.0.38
1.0.45