queryer fixes in place

2026-03-17 22:13:34 -04:00
parent 3d66a7fc3c
commit 091007006d
15 changed files with 1148 additions and 885 deletions

View File

@@ -43,7 +43,7 @@ JSPG implements specific extensions to the Draft 2020-12 standard to support the
 #### A. Polymorphism & Referencing (`$ref`, `$family`, and Native Types)
 * **Native Type Discrimination (`variations`)**: Schemas defined inside a Postgres `type` are Entities. The validator securely and implicitly manages their `"type"` property. If an entity inherits from `user`, incoming JSON can safely define `{"type": "person"}` without errors, thanks to `compiled_variations` inheritance.
 * **Structural Inheritance & Viral Infection (`$ref`)**: `$ref` is used exclusively for structural inheritance, *never* for union creation. A Punc request schema that `$ref`s an Entity virally inherits all physical database polymorphism rules for that target.
-* **Shape Polymorphism (`$family`)**: Auto-expands polymorphic API lists based on an abstract Descendants Graph. If `{"$family": "widget"}` is used, JSPG evaluates the JSON against every schema that `$ref`s widget.
+* **Shape Polymorphism (`$family`)**: Auto-expands polymorphic API lists based on an abstract **Descendants Graph**. If `{"$family": "widget"}` is used, the Validator dynamically identifies *every* schema in the registry that `$ref`s `widget` (e.g., `stock.widget`, `task.widget`) and evaluates the JSON against all of them.
 * **Strict Matches & Depth Heuristic**: Polymorphic structures MUST match exactly **one** schema permutation. If multiple inherited struct permutations pass, JSPG applies the **Depth Heuristic Tie-Breaker**, selecting the candidate deepest in the inheritance tree.
 #### B. Dot-Notation Schema Resolution & Database Mapping
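The Depth Heuristic Tie-Breaker in the hunk above reduces to a few lines. A minimal sketch, where `Candidate` and its `depth` field are illustrative stand-ins, not the actual JSPG types:

```rust
#[derive(Debug, PartialEq)]
pub struct Candidate {
    pub schema_id: String,
    pub depth: usize, // distance from the root of the inheritance tree
}

/// Of all schema permutations that validated, keep the one deepest in the
/// inheritance tree (hypothetical reduction of the tie-breaking rule).
pub fn pick_winner(candidates: Vec<Candidate>) -> Option<Candidate> {
    candidates.into_iter().max_by_key(|c| c.depth)
}
```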
@@ -103,6 +103,10 @@ The Queryer transforms Postgres into a pre-compiled Semantic Query Engine via th
 * **Array Inclusion**: `{"$in": [values]}`, `{"$nin": [values]}` use native `jsonb_array_elements_text()` bindings to enforce `IN` and `NOT IN` logic without runtime SQL injection risks.
 * **Text Matching (ILIKE)**: Evaluates `$eq` or `$ne` against string fields containing the `%` character natively into Postgres `ILIKE` and `NOT ILIKE` partial substring matches.
 * **Type Casting**: Safely resolves dynamic combinations by casting values instantly into the physical database types mapped in the schema (e.g. parsing `uuid` bindings to `::uuid`, formatting DateTimes to `::timestamptz`, and numbers to `::numeric`).
+* **Polymorphic SQL Generation (`$family`)**: Compiles `$family` properties by analyzing the **Physical Database Variations**, *not* the schema descendants.
+    * **The Dot Convention**: When a schema requests `$family: "target.schema"`, the compiler extracts the base type (e.g. `schema`) and looks up its Physical Table definition.
+    * **Multi-Table Branching**: If the Physical Table is a parent to other tables (e.g. `organization` has variations `["organization", "bot", "person"]`), the compiler generates a dynamic `CASE WHEN type = '...' THEN ...` query, expanding into `JOIN`s for each variation.
+    * **Single-Table Bypass**: If the Physical Table is a leaf node with only one variation (e.g. `person` has variations `["person"]`), the compiler cleanly bypasses `CASE` generation and compiles a simple `SELECT` across the base table, as all schema extensions (e.g. `light.person`, `full.person`) are guaranteed to reside in the exact same physical row.
 ### The Stem Engine
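The branching rule documented above hinges on a single check of the variation count. A simplified sketch, assuming a `compile_family` that only renders the query skeleton (the real compiler emits full `jsonb_build_object` joins, as in the t10.json fixture below):

```rust
/// Simplified sketch: one physical variation bypasses CASE generation,
/// several variations expand into one WHEN arm per variation.
pub fn compile_family(base_table: &str, variations: &[&str]) -> String {
    if variations.len() == 1 {
        // Single-Table Bypass: all schema extensions share the same row.
        format!("SELECT * FROM {}", base_table)
    } else {
        let mut sorted: Vec<&str> = variations.to_vec();
        sorted.sort(); // deterministic arm order
        let arms: Vec<String> = sorted
            .iter()
            .map(|v| format!("WHEN type = '{}' THEN (...)", v))
            .collect();
        format!("CASE {} ELSE NULL END", arms.join(" "))
    }
}
```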

File diff suppressed because it is too large

View File

@@ -21,15 +21,12 @@ impl Merger {
     }
     pub fn merge(&self, data: Value) -> crate::drop::Drop {
-        let mut val_resolved = Value::Null;
         let mut notifications_queue = Vec::new();
        let result = self.merge_internal(data, &mut notifications_queue);
-        match result {
-            Ok(val) => {
-                val_resolved = val;
-            }
+        let val_resolved = match result {
+            Ok(val) => val,
             Err(msg) => {
                 return crate::drop::Drop::with_errors(vec![crate::drop::Error {
                     code: "MERGE_FAILED".to_string(),
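The shape of this refactor is a common Rust pattern: bind the success value directly from the `match` and early-return on the error arm, instead of pre-declaring a mutable placeholder. A standalone sketch with plain `Result` standing in for the `Drop`/`Error` types:

```rust
/// Mirrors the control flow of the refactored `merge` above: the Ok arm
/// yields the binding, the Err arm returns early with a tagged error.
pub fn resolve(result: Result<i32, String>) -> Result<i32, String> {
    let val_resolved = match result {
        Ok(val) => val,
        Err(msg) => return Err(format!("MERGE_FAILED: {}", msg)),
    };
    Ok(val_resolved * 2) // arbitrary follow-up work on the bound value
}
```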

View File

@@ -179,32 +179,49 @@ impl SqlCompiler {
         }
         // Handle $family Polymorphism fallbacks for relations
         if let Some(family_target) = &schema.obj.family {
-            let mut all_targets = vec![family_target.clone()];
-            if let Some(schema_id) = &schema.obj.id {
-                if let Some(descendants) = self.db.descendants.get(schema_id) {
-                    all_targets.extend(descendants.clone());
+            let base_type_name = family_target.split('.').next_back().unwrap_or(family_target).to_string();
+
+            if let Some(type_def) = self.db.types.get(&base_type_name) {
+                if type_def.variations.len() == 1 {
+                    let mut bypass_schema = crate::database::schema::Schema::default();
+                    bypass_schema.obj.r#ref = Some(family_target.clone());
+                    return self.walk_schema(
+                        &std::sync::Arc::new(bypass_schema),
+                        parent_alias,
+                        parent_table_aliases,
+                        parent_type_def,
+                        prop_name_context,
+                        filter_keys,
+                        is_stem_query,
+                        depth,
+                        current_path,
+                        alias_counter,
+                    );
                 }
-            }
-            let mut family_schemas = Vec::new();
-            for target in all_targets {
-                let mut ref_schema = crate::database::schema::Schema::default();
-                ref_schema.obj.r#ref = Some(target);
-                family_schemas.push(std::sync::Arc::new(ref_schema));
-            }
-            return self.compile_one_of(
-                &family_schemas,
-                parent_alias,
-                parent_table_aliases,
-                parent_type_def,
-                prop_name_context,
-                filter_keys,
-                is_stem_query,
-                depth,
-                current_path,
-                alias_counter,
-            );
+
+                let mut sorted_variations: Vec<String> = type_def.variations.iter().cloned().collect();
+                sorted_variations.sort();
+
+                let mut family_schemas = Vec::new();
+                for variation in &sorted_variations {
+                    let mut ref_schema = crate::database::schema::Schema::default();
+                    ref_schema.obj.r#ref = Some(variation.clone());
+                    family_schemas.push(std::sync::Arc::new(ref_schema));
+                }
+
+                return self.compile_one_of(
+                    &family_schemas,
+                    parent_alias,
+                    parent_table_aliases,
+                    parent_type_def,
+                    prop_name_context,
+                    filter_keys,
+                    is_stem_query,
+                    depth,
+                    current_path,
+                    alias_counter,
+                );
+            }
         }
         // Handle oneOf Polymorphism fallbacks for relations
@@ -305,45 +322,56 @@ impl SqlCompiler {
         // 2.5 Inject polymorphism directly into the query object
         if let Some(family_target) = &schema.obj.family {
-            let mut family_schemas = Vec::new();
-            if let Some(base_type) = self.db.types.get(family_target) {
-                let mut sorted_targets: Vec<String> = base_type.variations.iter().cloned().collect();
-                // Ensure the base type is included if not listed in variations by default
-                if !sorted_targets.contains(family_target) {
-                    sorted_targets.push(family_target.clone());
-                }
-                sorted_targets.sort();
-                for target in sorted_targets {
-                    let mut ref_schema = crate::database::schema::Schema::default();
-                    ref_schema.obj.r#ref = Some(target);
-                    family_schemas.push(std::sync::Arc::new(ref_schema));
-                }
-            } else {
-                // Fallback for types not strictly defined in physical DB
-                let mut ref_schema = crate::database::schema::Schema::default();
-                ref_schema.obj.r#ref = Some(family_target.clone());
-                family_schemas.push(std::sync::Arc::new(ref_schema));
-            }
-            let base_alias = table_aliases
-                .get(&type_def.name)
-                .cloned()
-                .unwrap_or_else(|| parent_alias.to_string());
-            select_args.push(format!("'id', {}.id", base_alias));
-            let (case_sql, _) = self.compile_one_of(
-                &family_schemas,
-                &base_alias,
-                Some(&table_aliases),
-                parent_type_def,
-                None,
-                filter_keys,
-                is_stem_query,
-                depth,
-                current_path.clone(),
-                alias_counter,
-            )?;
-            select_args.push(format!("'type', {}", case_sql));
+            let base_type_name = family_target.split('.').next_back().unwrap_or(family_target).to_string();
+
+            if let Some(fam_type_def) = self.db.types.get(&base_type_name) {
+                if fam_type_def.variations.len() == 1 {
+                    let mut bypass_schema = crate::database::schema::Schema::default();
+                    bypass_schema.obj.r#ref = Some(family_target.clone());
+                    let mut bypassed_args = self.map_properties_to_aliases(
+                        &bypass_schema,
+                        type_def,
+                        &table_aliases,
+                        parent_alias,
+                        filter_keys,
+                        is_stem_query,
+                        depth,
+                        &current_path,
+                        alias_counter,
+                    )?;
+                    select_args.append(&mut bypassed_args);
+                } else {
+                    let mut family_schemas = Vec::new();
+                    let mut sorted_fam_variations: Vec<String> = fam_type_def.variations.iter().cloned().collect();
+                    sorted_fam_variations.sort();
+                    for variation in &sorted_fam_variations {
+                        let mut ref_schema = crate::database::schema::Schema::default();
+                        ref_schema.obj.r#ref = Some(variation.clone());
+                        family_schemas.push(std::sync::Arc::new(ref_schema));
+                    }
+                    let base_alias = table_aliases
+                        .get(&type_def.name)
+                        .cloned()
+                        .unwrap_or_else(|| parent_alias.to_string());
+                    select_args.push(format!("'id', {}.id", base_alias));
+                    let (case_sql, _) = self.compile_one_of(
+                        &family_schemas,
+                        &base_alias,
+                        Some(&table_aliases),
+                        parent_type_def,
+                        None,
+                        filter_keys,
+                        is_stem_query,
+                        depth,
+                        current_path.clone(),
+                        alias_counter,
+                    )?;
+                    select_args.push(format!("'type', {}", case_sql));
+                }
+            }
         } else if let Some(one_of) = &schema.obj.one_of {
             let base_alias = table_aliases
                 .get(&type_def.name)
@@ -448,8 +476,11 @@ impl SqlCompiler {
         let mut select_args = Vec::new();
         let grouped_fields = type_def.grouped_fields.as_ref().and_then(|v| v.as_object());
         let merged_props = self.get_merged_properties(schema);
-        for (prop_key, prop_schema) in &merged_props {
+        let mut sorted_keys: Vec<&String> = merged_props.keys().collect();
+        sorted_keys.sort();
+        for prop_key in sorted_keys {
+            let prop_schema = &merged_props[prop_key];
             let mut owner_alias = table_aliases
                 .get("entity")
                 .cloned()
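Sorting the property keys before iteration matters because `HashMap` iteration order is unspecified in Rust, so SQL assembled from it would differ between runs and break fixture comparisons. A reduced sketch of the pattern (the helper name is illustrative):

```rust
use std::collections::HashMap;

/// Builds jsonb_build_object-style argument pairs in a stable order by
/// sorting the map's keys before iterating.
pub fn stable_select_args(props: &HashMap<String, String>) -> Vec<String> {
    let mut sorted_keys: Vec<&String> = props.keys().collect();
    sorted_keys.sort();
    sorted_keys
        .into_iter()
        .map(|k| format!("'{}', {}", k, props[k]))
        .collect()
}
```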
@@ -832,6 +863,8 @@ impl SqlCompiler {
             return Ok(("NULL".to_string(), "string".to_string()));
         }
+
+        case_statements.sort();
         let sql = format!("CASE {} ELSE NULL END", case_statements.join(" "));
         Ok((sql, "object".to_string()))

View File

@@ -2,7 +2,6 @@ use crate::*;
 pub mod runner;
 pub mod types;
 use serde_json::json;
-pub mod sql_validator;
 // Database module tests moved to src/database/executors/mock.rs

View File

@@ -1,19 +1,10 @@
+use crate::tests::types::Suite;
 use serde::Deserialize;
-use serde_json::Value;
 use std::collections::HashMap;
 use std::fs;
 use std::sync::{Arc, OnceLock, RwLock};
-#[derive(Debug, Deserialize)]
-pub struct TestSuite {
-    #[allow(dead_code)]
-    pub description: String,
-    pub database: serde_json::Value,
-    pub tests: Vec<TestCase>,
-}
+use crate::tests::types::TestCase;
+use serde_json::Value;
 pub fn deserialize_some<'de, D>(deserializer: D) -> Result<Option<Value>, D::Error>
 where
     D: serde::Deserializer<'de>,
@@ -23,7 +14,7 @@ where
 }
 // Type alias for easier reading
-type CompiledSuite = Arc<Vec<(TestSuite, Arc<crate::database::Database>)>>;
+type CompiledSuite = Arc<Vec<(Suite, Arc<crate::database::Database>)>>;
 // Global cache mapping filename -> Vector of (Parsed JSON suite, Compiled Database)
 static CACHE: OnceLock<RwLock<HashMap<String, CompiledSuite>>> = OnceLock::new();
@@ -46,7 +37,7 @@ fn get_cached_file(path: &str) -> CompiledSuite {
     } else {
         let content =
             fs::read_to_string(path).unwrap_or_else(|_| panic!("Failed to read file: {}", path));
-        let suites: Vec<TestSuite> = serde_json::from_str(&content)
+        let suites: Vec<Suite> = serde_json::from_str(&content)
             .unwrap_or_else(|e| panic!("Failed to parse JSON in {}: {}", path, e));
         let mut compiled_suites = Vec::new();
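The runner's cache follows a standard global-lazy-map pattern: a `OnceLock`-initialized `RwLock<HashMap>`, a read lock on the hot path, and a write lock only on a miss. A generic sketch, with `String` standing in for the compiled suite:

```rust
use std::collections::HashMap;
use std::sync::{Arc, OnceLock, RwLock};

static CACHE: OnceLock<RwLock<HashMap<String, Arc<String>>>> = OnceLock::new();

/// Returns the cached value for `key`, computing and storing it on a miss.
pub fn get_cached(key: &str) -> Arc<String> {
    let cache = CACHE.get_or_init(|| RwLock::new(HashMap::new()));
    if let Some(hit) = cache.read().unwrap().get(key) {
        return Arc::clone(hit);
    }
    // Miss: do the expensive work outside the read lock, then publish it.
    let value = Arc::new(format!("compiled:{}", key));
    cache
        .write()
        .unwrap()
        .entry(key.to_string())
        .or_insert(value)
        .clone()
}
```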

View File

@@ -1,194 +0,0 @@
use sqlparser::ast::{Expr, Query, SelectItem, Statement, TableFactor};
use sqlparser::dialect::PostgreSqlDialect;
use sqlparser::parser::Parser;
use std::collections::HashSet;
pub fn validate_semantic_sql(sql: &str) -> Result<(), String> {
let dialect = PostgreSqlDialect {};
let statements = match Parser::parse_sql(&dialect, sql) {
Ok(s) => s,
Err(e) => return Err(format!("SQL Syntax Error: {}\nSQL: {}", e, sql)),
};
for statement in statements {
validate_statement(&statement, sql)?;
}
Ok(())
}
fn validate_statement(stmt: &Statement, original_sql: &str) -> Result<(), String> {
match stmt {
Statement::Query(query) => validate_query(query, &HashSet::new(), original_sql)?,
Statement::Insert(insert) => {
if let Some(query) = &insert.source {
validate_query(query, &HashSet::new(), original_sql)?
}
}
Statement::Update(update) => {
if let Some(expr) = &update.selection {
validate_expr(expr, &HashSet::new(), original_sql)?;
}
}
Statement::Delete(delete) => {
if let Some(expr) = &delete.selection {
validate_expr(expr, &HashSet::new(), original_sql)?;
}
}
_ => {}
}
Ok(())
}
fn validate_query(
query: &Query,
available_aliases: &HashSet<String>,
original_sql: &str,
) -> Result<(), String> {
if let sqlparser::ast::SetExpr::Select(select) = &*query.body {
validate_select(&select, available_aliases, original_sql)?;
}
Ok(())
}
fn validate_select(
select: &sqlparser::ast::Select,
parent_aliases: &HashSet<String>,
original_sql: &str,
) -> Result<(), String> {
let mut available_aliases = parent_aliases.clone();
// 1. Collect all declared table aliases in the FROM clause and JOINs
for table_with_joins in &select.from {
collect_aliases_from_table_factor(&table_with_joins.relation, &mut available_aliases);
for join in &table_with_joins.joins {
collect_aliases_from_table_factor(&join.relation, &mut available_aliases);
}
}
// 2. Validate all SELECT projection fields
for projection in &select.projection {
if let SelectItem::UnnamedExpr(expr) | SelectItem::ExprWithAlias { expr, .. } = projection {
validate_expr(expr, &available_aliases, original_sql)?;
}
}
// 3. Validate ON conditions in joins
for table_with_joins in &select.from {
for join in &table_with_joins.joins {
if let sqlparser::ast::JoinOperator::Inner(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::LeftOuter(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::RightOuter(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::FullOuter(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::Join(sqlparser::ast::JoinConstraint::On(expr)) =
&join.join_operator
{
validate_expr(expr, &available_aliases, original_sql)?;
}
}
}
// 4. Validate WHERE conditions
if let Some(selection) = &select.selection {
validate_expr(selection, &available_aliases, original_sql)?;
}
Ok(())
}
fn collect_aliases_from_table_factor(tf: &TableFactor, aliases: &mut HashSet<String>) {
match tf {
TableFactor::Table { name, alias, .. } => {
if let Some(table_alias) = alias {
aliases.insert(table_alias.name.value.clone());
} else if let Some(last) = name.0.last() {
match last {
sqlparser::ast::ObjectNamePart::Identifier(i) => {
aliases.insert(i.value.clone());
}
_ => {}
}
}
}
TableFactor::Derived {
subquery,
alias: Some(table_alias),
..
} => {
aliases.insert(table_alias.name.value.clone());
// A derived table is technically a nested scope which is opaque outside, but for pure semantic checks
// its internal contents should be validated purely within its own scope (not leaking external aliases in, usually)
// but Postgres allows lateral correlation. We will validate its interior with an empty scope.
let _ = validate_query(subquery, &HashSet::new(), "");
}
_ => {}
}
}
fn validate_expr(
expr: &Expr,
available_aliases: &HashSet<String>,
sql: &str,
) -> Result<(), String> {
match expr {
Expr::CompoundIdentifier(idents) => {
if idents.len() == 2 {
let alias = &idents[0].value;
if !available_aliases.is_empty() && !available_aliases.contains(alias) {
return Err(format!(
"Semantic Error: Orchestrated query referenced table alias '{}' but it was not declared in the query's FROM/JOIN clauses.\nAvailable aliases: {:?}\nSQL: {}",
alias, available_aliases, sql
));
}
} else if idents.len() > 2 {
let alias = &idents[1].value; // In form schema.table.column, 'table' is idents[1]
if !available_aliases.is_empty() && !available_aliases.contains(alias) {
return Err(format!(
"Semantic Error: Orchestrated query referenced table '{}' but it was not mapped.\nAvailable aliases: {:?}\nSQL: {}",
alias, available_aliases, sql
));
}
}
}
Expr::Subquery(subquery) => validate_query(subquery, available_aliases, sql)?,
Expr::Exists { subquery, .. } => validate_query(subquery, available_aliases, sql)?,
Expr::InSubquery {
expr: e, subquery, ..
} => {
validate_expr(e, available_aliases, sql)?;
validate_query(subquery, available_aliases, sql)?;
}
Expr::BinaryOp { left, right, .. } => {
validate_expr(left, available_aliases, sql)?;
validate_expr(right, available_aliases, sql)?;
}
Expr::IsFalse(e)
| Expr::IsNotFalse(e)
| Expr::IsTrue(e)
| Expr::IsNotTrue(e)
| Expr::IsNull(e)
| Expr::IsNotNull(e)
| Expr::InList { expr: e, .. }
| Expr::Nested(e)
| Expr::UnaryOp { expr: e, .. }
| Expr::Cast { expr: e, .. }
| Expr::Like { expr: e, .. }
| Expr::ILike { expr: e, .. }
| Expr::AnyOp { left: e, .. }
| Expr::AllOp { left: e, .. } => {
validate_expr(e, available_aliases, sql)?;
}
Expr::Function(func) => {
if let sqlparser::ast::FunctionArguments::List(args) = &func.args {
if let Some(sqlparser::ast::FunctionArg::Unnamed(sqlparser::ast::FunctionArgExpr::Expr(
e,
))) = args.args.get(0)
{
validate_expr(e, available_aliases, sql)?;
}
}
}
_ => {}
}
Ok(())
}

View File

@@ -1,11 +1,11 @@
-use super::expect::ExpectBlock;
+use super::expect::Expect;
 use crate::database::Database;
 use serde::Deserialize;
 use serde_json::Value;
 use std::sync::Arc;
 #[derive(Debug, Deserialize)]
-pub struct TestCase {
+pub struct Case {
     pub description: String,
     #[serde(default = "default_action")]
@@ -30,14 +30,14 @@ pub struct TestCase {
     #[serde(default)]
     pub mocks: Option<serde_json::Value>,
-    pub expect: Option<ExpectBlock>,
+    pub expect: Option<Expect>,
 }
 fn default_action() -> String {
     "validate".to_string()
 }
-impl TestCase {
+impl Case {
     pub fn run_compile(&self, db: Arc<Database>) -> Result<(), String> {
         let expected_success = self.expect.as_ref().map(|e| e.success).unwrap_or(false);
@@ -138,6 +138,7 @@ impl TestCase {
             ))
         } else if let Some(expect) = &self.expect {
             let queries = db.executor.get_queries();
+            expect.assert_pattern(&queries)?;
             expect.assert_sql(&queries)
         } else {
             Ok(())
@@ -176,6 +177,7 @@ impl TestCase {
             ))
         } else if let Some(expect) = &self.expect {
             let queries = db.executor.get_queries();
+            expect.assert_pattern(&queries)?;
             expect.assert_sql(&queries)
         } else {
             Ok(())

View File

@@ -0,0 +1,22 @@
pub mod pattern;
pub mod sql;
use serde::Deserialize;
use std::collections::HashMap;
#[derive(Debug, Deserialize)]
#[serde(untagged)]
pub enum SqlExpectation {
    Single(String),
    Multi(Vec<String>),
}
#[derive(Debug, Deserialize)]
pub struct Expect {
    pub success: bool,
    pub result: Option<serde_json::Value>,
    pub errors: Option<Vec<serde_json::Value>>,
    pub stems: Option<HashMap<String, HashMap<String, serde_json::Value>>>,
    #[serde(default)]
    pub sql: Option<Vec<SqlExpectation>>,
}

View File

@@ -1,30 +1,13 @@
+use super::Expect;
 use regex::Regex;
-use serde::Deserialize;
 use std::collections::HashMap;
-#[derive(Debug, Deserialize)]
-#[serde(untagged)]
-pub enum SqlExpectation {
-    Single(String),
-    Multi(Vec<String>),
-}
-#[derive(Debug, Deserialize)]
-pub struct ExpectBlock {
-    pub success: bool,
-    pub result: Option<serde_json::Value>,
-    pub errors: Option<Vec<serde_json::Value>>,
-    pub stems: Option<HashMap<String, HashMap<String, serde_json::Value>>>,
-    #[serde(default)]
-    pub sql: Option<Vec<SqlExpectation>>,
-}
-impl ExpectBlock {
+impl Expect {
     /// Advanced SQL execution assertion algorithm ported from `assert.go`.
     /// This compares two arrays of strings, one containing {{uuid:name}} or {{timestamp}} placeholders,
     /// and the other containing actual executed database queries. It ensures that placeholder UUIDs
     /// are consistently mapped to the same actual UUIDs across all lines, and strictly validates line-by-line sequences.
-    pub fn assert_sql(&self, actual: &[String]) -> Result<(), String> {
+    pub fn assert_pattern(&self, actual: &[String]) -> Result<(), String> {
         let patterns = match &self.sql {
             Some(s) => s,
             None => return Ok(()),
@@ -39,12 +22,6 @@ impl ExpectBlock {
             ));
         }
-        for query in actual {
-            if let Err(e) = crate::tests::sql_validator::validate_semantic_sql(query) {
-                return Err(e);
-            }
-        }
         let ws_re = Regex::new(r"\s+").unwrap();
         let types = HashMap::from([
@@ -82,8 +59,8 @@ impl ExpectBlock {
             let aline = clean_str(aline_raw);
             let pattern_str_raw = match pattern_expect {
-                SqlExpectation::Single(s) => s.clone(),
-                SqlExpectation::Multi(m) => m.join(" "),
+                super::SqlExpectation::Single(s) => s.clone(),
+                super::SqlExpectation::Multi(m) => m.join(" "),
             };
             let pattern_str = clean_str(&pattern_str_raw);
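The core invariant of the ported `assert.go` algorithm is that a `{{uuid:name}}` placeholder, once matched against a concrete value, must resolve to that same value on every later line. That invariant can be isolated in a few std-only lines (a simplified sketch; the real implementation also handles `{{timestamp}}` and whitespace normalization):

```rust
use std::collections::HashMap;

/// Binds `name` to `actual`, failing if the placeholder was previously
/// bound to a different concrete value.
pub fn bind_placeholder(
    bindings: &mut HashMap<String, String>,
    name: &str,
    actual: &str,
) -> Result<(), String> {
    match bindings.get(name) {
        Some(prev) if prev == actual => Ok(()),
        Some(prev) => Err(format!(
            "placeholder {{{{uuid:{}}}}} already bound to '{}', saw '{}'",
            name, prev, actual
        )),
        None => {
            bindings.insert(name.to_string(), actual.to_string());
            Ok(())
        }
    }
}
```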

View File

@@ -0,0 +1,206 @@
use super::Expect;
use sqlparser::ast::{Expr, Query, SelectItem, Statement, TableFactor};
use sqlparser::dialect::PostgreSqlDialect;
use sqlparser::parser::Parser;
use std::collections::HashSet;
impl Expect {
pub fn assert_sql(&self, actual: &[String]) -> Result<(), String> {
for query in actual {
if let Err(e) = Self::validate_semantic_sql(query) {
return Err(e);
}
}
Ok(())
}
pub fn validate_semantic_sql(sql: &str) -> Result<(), String> {
let dialect = PostgreSqlDialect {};
let statements = match Parser::parse_sql(&dialect, sql) {
Ok(s) => s,
Err(e) => return Err(format!("SQL Syntax Error: {}\nSQL: {}", e, sql)),
};
for statement in statements {
Self::validate_statement(&statement, sql)?;
}
Ok(())
}
fn validate_statement(stmt: &Statement, original_sql: &str) -> Result<(), String> {
match stmt {
Statement::Query(query) => Self::validate_query(query, &HashSet::new(), original_sql)?,
Statement::Insert(insert) => {
if let Some(query) = &insert.source {
Self::validate_query(query, &HashSet::new(), original_sql)?
}
}
Statement::Update(update) => {
if let Some(expr) = &update.selection {
Self::validate_expr(expr, &HashSet::new(), original_sql)?;
}
}
Statement::Delete(delete) => {
if let Some(expr) = &delete.selection {
Self::validate_expr(expr, &HashSet::new(), original_sql)?;
}
}
_ => {}
}
Ok(())
}
fn validate_query(
query: &Query,
available_aliases: &HashSet<String>,
original_sql: &str,
) -> Result<(), String> {
if let sqlparser::ast::SetExpr::Select(select) = &*query.body {
Self::validate_select(&select, available_aliases, original_sql)?;
}
Ok(())
}
fn validate_select(
select: &sqlparser::ast::Select,
parent_aliases: &HashSet<String>,
original_sql: &str,
) -> Result<(), String> {
let mut available_aliases = parent_aliases.clone();
// 1. Collect all declared table aliases in the FROM clause and JOINs
for table_with_joins in &select.from {
Self::collect_aliases_from_table_factor(&table_with_joins.relation, &mut available_aliases);
for join in &table_with_joins.joins {
Self::collect_aliases_from_table_factor(&join.relation, &mut available_aliases);
}
}
// 2. Validate all SELECT projection fields
for projection in &select.projection {
if let SelectItem::UnnamedExpr(expr) | SelectItem::ExprWithAlias { expr, .. } = projection {
Self::validate_expr(expr, &available_aliases, original_sql)?;
}
}
// 3. Validate ON conditions in joins
for table_with_joins in &select.from {
for join in &table_with_joins.joins {
if let sqlparser::ast::JoinOperator::Inner(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::LeftOuter(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::RightOuter(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::FullOuter(sqlparser::ast::JoinConstraint::On(expr))
| sqlparser::ast::JoinOperator::Join(sqlparser::ast::JoinConstraint::On(expr)) =
&join.join_operator
{
Self::validate_expr(expr, &available_aliases, original_sql)?;
}
}
}
// 4. Validate WHERE conditions
if let Some(selection) = &select.selection {
Self::validate_expr(selection, &available_aliases, original_sql)?;
}
Ok(())
}
fn collect_aliases_from_table_factor(tf: &TableFactor, aliases: &mut HashSet<String>) {
match tf {
TableFactor::Table { name, alias, .. } => {
if let Some(table_alias) = alias {
aliases.insert(table_alias.name.value.clone());
} else if let Some(last) = name.0.last() {
match last {
sqlparser::ast::ObjectNamePart::Identifier(i) => {
aliases.insert(i.value.clone());
}
_ => {}
}
}
}
TableFactor::Derived {
subquery,
alias: Some(table_alias),
..
} => {
aliases.insert(table_alias.name.value.clone());
// A derived table is technically a nested scope which is opaque outside, but for pure semantic checks
// its internal contents should be validated purely within its own scope (not leaking external aliases in, usually)
// but Postgres allows lateral correlation. We will validate its interior with an empty scope.
let _ = Self::validate_query(subquery, &HashSet::new(), "");
}
_ => {}
}
}
fn validate_expr(
expr: &Expr,
available_aliases: &HashSet<String>,
sql: &str,
) -> Result<(), String> {
match expr {
Expr::CompoundIdentifier(idents) => {
if idents.len() == 2 {
let alias = &idents[0].value;
if !available_aliases.is_empty() && !available_aliases.contains(alias) {
return Err(format!(
"Semantic Error: Orchestrated query referenced table alias '{}' but it was not declared in the query's FROM/JOIN clauses.\nAvailable aliases: {:?}\nSQL: {}",
alias, available_aliases, sql
));
}
} else if idents.len() > 2 {
let alias = &idents[1].value; // In form schema.table.column, 'table' is idents[1]
if !available_aliases.is_empty() && !available_aliases.contains(alias) {
return Err(format!(
"Semantic Error: Orchestrated query referenced table '{}' but it was not mapped.\nAvailable aliases: {:?}\nSQL: {}",
alias, available_aliases, sql
));
}
}
}
Expr::Subquery(subquery) => Self::validate_query(subquery, available_aliases, sql)?,
Expr::Exists { subquery, .. } => Self::validate_query(subquery, available_aliases, sql)?,
Expr::InSubquery {
expr: e, subquery, ..
} => {
Self::validate_expr(e, available_aliases, sql)?;
Self::validate_query(subquery, available_aliases, sql)?;
}
Expr::BinaryOp { left, right, .. } => {
Self::validate_expr(left, available_aliases, sql)?;
Self::validate_expr(right, available_aliases, sql)?;
}
Expr::IsFalse(e)
| Expr::IsNotFalse(e)
| Expr::IsTrue(e)
| Expr::IsNotTrue(e)
| Expr::IsNull(e)
| Expr::IsNotNull(e)
| Expr::InList { expr: e, .. }
| Expr::Nested(e)
| Expr::UnaryOp { expr: e, .. }
| Expr::Cast { expr: e, .. }
| Expr::Like { expr: e, .. }
| Expr::ILike { expr: e, .. }
| Expr::AnyOp { left: e, .. }
| Expr::AllOp { left: e, .. } => {
Self::validate_expr(e, available_aliases, sql)?;
}
Expr::Function(func) => {
if let sqlparser::ast::FunctionArguments::List(args) = &func.args {
if let Some(sqlparser::ast::FunctionArg::Unnamed(sqlparser::ast::FunctionArgExpr::Expr(
e,
))) = args.args.get(0)
{
Self::validate_expr(e, available_aliases, sql)?;
}
}
}
_ => {}
}
Ok(())
}
}
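Stripped of the `sqlparser` AST traversal, the semantic rule this validator enforces is small: every `alias.column` reference must use an alias declared in the statement's FROM/JOIN scope. A toy std-only version of that check (names are illustrative):

```rust
use std::collections::HashSet;

/// Toy alias-scope check mirroring the CompoundIdentifier arm above:
/// each `alias.column` reference must name a declared table alias.
pub fn check_refs(declared: &HashSet<&str>, refs: &[&str]) -> Result<(), String> {
    for r in refs {
        if let Some((alias, _column)) = r.split_once('.') {
            // An empty scope is treated as opaque, like the real validator.
            if !declared.is_empty() && !declared.contains(alias) {
                return Err(format!(
                    "Semantic Error: alias '{}' not declared in FROM/JOIN",
                    alias
                ));
            }
        }
    }
    Ok(())
}
```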

View File

@@ -2,6 +2,6 @@ pub mod case;
 pub mod expect;
 pub mod suite;
-pub use case::TestCase;
-pub use expect::ExpectBlock;
-pub use suite::TestSuite;
+pub use case::Case;
+pub use expect::Expect;
+pub use suite::Suite;

View File

@@ -1,10 +1,10 @@
-use super::case::TestCase;
+use super::case::Case;
 use serde::Deserialize;
 #[derive(Debug, Deserialize)]
-pub struct TestSuite {
+pub struct Suite {
     #[allow(dead_code)]
     pub description: String,
     pub database: serde_json::Value,
-    pub tests: Vec<TestCase>,
+    pub tests: Vec<Case>,
 }

t10.json Normal file
View File

@@ -0,0 +1,54 @@
[
[
"(SELECT jsonb_build_object(",
" 'id', organization_1.id,",
" 'type', CASE",
" WHEN organization_1.type = 'person' THEN",
" ((SELECT jsonb_build_object(",
" 'age', person_3.age,",
" 'archived', entity_5.archived,",
" 'created_at', entity_5.created_at,",
" 'first_name', person_3.first_name,",
" 'id', entity_5.id,",
" 'last_name', person_3.last_name,",
" 'name', entity_5.name,",
" 'type', entity_5.type",
" )",
" FROM agreego.person person_3",
" JOIN agreego.organization organization_4 ON organization_4.id = person_3.id",
" JOIN agreego.entity entity_5 ON entity_5.id = organization_4.id",
" WHERE",
" NOT entity_5.archived))",
" WHEN organization_1.type = 'bot' THEN",
" ((SELECT jsonb_build_object(",
" 'archived', entity_8.archived,",
" 'created_at', entity_8.created_at,",
" 'id', entity_8.id,",
" 'name', entity_8.name,",
" 'token', bot_6.token,",
" 'type', entity_8.type",
" )",
" FROM agreego.bot bot_6",
" JOIN agreego.organization organization_7 ON organization_7.id = bot_6.id",
" JOIN agreego.entity entity_8 ON entity_8.id = organization_7.id",
" WHERE",
" NOT entity_8.archived))",
" WHEN organization_1.type = 'organization' THEN",
" ((SELECT jsonb_build_object(",
" 'archived', entity_10.archived,",
" 'created_at', entity_10.created_at,",
" 'id', entity_10.id,",
" 'name', entity_10.name,",
" 'type', entity_10.type",
" )",
" FROM agreego.organization organization_9",
" JOIN agreego.entity entity_10 ON entity_10.id = organization_9.id",
" WHERE",
" NOT entity_10.archived))",
" ELSE NULL END",
")",
"FROM agreego.organization organization_1",
"JOIN agreego.entity entity_2 ON entity_2.id = organization_1.id",
"WHERE NOT entity_2.archived)"
]
]

t4.json Normal file
View File

@ -0,0 +1,164 @@
[
[
"(SELECT jsonb_build_object(",
" 'addresses',",
" (SELECT COALESCE(jsonb_agg(jsonb_build_object(",
" 'archived', entity_6.archived,",
" 'created_at', entity_6.created_at,",
" 'id', entity_6.id,",
" 'is_primary', contact_4.is_primary,",
" 'name', entity_6.name,",
" 'target',",
" (SELECT jsonb_build_object(",
" 'archived', entity_8.archived,",
" 'city', address_7.city,",
" 'created_at', entity_8.created_at,",
" 'id', entity_8.id,",
" 'name', entity_8.name,",
" 'type', entity_8.type",
" )",
" FROM agreego.address address_7",
" JOIN agreego.entity entity_8 ON entity_8.id = address_7.id",
" WHERE",
" NOT entity_8.archived",
" AND relationship_5.target_id = address_7.id),",
" 'type', entity_6.type",
" )), '[]'::jsonb)",
" FROM agreego.contact contact_4",
" JOIN agreego.relationship relationship_5 ON relationship_5.id = contact_4.id",
" JOIN agreego.entity entity_6 ON entity_6.id = relationship_5.id",
" WHERE",
" NOT entity_6.archived",
" AND contact_4.parent_id = entity_3.id),",
" 'age', person_1.age,",
" 'archived', entity_3.archived,",
" 'contacts',",
" (SELECT COALESCE(jsonb_agg(jsonb_build_object(",
" 'archived', entity_11.archived,",
" 'created_at', entity_11.created_at,",
" 'id', entity_11.id,",
" 'is_primary', contact_9.is_primary,",
" 'name', entity_11.name,",
" 'target', CASE",
" WHEN entity_11.target_type = 'address' THEN",
" ((SELECT jsonb_build_object(",
" 'archived', entity_17.archived,",
" 'city', address_16.city,",
" 'created_at', entity_17.created_at,",
" 'id', entity_17.id,",
" 'name', entity_17.name,",
" 'type', entity_17.type",
" )",
" FROM agreego.address address_16",
" JOIN agreego.entity entity_17 ON entity_17.id = address_16.id",
" WHERE",
" NOT entity_17.archived",
" AND relationship_10.target_id = address_16.id))",
" WHEN entity_11.target_type = 'email_address' THEN",
" ((SELECT jsonb_build_object(",
" 'address', email_address_14.address,",
" 'archived', entity_15.archived,",
" 'created_at', entity_15.created_at,",
" 'id', entity_15.id,",
" 'name', entity_15.name,",
" 'type', entity_15.type",
" )",
" FROM agreego.email_address email_address_14",
" JOIN agreego.entity entity_15 ON entity_15.id = email_address_14.id",
" WHERE",
" NOT entity_15.archived",
" AND relationship_10.target_id = email_address_14.id))",
" WHEN entity_11.target_type = 'phone_number' THEN",
" ((SELECT jsonb_build_object(",
" 'archived', entity_13.archived,",
" 'created_at', entity_13.created_at,",
" 'id', entity_13.id,",
" 'name', entity_13.name,",
" 'number', phone_number_12.number,",
" 'type', entity_13.type",
" )",
" FROM agreego.phone_number phone_number_12",
" JOIN agreego.entity entity_13 ON entity_13.id = phone_number_12.id",
" WHERE",
" NOT entity_13.archived",
" AND relationship_10.target_id = phone_number_12.id))",
" ELSE NULL END,",
" 'type', entity_11.type",
" )), '[]'::jsonb)",
" FROM agreego.contact contact_9",
" JOIN agreego.relationship relationship_10 ON relationship_10.id = contact_9.id",
" JOIN agreego.entity entity_11 ON entity_11.id = relationship_10.id",
" WHERE",
" NOT entity_11.archived",
" AND contact_9.parent_id = entity_3.id),",
" 'created_at', entity_3.created_at,",
" 'email_addresses',",
" (SELECT COALESCE(jsonb_agg(jsonb_build_object(",
" 'archived', entity_20.archived,",
" 'created_at', entity_20.created_at,",
" 'id', entity_20.id,",
" 'is_primary', contact_18.is_primary,",
" 'name', entity_20.name,",
" 'target',",
" (SELECT jsonb_build_object(",
" 'address', email_address_21.address,",
" 'archived', entity_22.archived,",
" 'created_at', entity_22.created_at,",
" 'id', entity_22.id,",
" 'name', entity_22.name,",
" 'type', entity_22.type",
" )",
" FROM agreego.email_address email_address_21",
" JOIN agreego.entity entity_22 ON entity_22.id = email_address_21.id",
" WHERE",
" NOT entity_22.archived",
" AND relationship_19.target_id = email_address_21.id),",
" 'type', entity_20.type",
" )), '[]'::jsonb)",
" FROM agreego.contact contact_18",
" JOIN agreego.relationship relationship_19 ON relationship_19.id = contact_18.id",
" JOIN agreego.entity entity_20 ON entity_20.id = relationship_19.id",
" WHERE",
" NOT entity_20.archived",
" AND contact_18.parent_id = entity_3.id),",
" 'first_name', person_1.first_name,",
" 'id', entity_3.id,",
" 'last_name', person_1.last_name,",
" 'name', entity_3.name,",
" 'phone_numbers',",
" (SELECT COALESCE(jsonb_agg(jsonb_build_object(",
" 'archived', entity_25.archived,",
" 'created_at', entity_25.created_at,",
" 'id', entity_25.id,",
" 'is_primary', contact_23.is_primary,",
" 'name', entity_25.name,",
" 'target',",
" (SELECT jsonb_build_object(",
" 'archived', entity_27.archived,",
" 'created_at', entity_27.created_at,",
" 'id', entity_27.id,",
" 'name', entity_27.name,",
" 'number', phone_number_26.number,",
" 'type', entity_27.type",
" )",
" FROM agreego.phone_number phone_number_26",
" JOIN agreego.entity entity_27 ON entity_27.id = phone_number_26.id",
" WHERE",
" NOT entity_27.archived",
" AND relationship_24.target_id = phone_number_26.id),",
" 'type', entity_25.type",
" )), '[]'::jsonb)",
" FROM agreego.contact contact_23",
" JOIN agreego.relationship relationship_24 ON relationship_24.id = contact_23.id",
" JOIN agreego.entity entity_25 ON entity_25.id = relationship_24.id",
" WHERE",
" NOT entity_25.archived",
" AND contact_23.parent_id = entity_3.id),",
" 'type', entity_3.type",
")",
"FROM agreego.person person_1",
"JOIN agreego.organization organization_2 ON organization_2.id = person_1.id",
"JOIN agreego.entity entity_3 ON entity_3.id = organization_2.id",
"WHERE NOT entity_3.archived)"
]
]