fix(validate): case-insensitive aesthetic-clause column lookup#444
Closed
dataders wants to merge 0 commit intoposit-dev:mainfrom
Closed
fix(validate): case-insensitive aesthetic-clause column lookup#444dataders wants to merge 0 commit intoposit-dev:mainfrom
dataders wants to merge 0 commit intoposit-dev:mainfrom
Conversation
ec87290 to
ac78cd4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
VISUALISEclauses in.gg.sqlfiles currently fail validation against case-preserving warehouses (Snowflake, Databricks, BigQuery quoted IDs) when the user writes lowercase column references. The columns come back from the warehouse in their declared case (typically UPPERCASE), and ggsql's column-name resolver does exact-case comparison, producing:DuckDB folds identifiers to lowercase by default, so the bug only bites case-preserving sources.
Fix
Case-insensitive column resolution with an exact-match-first preference:
execute::canonicalize_column_referencesruns after the source schema is fetched and rewrites layer mappings, remappings, andpartition_bycolumns to the schema's canonical case. This is what actually unblocks the SQL — without canonicalization, the generated SQL emits\"table_name\" AS \"__ggsql_aes_pos1__\"which a case-preserving warehouse rejects because the real column isTABLE_NAME. The user's original casing is preserved onAestheticValue::Column.original_nameso axis labels render the user's intent.writer::vegalite::layer::validate_layer_columnsalso uses a tolerant case-insensitive lookup as defense in depth, in case any code path slips past canonicalization.Resolution rules in both layers:
Evidence
This bug was independently reproduced from two different reader stacks against real Snowflake — confirming the root cause is ggsql-internal, not reader-side conversion:
Validation error: Column 'table_name' …againstanalytics_dev.information_schema.tables.Both readers correctly preserve the warehouse's casing; the validator was the only mismatched component.
Test plan
test_case_insensitive_column_lookup— DataFrame columnsTABLE_NAME,ROW_COUNT; aesthetics referencetable_name(lowercase) andROW_COUNT(exact case) — write succeeds.test_case_insensitive_lookup_ambiguity— DataFrame has bothFooandFOO; aesthetic referencesfoo— error includes "ambiguous".test_missing_column_errorstill passes (no regression to genuinely-missing-column path).cargo test --libgreen: 1365 passed, 0 failed, 1 ignored..gg.sql(left to maintainers — needs warehouse credentials).🤖 Generated with Claude Code