fix: Keep Computes with Aggregate when Filter follows (#5130)#5639
Merged
fix: Keep Computes with Aggregate when Filter follows (#5130)#5639
Conversation
Replace internal compiler error with user-facing error message when a bare lambda like `x -> y` is used at the top level. Now produces: "expected a table, but found a function" Fixes PRQL#4280 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
When using `group {} (sort a | take 1)`, the inner sort was being
incorrectly dropped because `sort_undone` was unconditionally set
to true for all groups. Now only set `sort_undone` when there's an
actual partition (non-empty group keys).
Fixes PRQL#5100
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The error message for bare lambda expressions (e.g., `x -> y`) was fixed in PR PRQL#5634. Move the test from bad_error_messages.rs to error_messages.rs since the error is now clear and helpful. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…QL#4432 Add inline snapshot tests to prevent regressions for bugs that have been fixed: - test_source_column_name: Using `source` as column name no longer causes "Ambiguous name" error (PRQL#5094) - test_column_inference_with_into: Column inference works correctly with `into` statement (PRQL#4723) - test_distinct_on_columns_propagated: DISTINCT ON propagates necessary columns to CTE (PRQL#4432) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
When `sort | take` was lowered to RQ, the sort was embedded in the Take struct. But postprocessing only tracked explicit SqlTransform::Sort, ignoring the embedded sort. This caused ORDER BY to be lost in CTEs. Now extracts `take.sort` for simple takes (no partition) where the sort was merged during RQ lowering. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
When aggregate expressions include operations like `(sum x) + (sum y)`, the pipeline was being split such that aggregation functions ended up in a CTE without GROUP BY, producing invalid SQL. The fix modifies `is_split_required` to not split between Compute and Filter when there's an Aggregate in the following transforms. Filter after Aggregate becomes HAVING in the same SELECT, so all preceding Computes must stay with the Aggregate for proper GROUP BY. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #5130: Filtering on combined aggregates was generating invalid SQL with CTEs missing GROUP BY.
The issue occurred when aggregate expressions contained operations like
(sum x) + (sum y). These expressions were being extracted as Compute transforms, but when the pipeline was split for a following Filter, the Computes ended up in a separate CTE from their Aggregate, losing the GROUP BY clause.Fix: In
is_split_required, don't split between Compute and Filter when there's an Aggregate later in the pipeline. This keeps aggregate-related Computes together with their GROUP BY.Trade-off: Some intermediate CTEs may include unused columns (documented in test comment). These don't leak to final output.
Test plan
(sum x) + (sum y)and(sum x) * 2with filter🤖 Generated with Claude Code