Skip to content

fix(prqlc, append): apply column order on CTEs too#5323

Merged
max-sixty merged 2 commits intoPRQL:mainfrom
Fanaen:add-positional-mapping-for-append-v3
Jun 11, 2025
Merged

fix(prqlc, append): apply column order on CTEs too#5323
max-sixty merged 2 commits intoPRQL:mainfrom
Fanaen:add-positional-mapping-for-append-v3

Conversation

@Fanaen
Copy link
Copy Markdown
Contributor

@Fanaen Fanaen commented Jun 10, 2025

This is my third attempt to solve the issue of append positional mapping.

It enables the workaround for #2680

Problem

  • SQL's UNION need the same number of column and in both queries
  • PRQL currently changes main query's columns without changing sub query as well
  • PRQL outputs a SQL query that leads to a type mismatch or a column count mismatch error. More elaborate cases are a mix of those two. You may try a simplified version of either with the Query Results tab in the playground:
# Column swap leading to type mismatch
from invoices | select { invoice_id, invoice_date }
append (
  from invoices | select { invoice_id, invoice_date }
)
select { invoice_date, invoice_id }
# -> E: Unimplemented type for cast (BIGINT -> TIMESTAMP)
from invoices | select { invoice_id, invoice_date }
append (
  from invoices | select { invoice_id, invoice_date }
)
select { invoice_date }
# -> E: Set operations can only apply to expressions with the same number of result columns

Solution

This fix operate in anchoring stage only.

There's two parts:

  • when a pipeline is compiled, it checks the columns before and after and remember the difference for each affected relation.
  • when a CTE relation is compiled, it looks if there a mapping for it and reorder or even drops required columns.

@max-sixty
Copy link
Copy Markdown
Member

thanks @Fanaen — I haven't gone through the whole thing to compare, but it looks like a reasonable approach

do you know why the tests are failing here? same issue as the prior PR? we can skip sqlite tests if the tradeoff is better support for the main dialects

@Fanaen
Copy link
Copy Markdown
Contributor Author

Fanaen commented Jun 10, 2025

Thanks @max-sixty! Yeah, no need to compare. This attempt is very different, and I think a lot better. All the extra baggage of the previous attempts would have been detrimental to the readability of this one, hence the new PR. I'll scrap the other one.

I'll keep that in mind! Yep, same tests, same issues: only this time, the SQL results are good!

This fix should be fully dialect independent. I wanted integration tests with different table sources, and I did not see that the table columns were not matching in some dialects.

This should be easy to fix. I'll look into that tomorrow.

@Fanaen Fanaen force-pushed the add-positional-mapping-for-append-v3 branch from 9566268 to 270b83c Compare June 11, 2025 08:41
@Fanaen Fanaen marked this pull request as ready for review June 11, 2025 09:48
@Fanaen
Copy link
Copy Markdown
Contributor Author

Fanaen commented Jun 11, 2025

We're good to go!

There is one issue I encountered while working on this. When there is a aggregate whose result is SELECT ... COUNT(*) AS col FROM tbl GROUP BY, this leads to a column connected to nothing, and it's removed. This case wasn't working properly before, so I believe it could be a future fix.

@max-sixty
Copy link
Copy Markdown
Member

nice, thanks a lot @Fanaen !

@max-sixty max-sixty merged commit 83258c2 into PRQL:main Jun 11, 2025
39 checks passed
@Fanaen Fanaen deleted the add-positional-mapping-for-append-v3 branch June 12, 2025 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants