feat: Add named parameter support for DuckDB read_parquet by max-sixty · Pull Request #5563 · PRQL/prql

max-sixty · 2025-11-17T21:26:45Z

Summary

Adds support for DuckDB's read_parquet optional boolean parameters to address issue #5548.

Users can now control DuckDB-specific behavior when reading Parquet files:

std.read_parquet 'data.parquet' union_by_name:true
std.read_parquet 'data.parquet' union_by_name:true binary_as_string:true

Changes

Added 4 optional boolean parameters to the read_parquet function:
- binary_as_string (default: false) - Load binary columns as strings
- file_row_number (default: false) - Include a file_row_number column
- hive_partitioning (default: null) - Interpret the path as Hive partitioned
- union_by_name (default: false) - Union columns by name instead of by position

Note: The filename parameter was intentionally excluded because it has been deprecated since DuckDB v1.3.0, where filename is automatically added as a virtual column.

Implementation Details

- Parameters are defined in std.prql with defaults
- DuckDB-specific SQL generation in std.sql.prql maps parameters to DuckDB's SQL named arguments
- Generic SQL implementation accepts parameters for signature compatibility but only uses source (consistent with PRQL's dialect handling)
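
To make the dialect split above concrete, here is a minimal Python sketch of the described behavior. It is only a model for illustration, not the code in std.prql or std.sql.prql: the function name render_read_parquet, the DEFAULTS table, and the "duckdb"/"generic" dialect strings are hypothetical, and the exact SQL the generic dialect emits is assumed from the description ("only uses source").

```python
from typing import Optional

# Hypothetical model of the defaults listed under "Changes".
# None stands in for PRQL's null.
DEFAULTS: dict[str, Optional[bool]] = {
    "binary_as_string": False,
    "file_row_number": False,
    "hive_partitioning": None,
    "union_by_name": False,
}


def sql_literal(value: Optional[bool]) -> str:
    """Render a Python value as a SQL boolean or NULL literal."""
    if value is None:
        return "NULL"
    return "true" if value else "false"


def render_read_parquet(
    source: str, dialect: str = "generic", **options: Optional[bool]
) -> str:
    """Return a read_parquet(...) call for the given dialect (illustrative only)."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")

    # Escape single quotes so the path is a valid SQL string literal.
    quoted = "'" + source.replace("'", "''") + "'"

    if dialect != "duckdb":
        # Generic path: options are accepted for signature compatibility
        # but ignored; only the source is emitted.
        return f"read_parquet({quoted})"

    # DuckDB path: every parameter is emitted as a named argument,
    # falling back to its default when the caller did not set it.
    merged = {**DEFAULTS, **options}
    args = [quoted] + [f"{name} = {sql_literal(v)}" for name, v in merged.items()]
    return "read_parquet(" + ", ".join(args) + ")"


print(render_read_parquet("data.parquet", dialect="duckdb", union_by_name=True))
print(render_read_parquet("data.parquet", union_by_name=True))
```

The first call prints a single-line equivalent of the SQL shown under "Example SQL Output"; the second shows the generic path dropping the options.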

Test Results

✅ All 608 core tests pass
✅ New tests verify correct SQL generation with named arguments
✅ Backward compatibility maintained (existing code continues to work)

Example SQL Output

read_parquet(
  'data.parquet',
  binary_as_string = false,
  file_row_number = false,
  hive_partitioning = NULL,
  union_by_name = true
)

Fixes #5548

🤖 Generated with Claude Code

Add support for DuckDB's read_parquet optional boolean parameters:
- binary_as_string: Load binary columns as strings
- file_row_number: Include file_row_number column
- hive_partitioning: Interpret path as Hive partitioned
- union_by_name: Union columns by name instead of position

Note: The 'filename' parameter was excluded as it's deprecated since DuckDB v1.3.0 (automatically added as virtual column).

Fixes PRQL#5548

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

The read_parquet function now explicitly outputs all parameters with their default values in the generated SQL, making the behavior more transparent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

max-sixty and others added 2 commits November 17, 2025 13:26

max-sixty merged commit a258339 into PRQL:main Nov 17, 2025
37 checks passed

max-sixty deleted the 5548 branch November 17, 2025 21:48
