Blog on Extending SQL to create own SQL Dialects#97
Adez017 wants to merge 7 commits into apache:main
|
I am feeling August is going to be a month of some amazing blog content |
Yep |
|
Please have a look at this one too, @alamb |
Co-authored-by: Jax Liu <liugs963@gmail.com>
|
I have made the changes as per your suggestion, @goldmedal. Please take a look. |
|
Can we move forward with merging, @goldmedal @alamb? |
|
I plan to read / work on this PR later today |
|
I am starting to check this one out |
|
Thanks for the adjustments, @alamb. I highly appreciate your time and effort.
alamb
left a comment
Thank you @Adez017 -- this is a great start to a blog. I pushed a few more links to the intro and background.
I think it would be best if we could improve the examples before publishing this. Please let me know what you think.
I have two major comments:
Apply to the motivating example?
The post says:
DataFusion provides an excellent example of custom SQL dialect implementation in their [sql_dialect.rs] example. Let's break down how it works and then apply the pattern to our ATTACH DATABASE use case.
I didn't see any mention / examples of how to use the ATTACH DATABASE syntax.
The COPY TO parser example is strange
I tried the example for COPY TO in the sql and it basically worked without a custom parser:
> create table source_table as values (1);
0 row(s) fetched.
Elapsed 0.002 seconds.
> COPY source_table TO 'file.fasta' STORED AS FASTA;
Error during planning: There is no registered file format with ext fasta
(I would expect a parser error for such a statement.) I know this is not anything introduced by this post.
What would you think about updating the sql_parser.rs example to do something more exciting, such as either:
- Actually implementing the motivating example from this blog post
CREATE EXTERNAL CATALOG my_catalog
STORED AS PARQUET
LOCATION 's3://my-bucket/data/'
OPTIONS (
'aws.region' = 'us-west-2',
'catalog.type' = 'hive_metastore'
);
- Implementing the ATTACH and DETACH statements from DuckDB?
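As a rough illustration of the ATTACH / DETACH option, here is a minimal Rust sketch, using only the standard library, of the pattern under discussion: recognize the custom statement first and hand everything else to the stock parser. It does not use DataFusion's or sqlparser-rs's actual APIs. `Statement`, `tokenize`, and `parse_statement` are hypothetical names, and the grammar (`ATTACH DATABASE '<path>' AS <alias>`, `DETACH DATABASE <alias>`) is a simplified assumption rather than DuckDB's full syntax.

```rust
/// Statements understood by this hypothetical dialect extension.
#[derive(Debug, PartialEq)]
enum Statement {
    /// ATTACH DATABASE '<path>' AS <alias>  (simplified, assumed grammar)
    Attach { path: String, alias: String },
    /// DETACH DATABASE <alias>  (simplified, assumed grammar)
    Detach { alias: String },
    /// Anything else: kept as raw SQL for the stock parser to handle.
    Standard(String),
}

/// Split on whitespace, keeping a single-quoted string together as one token.
fn tokenize(sql: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut in_quote = false;
    for ch in sql.trim().trim_end_matches(';').chars() {
        match ch {
            '\'' => {
                in_quote = !in_quote;
                current.push(ch);
            }
            c if c.is_whitespace() && !in_quote => {
                if !current.is_empty() {
                    tokens.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// Try the custom statements first; fall back to `Standard` otherwise.
fn parse_statement(sql: &str) -> Result<Statement, String> {
    let tokens = tokenize(sql);
    // Upper-cased copy used only for case-insensitive keyword matching.
    let upper: Vec<String> = tokens.iter().map(|t| t.to_uppercase()).collect();
    let kw: Vec<&str> = upper.iter().map(String::as_str).collect();
    match kw.as_slice() {
        ["ATTACH", "DATABASE", _, "AS", _] => {
            let path = tokens[2]
                .strip_prefix('\'')
                .and_then(|p| p.strip_suffix('\''))
                .ok_or_else(|| format!("expected a quoted path, found {}", tokens[2]))?;
            Ok(Statement::Attach {
                path: path.to_string(),
                alias: tokens[4].clone(),
            })
        }
        // A malformed custom statement is rejected at parse time.
        ["ATTACH", ..] => Err("expected: ATTACH DATABASE '<path>' AS <alias>".to_string()),
        ["DETACH", "DATABASE", _] => Ok(Statement::Detach {
            alias: tokens[2].clone(),
        }),
        ["DETACH", ..] => Err("expected: DETACH DATABASE <alias>".to_string()),
        _ => Ok(Statement::Standard(sql.to_string())),
    }
}
```

For example, `parse_statement("ATTACH DATABASE 'sales.db' AS sales;")` yields `Statement::Attach` with path `sales.db` and alias `sales`, while `SELECT 1` comes back as `Statement::Standard`. In a real implementation the `Standard` case would delegate to the existing parser instead of carrying the raw string.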
Thanks @alamb, I think it's a great idea that we can follow up on, and it would be a great start |
|
By the way, I think I mixed up some things from DuckDB somewhere, I guess |
So let's come up with a plan @Adez017 -- would you like me to take a shot at updating the sql parser example or would you? Do you have any preference on what we should show ( |
I would suggest you do it, @alamb, as you have more experience with this, and as for preferences, I think we should move forward with |
|
any updates @alamb |
|
Not yet -- sorry -- I am currently planning to help get
I just don't want to publish a blog on the DataFusion site that is confusing -- I think we have a history of high quality content, so if we are going to publish something I want it to be really compelling. This one has the potential, but I think it needs some more work -- at the very least we need to resolve the discrepancy between the motivating example and what is actually shown |
Sure, I am open to helping anywhere. Please let me know when needed, @alamb |
|
I was thinking some more last night -- maybe a good example would be "you want to implement a SQL dialect where the FROM clause is first, so instead of You wanted to implement 🤔
I think custom DDL / statements are likely to be the most common use cases though 🤔 |
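To make the FROM-first idea concrete, here is a small Rust sketch, standard library only, that rewrites a FROM-first query into standard clause order before it would be passed to an ordinary SQL parser. This is an assumption about one possible approach (a text-level rewrite), not how DataFusion or sqlparser-rs handles dialects. `rewrite_from_first` and its helpers are hypothetical names, and the sketch ignores string literals and subqueries.

```rust
/// Case-insensitive keyword comparison.
fn is_kw(token: &str, kw: &str) -> bool {
    token.eq_ignore_ascii_case(kw)
}

/// Clause keywords that may follow the select list or the table reference.
fn is_tail(token: &str) -> bool {
    ["WHERE", "GROUP", "HAVING", "ORDER", "LIMIT"]
        .iter()
        .any(|kw| is_kw(token, kw))
}

/// Rewrite `FROM t SELECT a WHERE ...` into `SELECT a FROM t WHERE ...`.
/// A bare `FROM t` becomes `SELECT * FROM t`. Queries that do not start
/// with FROM are returned unchanged.
fn rewrite_from_first(sql: &str) -> String {
    let trimmed = sql.trim().trim_end_matches(';');
    let tokens: Vec<&str> = trimmed.split_whitespace().collect();

    // Not FROM-first (or empty): leave the query exactly as it was.
    if tokens.first().map_or(true, |t| !is_kw(t, "FROM")) {
        return sql.to_string();
    }

    let select_pos = tokens.iter().position(|t| is_kw(t, "SELECT"));
    let (from_part, select_list, tail) = match select_pos {
        Some(s) => {
            // The select list runs until the next clause keyword or the end.
            let end = tokens[s + 1..]
                .iter()
                .position(|t| is_tail(t))
                .map_or(tokens.len(), |p| s + 1 + p);
            (&tokens[1..s], tokens[s + 1..end].join(" "), &tokens[end..])
        }
        None => {
            // No SELECT at all: assume the caller wants every column.
            let end = tokens
                .iter()
                .position(|t| is_tail(t))
                .unwrap_or(tokens.len());
            (&tokens[1..end], "*".to_string(), &tokens[end..])
        }
    };

    let mut out = format!("SELECT {} FROM {}", select_list, from_part.join(" "));
    if !tail.is_empty() {
        out.push(' ');
        out.push_str(&tail.join(" "));
    }
    out
}
```

With this sketch, `FROM t SELECT a, b WHERE a > 1;` is rewritten to `SELECT a, b FROM t WHERE a > 1`. A token-level rewrite like this breaks as soon as the FROM clause contains a subquery, which is one reason a real dialect would extend the parser itself.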
Logically, it makes sense. Most of the time we use DDL commands, and having the modularity to create custom DDL would be a great example. Much appreciated, @alamb
|
Hi @alamb, just checking in. If you need any help, let me know, as you have a lot on your plate.
|
Hi @alamb, any updates? Just curious about this one.
|
Hi @Adez017 -- I am not likely to be able to spend much time on this project for a while.
To be published, at minimum I think this blog needs to be updated to actually implement the motivating example (or perhaps change the motivating example so it matches the examples upstream).
As I mentioned, I think we could make the upstream example significantly more compelling either with custom query syntax, or perhaps implementing the motivating example in the blog. However, that is a larger project which I don't have time for.
Please feel free to work on those changes if you have time. I would most appreciate it |
|
@theirix made a PR with a pretty interesting example: apache/datafusion#17633
Jefffrey
left a comment
I will say upfront I don't usually review these blog posts so I'm not as familiar with the general standards and such, but I'll leave some notes below from my review.
I agree with above comments that the motivating example should be consistent throughout the post; it makes more sense if we structure this like a case study with one specific example that we follow from top to bottom, instead of introducing different examples.
Some other minor notes:
- Do we need to mention sqlparser-rs anywhere? There's lots of mention of DataFusion parsing but I believe sqlparser-rs does a lot of the heavy lifting here, unless it's under the umbrella of the DataFusion keyword
- For the conclusion I think we should use a consistent call to action, like so:
datafusion-site/content/blog/2025-09-21-custom-types-using-metadata.md
Lines 327 to 341 in 31f9668
Hi @alamb @goldmedal, I have drafted the blog on the topic and need you to review it for suggestions.