Add parse_multipart_identifier function to parser#860
Conversation
Pull Request Test Coverage Report for Build 4969813312
💛 - Coveralls |
ankrgyl
left a comment
There was a problem hiding this comment.
From what I can tell this can only be invoked directly (by calling parse_multipart_identifier()) and not automatically while parsing a query. Is that correct/intended?
|
Yes that's correct @ankrgyl It was partly inspired by how Spark has a similar method: https://github.com/apache/spark/blob/f8604ad14b24e8c657a0305b4fb8ad7efcb84060/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala#L65-L70 To allow downstream to parse identifiers by themselves (not necessarily as part of a query) |
|
Ah got it. Personally I'd prefer to integrate it into the main parser so that we increase our overall support for Spark SQL and Postgres. @alamb wdyt? |
| } | ||
|
|
||
| /// Parse identifiers of form ident1[.identN]* | ||
| pub fn parse_multipart_identifier(&mut self) -> Result<Vec<Ident>, ParserError> { |
There was a problem hiding this comment.
Looking at this function more closely, I wonder how if we can't reuse parse_identifiers from above?
Or maybe we could add some more documentation (like docstrings / tests) for this function to make it clearer it is designed to parse strings into identifiers 🤔
There was a problem hiding this comment.
I think parse_identifiers above is the odd one out, as it accepts any delimiter between words/idents (except equal sign) which means it would parse test + one + two as a valid sequence of identifiers. This is why I introduced the new parse_multipart_identifier to be more strict about what constitutes a compound/multipart identifier.
Though I did base parse_multipart_identifier primarily on Spark SQL/Postgres syntax, so maybe parse_identifiers is generic to accommodate some other type of syntax
There was a problem hiding this comment.
Upon some more reflection, I think we should accept this code and update the documentation strings on parse_multipart_identifier and parse_identifiers to explain in more detail what they do and how they are different.
I can try and find time to update the documentation maybe next week -- or @Jefffrey do you have time to do so?
There was a problem hiding this comment.
I've update the doc a bit, let me know any further suggestions
| } | ||
|
|
||
| /// Parse identifiers of form ident1[.identN]* | ||
| pub fn parse_multipart_identifier(&mut self) -> Result<Vec<Ident>, ParserError> { |
There was a problem hiding this comment.
Upon some more reflection, I think we should accept this code and update the documentation strings on parse_multipart_identifier and parse_identifiers to explain in more detail what they do and how they are different.
I can try and find time to update the documentation maybe next week -- or @Jefffrey do you have time to do so?
|
|
||
| /// Parse identifiers of form ident1[.identN]* | ||
| /// | ||
| /// Similar in functionality to [parse_identifiers], with difference |
* Support identifiers beginning with digits in MySQL (apache#856) * support COPY INTO in snowflake (apache#841) Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> * Add `dialect_from_str` and improve `Dialect` documentation (apache#848) * Add `dialect_from_str` and improve `Dialect` documentation * cleanup * fix compilation with nostd * Support multiple-table DELETE syntax (apache#855) * Support `DISTINCT ON (...)` (apache#852) * Support "DISTINCT ON (...)" * a test * fix the merge * Test trailing commas (apache#859) * test: add tests for trailing commas * tweaks * Add support for query source in COPY .. TO statement (apache#858) * Add support for query source in COPY .. TO statement * Fix compile error * Fix logical merge conflict (apache#865) * Fix tiny typo in custom_sql_parser.md (apache#864) * Make Expr::Interval its own struct (apache#872) * Make Expr::Interval its own struct * Add test interval display * Fix cargo fmt * Include license file in published crate (apache#871) * Add support for multiple expressions, order by in aggregations (apache#879) * Add support for multiple expressions, order by in aggregations * Fix formatting errors * Resolve linter errors * Add parse_multipart_identifier function to parser (apache#860) * Add parse_multipart_identifier function to parser * Update doc for parse_multipart_identifier * Fix conflict * feat: Add custom operator (apache#868) * feat: Add custom operator From apache#863 - It doesn't parse anything — I'm not sure how to parse ` SELECT 'a' REGEXP '^[a-d]';` with `REGEXP` as the operator... (but fine for my narrow purpose) - If we need tests, where would I add them? * Update src/ast/operator.rs --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * feat: Support MySQL's `DIV` operator (apache#876) * feat: MySQL's DIV operator * fix: do not use `_` prefix for used variable --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * truncate: table as optional keyword (apache#883) Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com> * feat: add DuckDB dialect (apache#878) * feat: add DuckDB dialect * formatting * fix conflict * support // in GenericDialect * add DucDbDialect to all_dialects * add comment from suggestion Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * fix: support // in GenericDialect --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * Add support for first, last aggregate function parsing (apache#882) * Add order by parsing to functions * Fix doc error * minor changes * Named window frames (apache#881) * after over clause, named window can be parsed with window ... as after having clause * Lint errors are fixed * Support for multiple windows * fix lint errors * simplifications * rename function * Rewrite named window search in functional style * Test added and some minor changes * Minor changes on tests and namings, and semantic check is removed --------- Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai> Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com> * Fix merge conflict (apache#885) * Update CHANGELOG for `0.34.0` release (apache#884) * chore: Release sqlparser version 0.34.0 * Update criterion requirement from 0.4 to 0.5 in /sqlparser_bench (apache#890) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --------- Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com> Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: AviRaboah <91826424+AviRaboah@users.noreply.github.com> Co-authored-by: pawel.leszczynski <leszczynski.pawel@gmail.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Aljaž Mur Eržen <aljazerzen@users.noreply.github.com> Co-authored-by: Armin Primadi <aprimadi@gmail.com> Co-authored-by: Okue <nogideca@gmail.com> Co-authored-by: Andrew Kane <acekane1@gmail.com> Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com> Co-authored-by: Jeffrey <22608443+Jefffrey@users.noreply.github.com> Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com> Co-authored-by: eitsupi <50911393+eitsupi@users.noreply.github.com> Co-authored-by: Maciej Obuchowski <obuchowski.maciej@gmail.com> Co-authored-by: Berkay Şahin <124376117+berkaysynnada@users.noreply.github.com> Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai> Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes #805
To be used for parsing identifiers which when could be quoted, can get quite complex (rather than relying on simple split on dot)
Pulled from DataFusion with one change: allow whitespaces between parts in the identifier, as seems this is supported by Spark SQL and Postgres
So following idents would be parsed with the same result: