[postgres] Add support for custom binary operators#548
Conversation
More details about operators in general are at: https://www.postgresql.org/docs/current/sql-createoperator.html. This patch attempts to parse `SELECT` queries that reference an operator using `OPERATOR(<optional_schema>.<operator_name>)` syntax. This is a PostgreSQL extension. There are no provisions for user-defined operators in the SQL standard.
|
@alamb could you please help with triggering the workflow? |
There was a problem hiding this comment.
Thank you for the contribution @iskakaushik -- I'll trigger the tests
Pull Request Test Coverage Report for Build 2805285978
💛 - Coveralls |
alamb
left a comment
There was a problem hiding this comment.
Thanks @iskakaushik -- this is looking close.
src/parser.rs
Outdated
| } | ||
| } | ||
| Keyword::XOR => Some(BinaryOperator::Xor), | ||
| Keyword::OPERATOR if dialect_of!(self is PostgreSqlDialect) => { |
There was a problem hiding this comment.
Given @ovr 's comment here, perhaps we should also add support when parsing in GenericDialect? apache/datafusion#3037 (comment)
There was a problem hiding this comment.
Done! I kept the name as PGCustomOperator but happy to change it as well.
src/ast/operator.rs
Outdated
| pub struct PGCustomOperator { | ||
| pub schema: Option<String>, | ||
| pub name: String, |
There was a problem hiding this comment.
What would you think about using the pre-existing Identifier here (that supports various qualified names)
like:
| pub struct PGCustomOperator { | |
| pub schema: Option<String>, | |
| pub name: String, | |
| pub struct PGCustomOperator { | |
| pub ident: ObjectName |
Thank you could use parse_object_name instead of custom parsing logic.
This would likely require less code as well as handle parsing more cases (like OPERATOR("~") -- aka quoted strings)
There was a problem hiding this comment.
I tried doing this, but looks like Identifier doesn't allow for Tokens that are un-quoted operators, for e.g: ~ is not a valid identifier. I am not sure if extending the parse_identifier to allow for un-quoted operators is the right call here.
IMO we could start with the existing mechanism of schema qualifier operator name (though I would've liked to use the parse_object_name logic) and can extend it as needed later. Let me know.
|
@iskakaushik I am planning to make a sqlparser release sometime soon -- I think this PR has just a few more items and it could be ready to merge. Let me know what you think. |
I will address the feedback today 😊 |
src/ast/operator.rs
Outdated
| pub schema: Option<String>, | ||
| pub name: String, |
There was a problem hiding this comment.
Thank you for trying -- Something about hard coding "schema" seems not right to me here -- at the very least because there can also be a database in identifiers https://www.postgresql.org/docs/current/ddl-schemas.html -- e.g. database.schema.table, so maybe it should be a Vec<String>
I think given the special set of allowed characters described in https://www.postgresql.org/docs/current/sql-createoperator.html in The operator name is a sequence of up to NAMEDATALEN-1 (63 by default) characters from the following list: to properly parse this limitation some extra code would be needed
Let me give it a shot
|
@iskakaushik here is my proposal: iskakaushik#1 let me know what you think |
|
@alamb I like you approach. The comment you made about converting to Thanks for fixing up my patch, really appreciate it! |
|
I will fix CI |
|
Thanks again @iskakaushik |

More details about operators in general are at: https://www.postgresql.org/docs/current/sql-createoperator.html. This patch attempts to parse
SELECTqueries that reference an operator usingOPERATOR(<optional_schema>.<operator_name>)syntax. Tests added in the patch illustrate a common use-case of this feature.This is a PostgreSQL extension. There are no provisions for user-defined operators in the SQL standard.