Add `field` trait method to `WindowUDFImpl`, remove `return_type`/`nullable` #12374
alamb merged 55 commits into apache:main
| impl Expr { | ||
| /// Common method for window functions that applies type coercion | ||
| /// to all arguments of the window function to check if it matches | ||
| /// its signature. | ||
| /// | ||
| /// If successful, this method returns the data type and | ||
| /// nullability of the window function's result. | ||
| /// | ||
| /// Otherwise, returns an error if there's a type mismatch between | ||
| /// the window function's signature and the provided arguments. | ||
| fn data_type_and_nullable_with_window_function( | ||
| &self, | ||
| schema: &dyn ExprSchema, | ||
| window_function: &WindowFunction, | ||
| ) -> Result<(DataType, bool)> { |
Extracted a common method to handle type coercion for all window function types (built-in, udaf and udwf), which is then reused by the methods `data_type_and_nullable`, `get_type` and `nullable`.
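The refactor described in this comment can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `DataType` and `WindowFn` are hypothetical stand-ins, the free functions only borrow the method names mentioned above, and a plain equality check stands in for real type coercion.

```rust
// Hypothetical stand-ins for illustration only; these are not DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    UInt64,
    Float64,
}

/// A toy window function: the argument type it accepts and the shape of its result.
struct WindowFn {
    accepts: DataType,
    returns: DataType,
    nullable: bool,
}

/// Shared helper: checks the arguments once and returns both the data type
/// and the nullability of the result. The equality check is an assumption
/// standing in for type coercion against the function's signature.
fn data_type_and_nullable(
    fun: &WindowFn,
    args: &[DataType],
) -> Result<(DataType, bool), String> {
    for arg in args {
        if *arg != fun.accepts {
            return Err(format!("expected {:?}, got {:?}", fun.accepts, arg));
        }
    }
    Ok((fun.returns.clone(), fun.nullable))
}

/// Projects only the data type out of the shared helper.
fn get_type(fun: &WindowFn, args: &[DataType]) -> Result<DataType, String> {
    data_type_and_nullable(fun, args).map(|(data_type, _)| data_type)
}

/// Projects only the nullability out of the shared helper.
fn nullable(fun: &WindowFn, args: &[DataType]) -> Result<bool, String> {
    data_type_and_nullable(fun, args).map(|(_, nullable)| nullable)
}
```

Because both projections go through the same helper, a type mismatch is reported identically whichever of the three entry points the caller uses.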
| /// Return the type of the function given its input types | ||
| /// | ||
| /// See [`WindowUDFImpl::return_type`] for more details. | ||
| pub fn return_type(&self, args: &[DataType]) -> Result<DataType> { | ||
| self.inner.return_type(args) | ||
| } | ||
Removed return_type.
| /// Returns if column values are nullable for this window function. | ||
| /// Returns the field of the final result of evaluating this window function. | ||
| /// | ||
| /// See [`WindowUDFImpl::nullable`] for more details. | ||
| pub fn nullable(&self) -> bool { | ||
| self.inner.nullable() |
| ) | ||
| ) | ||
| })?; | ||
| let (_, function_name) = self.qualified_name(); |
Use `Expr::qualified_name`, which also handles `Expr::Column` and `Expr::Alias`.
| WindowFunctionDefinition::WindowUDF(fun) => fun | ||
| .field(WindowUDFFieldArgs::new(input_expr_types, display_name)) | ||
| .map(|field| field.data_type().clone()), |
Return data type for udwf.
Thanks @jayzhan211
EDIT:
If this is "too late" for this kind of comment, please let me know and I'll delete. I hadn't seen the issue / PR work until today.
One performance consideration is that Field::new allocates a new string on each invocation.
Could that be why WindowUDF (and both ScalarUDF and AggregateUDF) preferred to keep the methods separate?
I dug into it because needing to add empty &str to all the tests in datafusion/expr/src/expr.rs made me think it's not the right abstraction.
Lastly, do we want to diverge Aggregate and Window UDFs?
I thought the recent trend had been to unify them, like in this PR #11550 by @timsaucer?
I don't think there is any issue, given that we usually require the whole Field. Note that
Maybe a historical issue?
I don't think so. They should be kept separate; there is no good reason to mix them.
@Michael-J-Ward Appreciate the extra set of eyes. Thank you for reviewing the code 🙌
datafusion/datafusion/expr/src/expr.rs, lines 704 to 724 in 5cc7d06
@Michael-J-Ward @jayzhan211 Thank you.
alamb left a comment
👨‍🍳 👌
Looks really nice to me -- thank you @jcsherin and @jayzhan211
I also have it on my list to file some more subtask tickets for #8709 to remove the rest of the built-in WindowFunctions.
Thanks again @jayzhan211 and @jcsherin
| fn nullable(&self) -> bool { | ||
| true | ||
| } | ||
| /// The [`Field`] of the final result of evaluating this window function. |
Could be useful to document here how the "name" for the returned field is supposed to be set :)
Agreed. It's a great suggestion. I'll implement it in a follow-up PR.
Thanks @Blizzara
…llable` (apache#12374)

* Adds new library `functions-window-common`
* Adds `FieldArgs` struct for field of final result
* Adds `field` method to `WindowUDFImpl` trait
* Minor: fixes formatting
* Fixes: udwf doc test
* Fixes: implements missing trait items
* Updates `datafusion-cli` dependencies
* Fixes: formatting of `Cargo.toml` files
* Fixes: implementation of `field` in udwf example
* Pass `FieldArgs` argument to `field`
* Use `field` in place of `return_type` for udwf
* Update `field` in udwf implementations
* Fixes: implementation of `field` in udwf example
* Revert unrelated change
* Mark `return_type` for udwf as unreachable
* Delete code
* Uses schema name of udwf to construct `FieldArgs`
* Adds deprecated notice to `return_type` trait method
* Add doc comments to `field` trait method
* Reify `input_types` when creating the udwf window expression
* Rename name field to `schema_name` in `FieldArgs`
* Make `FieldArgs` opaque
* Minor refactor
* Removes `nullable` trait method from `WindowUDFImpl`
* Add doc comments
* Rename to `WindowUDFResultArgs`
* Minor: fixes formatting
* Copy edits for doc comments
* Renames field to `function_name`
* Rename struct to `WindowUDFFieldArgs`
* Add comments for unreachable code
* Copy edit for `WindowUDFImpl::field` trait method
* Renames module
* Fix warning: unused doc comment
* Minor: rename bindings
* Minor refactor
* Minor: copy edit
* Fixes: use `Expr::qualified_name` for window function name
* Fixes: apply previous fix to `Expr::nullable`
* Refactor: reuse type coercion for window functions
* Fixes: clippy errors
* Adds name parameter to `WindowFunctionDefinition::return_type`
* Removes `return_type` field from `SimpleWindowUDF`
* Add doc comment for helper method
* Rewrite doc comments
* Minor: remove empty comment
* Remove `WindowUDFImpl::return_type`
* Fixes doc test
I filed a few more tickets to hopefully get this process started.
Which issue does this PR close?
Closes #12373.
Rationale for this change
The result field from evaluating a user-defined window function is composed from the `return_type` and `nullable` trait methods in `WindowUDFImpl`.
This change explores folding both methods into a single trait method. A user-defined window function then has to implement only the `field` trait method, which makes the intent more explicit.
The current implementation for a user-defined window function (without the `field` trait method) looks like this:
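The original snippet did not survive on this page. As a rough sketch of the shape being described, assuming simplified stand-ins for Arrow's `DataType` and `Field` and a hypothetical trait name `OldWindowUDFImpl` (none of which are the real DataFusion definitions), the two-method style might look like this:

```rust
// Simplified stand-ins for illustration; not the Arrow or DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    UInt64,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Hypothetical trait mirroring the pre-change shape: the result type and
/// the nullability are reported by two separate methods.
trait OldWindowUDFImpl {
    fn name(&self) -> &str;
    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType, String>;
    /// Defaults to `true`, as in the diff shown earlier in this conversation.
    fn nullable(&self) -> bool {
        true
    }
}

struct RowNumber;

impl OldWindowUDFImpl for RowNumber {
    fn name(&self) -> &str {
        "row_number"
    }
    fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType, String> {
        Ok(DataType::UInt64)
    }
    fn nullable(&self) -> bool {
        false
    }
}

/// The caller has to stitch the result field together from both methods.
fn result_field(udwf: &dyn OldWindowUDFImpl, arg_types: &[DataType]) -> Result<Field, String> {
    Ok(Field {
        name: udwf.name().to_string(),
        data_type: udwf.return_type(arg_types)?,
        nullable: udwf.nullable(),
    })
}
```

The point of the sketch is that the information about one result field is spread over two trait methods and only recombined at the call site.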
The implementation for a user-defined window function after this change:
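The original snippet is likewise missing here. A minimal sketch of the single-method style, again using simplified stand-ins: the `Field`, `WindowUDFFieldArgs` and `WindowUDFImpl` definitions below are mock-ups written for this example, and the accessor names on the mock `WindowUDFFieldArgs` are assumptions rather than the library's documented API.

```rust
// Simplified stand-ins for illustration; not the Arrow or DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    UInt64,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: impl Into<String>, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.into(), data_type, nullable }
    }
    fn data_type(&self) -> &DataType {
        &self.data_type
    }
    fn is_nullable(&self) -> bool {
        self.nullable
    }
}

/// Mock of the argument bundle handed to `field`: the input types and the
/// function name to use for the result field.
struct WindowUDFFieldArgs<'a> {
    input_types: &'a [DataType],
    function_name: &'a str,
}

impl<'a> WindowUDFFieldArgs<'a> {
    fn new(input_types: &'a [DataType], function_name: &'a str) -> Self {
        WindowUDFFieldArgs { input_types, function_name }
    }
    fn input_types(&self) -> &[DataType] {
        self.input_types
    }
    fn name(&self) -> &str {
        self.function_name
    }
}

/// Mock trait with the single `field` method replacing `return_type` and `nullable`.
trait WindowUDFImpl {
    fn field(&self, field_args: WindowUDFFieldArgs<'_>) -> Result<Field, String>;
}

struct RowNumber;

impl WindowUDFImpl for RowNumber {
    fn field(&self, field_args: WindowUDFFieldArgs<'_>) -> Result<Field, String> {
        // Name, data type and nullability are declared together in one place.
        Ok(Field::new(field_args.name(), DataType::UInt64, false))
    }
}
```

A caller that only needs the data type can still recover it from the field, in the spirit of the `.map(|field| field.data_type().clone())` call in the diff above.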
What changes are included in this PR?
* Adds the `field` trait method.
* Removes the `return_type` trait method (datafusion/datafusion/expr/src/udwf.rs, lines 282 to 284 in a08f923).
* Removes the `nullable` trait method, which was added in Convert built-in `row_number` to user-defined window function #12030.
* Adds `WindowUDFFieldArgs`.

Are these changes tested?
Yes, against existing tests in CI.
Are there any user-facing changes?
Yes, this is a breaking change to the user-defined window function API.