New UDFs Roadmap #4820
Replies: 3 comments 1 reply
-
@kevinzwang I wonder if this would be better as a discussion? Colin did something similar for the flotilla roadmap, but made it a discussion instead.
-
What is the advantage of making it a discussion? I forget who said this, but I recall that we wanted to use issues for asks that would need changes in Daft, and discussions otherwise.
-
Some Q/A on scalar function behavior:
Q: What happens if the function closes over outside state?
Q: How are Series handled?
Q: Do we support optional/defaults/variadics/kwargs?
Q: What if I don't give a type hint e.g.
Q: Can I call it like
Q: Can I create a scalar class-based UDF?
Q: How is a
-
Background
As Daft is used for more and more multimodal/AI workflows, we are seeing increasingly common patterns in how UDFs are used, and we'd like to redesign our UDFs to work better with these patterns. This issue tracks the progress of this redesign.
The major differences between the designs of the existing (legacy) and new UDFs:
concurrency parameter. The new UDFs will not be stateful. Instead, to support stateful work, we will introduce the concept of "resources". Resources are not covered in this roadmap and will be tracked in a separate issue.
In addition, the scope of this work includes some new ways to use UDFs, such as multi-column outputs, generator UDFs, async UDFs, and better ergonomics for conversions between Python and Daft types.
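To make the row-wise semantics of scalar, generator, and async UDFs concrete, here is a minimal sketch in plain Python. It is not Daft's API: `apply_rowwise`, `add_one`, `repeat`, and `slow_double` are hypothetical names, and a Python list stands in for a column. It only assumes that a scalar UDF maps one row to one value, a generator UDF may yield zero or more values per row, and an async UDF can have its rows awaited concurrently.

```python
import asyncio
import inspect
from typing import Any, Callable, Iterator, List


def apply_rowwise(fn: Callable[[Any], Any], column: List[Any]) -> List[Any]:
    """Hypothetical stand-in for an engine applying a UDF to one column."""
    if inspect.iscoroutinefunction(fn):
        # Async UDF: schedule every row, await them together, keep input order.
        async def run_all() -> List[Any]:
            return list(await asyncio.gather(*(fn(value) for value in column)))

        return asyncio.run(run_all())
    if inspect.isgeneratorfunction(fn):
        # Generator UDF: each input row may produce zero or more output rows.
        return [out for value in column for out in fn(value)]
    # Scalar UDF: exactly one output value per input row.
    return [fn(value) for value in column]


def add_one(x: int) -> int:
    return x + 1


def repeat(x: int) -> Iterator[int]:
    # Yields the value x times, so a row containing 0 produces no output rows.
    for _ in range(x):
        yield x


async def slow_double(x: int) -> int:
    await asyncio.sleep(0)  # placeholder for real I/O such as an API call
    return x * 2


print(apply_rowwise(add_one, [1, 2, 3]))      # [2, 3, 4]
print(apply_rowwise(repeat, [1, 2, 3]))       # [1, 2, 2, 3, 3, 3]
print(apply_rowwise(slow_double, [1, 2, 3]))  # [2, 4, 6]
```

A batch UDF would differ in that the function receives the whole column at once rather than one value per call; that case is left out of the sketch.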
Examples
Simple scalar UDF
Generator UDF
Async UDF
Batch UDF
Type checking
Roadmap
These tasks do not necessarily need to be done in order.