feat: Async open ai llm generate #4879
Conversation
being nosy... just gentle reminder to check docstring format in build

import logging

logger = logging.getLogger(__name__)
logger.exception(e)
Currently, exceptions that are unserializable are not handled properly (#4881), so this is a workaround until that is fixed.
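For context, a minimal sketch of what such a workaround can look like. The helper name and the re-raise as a plain `RuntimeError` are assumptions for illustration, not necessarily what the PR does; only the `logger.exception(e)` call comes from the diff:

```python
import logging

logger = logging.getLogger(__name__)


def call_with_logged_errors(fn, *args, **kwargs):
    # Hypothetical helper; the PR simply logs inside its own except block.
    try:
        return fn(*args, **kwargs)
    except Exception as e:
        # Log the full traceback locally, since the original exception object
        # (e.g. an SDK error wrapping a live response) may not serialize
        # cleanly across process boundaries (see #4881).
        logger.exception(e)
        # Re-raising a plain RuntimeError keeps the message serializable;
        # this re-raise is an assumption, not confirmed by the PR.
        raise RuntimeError(str(e)) from None
```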
Greptile Summary
This PR introduces asynchronous OpenAI API support to the LLM generation functionality in Daft. The core change migrates from using the synchronous `openai.OpenAI` client to `openai.AsyncOpenAI`, enabling concurrent API requests through `asyncio.gather()` for improved performance when processing multiple prompts. The implementation adds event loop management that attempts to use an existing running loop or creates a new one when needed.
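For reference, a minimal sketch of that concurrent-request pattern using only the public `openai` Python SDK; the model name and prompt handling are placeholders, not the PR's actual code:

```python
import asyncio

from openai import AsyncOpenAI


async def generate_all(prompts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

    async def generate_one(prompt: str) -> str:
        resp = await client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Fire one request per prompt and await all of them concurrently.
    return await asyncio.gather(*(generate_one(p) for p in prompts))


# e.g. outputs = asyncio.run(generate_all(["prompt one", "prompt two"]))
```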
Additionally, the PR makes batch sizes provider-specific: OpenAI now defaults to 128 (down from 1024) while vLLM retains 1024. This reflects the different optimal batch characteristics between API-based inference (OpenAI) and local inference (vLLM), where smaller concurrent batches may be more efficient due to API rate limits and latency considerations.
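A hypothetical illustration of provider-specific defaults is shown below; the dictionary and function names are invented for this sketch, and only the 128/1024 values come from the PR:

```python
# Illustrative only: names and structure are assumptions, not the actual
# code in daft/functions/llm.py.
PROVIDER_DEFAULT_BATCH_SIZE = {
    "openai": 128,  # remote API: smaller batches respect rate limits and latency
    "vllm": 1024,   # local inference: larger batches amortize per-batch overhead
}


def resolve_batch_size(provider: str, batch_size: int | None = None) -> int:
    return batch_size if batch_size is not None else PROVIDER_DEFAULT_BATCH_SIZE[provider]
```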
The changes integrate with Daft's existing LLM function infrastructure by maintaining the same public interface while internally switching to async operations. The test suite has been updated accordingly, replacing `openai.OpenAI` patches with `openai.AsyncOpenAI` and implementing proper async mocks for the completion creation flow.
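One way such an async mock can be wired up with `unittest.mock.AsyncMock`, as a rough sketch only; the repository's actual tests may be structured differently:

```python
from unittest.mock import AsyncMock, MagicMock, patch


def make_fake_completion(text: str):
    # Mimic the shape of a chat completion: .choices[0].message.content
    completion = MagicMock()
    completion.choices = [MagicMock(message=MagicMock(content=text))]
    return completion


def test_uses_async_openai_client():
    fake_client = MagicMock()
    # The create call must be awaitable, hence AsyncMock rather than MagicMock.
    fake_client.chat.completions.create = AsyncMock(
        return_value=make_fake_completion("mocked output")
    )
    with patch("openai.AsyncOpenAI", return_value=fake_client):
        # ...call the code under test here and assert on its outputs;
        # the repo's real tests will differ in structure and assertions.
        pass
```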
Confidence score: 2/5
- This PR introduces complex async event loop management that could cause issues in different execution contexts, particularly when nested event loops are involved.
- Score reflects concerns about the event loop initialization logic and potential runtime errors when `run_until_complete()` is called from within an already running loop.
- Pay close attention to `daft/functions/llm.py` lines 143-175, where event loop management and async execution could fail in certain environments.
2 files reviewed, 3 comments
try:
    self.loop = asyncio.get_running_loop()
except RuntimeError:
    self.loop = asyncio.new_event_loop()
logic: Event loop management is problematic. Creating a new event loop in `__init__` without setting it as the current loop can cause issues. The loop may not be properly configured for the execution context.
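A common remedy, sketched below, is to register the freshly created loop as the current one via `asyncio.set_event_loop`; whether the PR adopts this exact helper is an open question:

```python
import asyncio


def get_or_create_event_loop() -> asyncio.AbstractEventLoop:
    """Return the running loop if there is one; otherwise create a loop and
    register it as current so later calls agree on which loop to use."""
    try:
        return asyncio.get_running_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)  # the step the comment above says is missing
        return loop
```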
return await asyncio.gather(*tasks)

try:
    outputs = self.loop.run_until_complete(gather_completions())
logic: Using `run_until_complete()` on a potentially running loop will raise `RuntimeError` if called from within an async context. This will break in environments like Jupyter notebooks or async UDF execution.
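One defensive pattern for this situation is sketched below: use `asyncio.run()` when no loop is running, and push the work onto a helper thread when one is. This is a generic workaround, not necessarily what the PR implements:

```python
import asyncio
import concurrent.futures


def run_sync(coro):
    """Run a coroutine to completion whether or not a loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No running loop (plain scripts): asyncio.run owns a fresh loop.
        return asyncio.run(coro)
    # A loop is already running (Jupyter, async callers): run_until_complete
    # would raise RuntimeError, so execute the coroutine in a helper thread
    # that owns its own loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```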
Watch out for these and do some testing from within Jupyter? I did see some users complaining about needing to do some shenanigans with nested_async to get our stuff working in a notebook.
Generally needing to manually handle `self.loop` feels kinda icky to me. Is it not possible to just grab the currently available loop?
Yeah can confirm that it works in Jupyter (native + Ray). `self.loop` stores either the currently available loop, or a new one if none exists.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4879      +/-   ##
==========================================
- Coverage   78.56%   78.14%    -0.43%
==========================================
  Files         917      917
  Lines      128066   129997     +1931
==========================================
+ Hits       100621   101589      +968
- Misses      27445    28408      +963
Awesome PR. Next steps: structured gen syntax? It varies between vLLM/SGLang/TensorRT-LLM and normal OpenAI.
This would be super interesting. Open an issue? Would love to consider. I was just reading this: https://simmering.dev/blog/openai_structured_output/
Changes Made
Use asyncio to make concurrent requests for LLM generate with OpenAI. Additionally, this PR changes the default batch size for OpenAI from 1024 to 128.
Related Issues
Checklist
docs/mkdocs.yml navigation