feat: Enable Builds for Python Free-Threaded Mode #4870

Open
wants to merge 4 commits into
base: main

srilman
Contributor

@srilman srilman commented Jul 29, 2025

Changes Made

Set up changes such that we can now build Daft using a free-threaded (no-GIL) version of Python, in particular Python 3.13t.

This is a follow-up to the "Parallel stateless UDF" PR because, ideally, using threads to execute separate UDFs should be faster than using processes:

  • Fewer extraneous processes, thus less OS overhead
  • Less shared memory communication overhead
  • No need to dynamically choose between threads and processes; just always use threads
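To make the last bullet concrete, here is a minimal sketch of what choosing dynamically between threads and processes could look like. It is not Daft's implementation: `gil_enabled`, `pick_executor_cls`, and `run_udf` are hypothetical names, and the policy (processes when the GIL is on, threads otherwise) is an assumption for illustration. The only interpreter API relied on is `sys._is_gil_enabled()`, which exists on CPython 3.13 and later.

```python
import sys
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def gil_enabled() -> bool:
    """Report whether the running interpreter currently has the GIL enabled."""
    # sys._is_gil_enabled() was added in CPython 3.13; older versions always have a GIL.
    probe = getattr(sys, "_is_gil_enabled", None)
    return True if probe is None else probe()


def pick_executor_cls(gil: bool) -> type[Executor]:
    """Hypothetical policy: use processes under the GIL, threads without it."""
    return ProcessPoolExecutor if gil else ThreadPoolExecutor


def run_udf(udf, batches, max_workers=4):
    """Apply `udf` to each batch in parallel and return results in input order."""
    executor_cls = pick_executor_cls(gil_enabled())
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(udf, batches))
```

When the process pool is selected, `udf` and the batches have to be picklable (for example, a module-level function), which is part of the overhead that an always-threads design would avoid.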

Reality's not that great though 🥲. It turns out that free-threaded Python 3.13 disables some interpreter optimizations, leading to a 40% slowdown in single-core performance. 3.14 should mostly mitigate this, bringing the slowdown to 5-10% (https://docs.python.org/3.14/whatsnew/3.14.html#free-threaded-mode). Since 3.14 is officially releasing in about a month, we're a bit early!

Performance benchmarks on a WARC -> HTML parsing example on an M4 Max (14 cores) using BeautifulSoup show:

| Runner | Time | Speedup vs. 3.13 |
| --- | --- | --- |
| 3.13 | 57s | (baseline) |
| 3.13 w/ Process | 22s | 2.6x |
| 3.13t | 27s | 2.1x |
| 3.14t (estimated) | ~18s | 3.1x |
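The speedup figures for the measured rows follow directly from the wall-clock times, taking the plain 3.13 run as the baseline. A small sketch of that arithmetic (the `speedup` helper is illustrative only; the 3.14t row is an estimate, so it is not recomputed here):

```python
# Wall-clock times in seconds, copied from the measured rows of the table above.
BASELINE_S = 57  # the plain "3.13" row
measured = {"3.13 w/ Process": 22, "3.13t": 27}


def speedup(baseline_s: float, time_s: float) -> float:
    """Speedup is how many times faster a run is than the baseline."""
    return baseline_s / time_s


for runner, time_s in measured.items():
    print(f"{runner}: {speedup(BASELINE_S, time_s):.1f}x")
# prints:
# 3.13 w/ Process: 2.6x
# 3.13t: 2.1x
```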

Some unrelated follow-ups are required to enable builds entirely, so this PR just adds the baseline support and manual builds. TODOs:

  • Upgrade some dev dependencies for 3.13
  • Set up testing on a subset of tests (and include Windows)

@github-actions github-actions bot added the feat label Jul 29, 2025
@srilman
Contributor Author

srilman commented Aug 1, 2025

I think the Rust test is just flaky

@srilman srilman requested a review from colin-ho August 1, 2025 20:14
@srilman srilman marked this pull request as ready for review August 1, 2025 20:14
Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR introduces foundational support for Python's free-threaded mode (no-GIL Python 3.13t) to enable builds of Daft using free-threaded Python. The primary goal is to eventually enable faster parallel UDF execution using threads instead of processes, reducing OS overhead and shared memory communication costs.

The changes are systematic and comprehensive, touching nearly every PyO3 Python binding in the codebase. The main modifications include:

  1. PyO3 Class Immutability: Added the frozen attribute to ~50 PyO3 pyclass declarations across data structures like PySeries, PySchema, PyDataType, configuration objects, and enums. This makes Python objects immutable, preventing concurrent modification that could cause data races in free-threaded environments.

  2. Thread-Safe Mutability Patterns: For structs that need mutable operations, the code either wraps internal state in Mutex (like StreamingPartitionIterator, AdaptivePhysicalPlanScheduler) or marks them as unsendable to prevent cross-thread access (RaySwordfishWorker, PySqlCatalog).

  3. Method Signature Updates: Changed method signatures from &mut self to &self where internal mutability is handled through thread-safe mechanisms like Arc<RwLock<T>> or DashMap.

  4. Bug Fixes: Fixed critical bugs in file globbing where merge() operations weren't properly accumulating results (daft/runners/ray_runner.py, daft/runners/native_runner.py).

  5. Build Infrastructure: Updated GitHub workflows to support building free-threaded wheels, created specialized requirements file for free-threaded builds, and upgraded pre-commit hooks.

  6. Core Library Changes: Added gil_used = false to the main PyO3 module and removed Unpin trait bounds that were incompatible with free-threaded execution.
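Points 1 and 2 describe two general thread-safety patterns: make shared objects immutable, or guard mutable state with a lock. The sketch below is a pure-Python analogy of those patterns, not the PyO3 code from this PR. `Field`, `LockedIterator`, and `drain_concurrently` are hypothetical names chosen for illustration.

```python
import threading
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Field:
    """Immutable value object: it can be shared between threads without a lock."""

    name: str
    dtype: str


class LockedIterator:
    """Wraps an iterator so that concurrent next() calls are serialized."""

    def __init__(self, inner):
        self._inner = iter(inner)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator.
        with self._lock:
            return next(self._inner)


def drain_concurrently(it, n_threads=4):
    """Consume `it` from several threads and return every item they received."""
    seen = []
    seen_lock = threading.Lock()

    def worker():
        for item in it:
            with seen_lock:
                seen.append(item)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Assigning to an attribute of a `Field` instance raises `FrozenInstanceError`, and draining a `LockedIterator` from several threads hands each item to exactly one of them. The trade-off matches the one noted in the review: immutability costs nothing at access time, while the lock serializes every access.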

While current benchmarks show mixed performance results due to Python 3.13's disabled optimizations (40% single-core slowdown), the changes position Daft to benefit from Python 3.14's improved free-threaded performance (estimated 5-10% slowdown). The modifications maintain full backward compatibility while enabling the new execution model.

Confidence score: 4/5

  • This PR requires careful testing but represents a well-structured approach to enabling free-threaded Python support
  • Score reflects the systematic nature of changes and adherence to PyO3 best practices, though comprehensive testing will be needed
  • Pay close attention to test files that may be broken by the FileInfos merge signature change and mutex-based performance implications

46 files reviewed, 1 comment

pytest-benchmark==4.0.0
pytest-cov==4.1.0
pytest-lazy-fixture==0.6.3
# memray==1.17.2; platform_system != "Windows" # Unknown
Contributor

style: Add specific reason why memray is disabled for free-threaded (e.g., 'not compatible with free-threaded')
