Skip to content

refactor: make DaftExtension class definition static #4968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Jay-ju
Copy link
Contributor

@Jay-ju Jay-ju commented Aug 13, 2025

Changes Made

daft.datatype._ensure_registered_super_ext_type.<locals>.DaftExtension vs <class 'daft.datatype.DaftExtension'>

Moves the DaftExtension class definition to the module level. Previously, this class was defined at runtime from within the _ensure_registered_super_ext_type function. This meant that the class definition itself was dynamic, which causes problems for some specific cases of schema handling while writing out to some data formats.

Related Issues

Checklist

  • Documented in API Docs (if applicable)
  • Documented in User Guide (if applicable)
  • If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

@github-actions github-actions bot added the fix label Aug 13, 2025
@Jay-ju Jay-ju mentioned this pull request Aug 13, 2025
4 tasks
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR refactors the DaftExtension class definition in daft/datatype.py by moving it from being dynamically created inside the _ensure_registered_super_ext_type() function to being defined at module level. The change improves code organization and static analysis capabilities without altering the class's functionality.

Previously, the DaftExtension class was created dynamically within the _ensure_registered_super_ext_type() function, which made it invisible to static type checkers, IDEs, and other code analysis tools. The new implementation defines the class statically at lines 1178-1199, making it part of the module's permanent structure. The _ensure_registered_super_ext_type() function (lines 1201-1216) has been simplified to just assign the pre-defined class to _STATIC_DAFT_EXTENSION and register it with PyArrow when needed.

This change fits well within Daft's Arrow integration architecture, where the framework extends PyArrow's type system for custom data handling. The DaftExtension serves as the base class for Arrow extension types in Daft, and making it statically available improves the overall maintainability of the Arrow integration layer. The double-checked locking pattern is preserved to ensure thread-safe registration with PyArrow.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it's a code organization refactoring that maintains identical functionality
  • Score reflects straightforward refactoring with no logic changes and preservation of thread safety mechanisms
  • No files require special attention as the change is isolated to a single module with clear intent

1 file reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link

codecov bot commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 92.30769% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 79.13%. Comparing base (6ef8e52) to head (1417ab9).
⚠️ Report is 86 commits behind head on main.

Files with missing lines Patch % Lines
daft/datatype.py 92.30% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4968      +/-   ##
==========================================
+ Coverage   78.84%   79.13%   +0.29%     
==========================================
  Files         890      917      +27     
  Lines      123743   126969    +3226     
==========================================
+ Hits        97559   100483    +2924     
- Misses      26184    26486     +302     
Files with missing lines Coverage Δ
daft/datatype.py 94.30% <92.30%> (ø)

... and 340 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Jay-ju
Copy link
Contributor Author

Jay-ju commented Aug 13, 2025

@malcolmgreaves When you have time, please take a look at this PR. It aims to solve the UT issue corresponding to #4940.

@srilman srilman requested a review from malcolmgreaves August 13, 2025 20:11
Copy link
Contributor

@malcolmgreaves malcolmgreaves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Jay-ju! I think these changes LGTM. But since these changes exist to make #4940 work, I'd like to actually combine both PRs into one. That way we can make sure that these changes are indeed solving the problem we expect. I want to avoid merging in this PR, updating #4940 with it, then realizing that it didn't fix the problem.

@malcolmgreaves
Copy link
Contributor

I made #4975 as a test of both this and the other PR's changes.

@malcolmgreaves
Copy link
Contributor

This PR's title and description don't actually describe what's happening here. So I will change them to be:

title:
"""
refactor: make DaftExtension class definition static
"""

PR message changes:
"""
Moves the DaftExtension class definition to the module level. Previously, this class was defined at runtime from within the _ensure_registered_super_ext_type function. This meant that the class definition itself was dynamic, which causes problems for some specific cases of schema handling while writing out to some data formats.
"""

@Jay-ju Jay-ju changed the title fix: use DaftExtension as arrow extend type based class refactor: make DaftExtension class definition static Aug 15, 2025
@github-actions github-actions bot added refactor and removed fix labels Aug 15, 2025
@Jay-ju
Copy link
Contributor Author

Jay-ju commented Aug 15, 2025

This PR's title and description don't actually describe what's happening here. So I will change them to be:

title: """ refactor: make DaftExtension class definition static """

PR message changes: """ Moves the DaftExtension class definition to the module level. Previously, this class was defined at runtime from within the _ensure_registered_super_ext_type function. This meant that the class definition itself was dynamic, which causes problems for some specific cases of schema handling while writing out to some data formats. """

@malcolmgreaves fix it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants