Add run_on_latest_version support for backfill and clear operations #52177
base: main
Thanks for the PR. There are some questions about the usage of SchedulerDagBag in core-api.
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py
The UI for dagrun clear and task instance clear looks good. We'll add this to backfills next?
Design question: instead of --run-on-latest-version, what do you think of --bundle-version "latest" as default?
I think it makes more sense for backfill, but for clearing tasks we can still keep "run on the latest version". Or do you suggest we show a form where the user fills in the version they want to run after clearing?
Also, by default, it runs with the version the DAG initially used, if it has run before. If we default --bundle-version to the latest, then most backfill reruns will fail if the bundle has changed.
aah ok
Will we ever have a situation where we have both
I think if we need it, then it should be mutually exclusive.
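To make the "mutually exclusive" suggestion concrete, here is a minimal sketch using Python's standard argparse module. It is not the Airflow CLI definition: the parser, its program name, and the --bundle-version option (which is only being floated in this thread) are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical backfill parser; not Airflow's real CLI wiring."""
    parser = argparse.ArgumentParser(prog="backfill-sketch")
    # Options in the same group cannot be passed together.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "--run-on-latest-version",
        action="store_true",
        help="(experimental) rerun with the latest bundle version",
    )
    # Assumed option: --bundle-version is only a proposal in this discussion.
    group.add_argument(
        "--bundle-version",
        default=None,
        help="hypothetical explicit bundle version to run with",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.run_on_latest_version, args.bundle_version)
```

With this grouping, passing both options makes argparse exit with a usage error, while passing neither keeps the default of rerunning with the original version.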
I connected with Jed, and we agreed to mark the flag as experimental for now. This way, if we decide to introduce
Sounds good
Looks good! I have one comment around UI-backend, not a blocker for the PR if it is needed. Thanks, Ephraim!
Found the change in migrations, which is already done
airflow-core/src/airflow/ui/src/components/Clear/TaskInstance/ClearTaskInstanceDialog.tsx
airflow-core/src/airflow/ui/src/components/Clear/Run/ClearRunDialog.tsx
With this option, users can choose the dag version their dag/task runs with after clearing or when running a backfill. This only applies to versioned bundles, as non-versioned bundles always run with the latest dag version.
When the user chooses to run with the latest version, the bundle_version associated with the dagrun and the associated serialized dag version are both updated to the latest. Choosing not to run with the latest version, which is the default, means that the bundle version and serialized dag version the dag initially ran with are used when running it again.
For backfill, there's now a --run-on-latest-version flag that makes it run with the latest version; otherwise it runs with the original bundle the dagrun was created with. Note that the flag is only useful when rerunning a dagrun using backfill. Defaulting to the initial bundle/version is intentional: otherwise the backfill would fail if a task was renamed in the latest version.
Summary of changes:
- Use SchedulerDagBag instead of DagBag for execution API
- Add run_on_latest_version field to DAGRunClearBody and ClearTaskInstancesBody models
- Add --run-on-latest-version CLI flag for backfill command
- Update backfill.py to support running tasks with latest DAG version
- Add UI checkbox for "Run with latest version" in clear dialogs
- Update SchedulerDagBag to handle latest version parameter
- Update API endpoints to support run_on_latest_version parameter
Co-authored-by: Jed Cunningham <[email protected]>
closes: #49007, closes: #49047
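The version-selection rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: resolve_bundle_version and missing_tasks are hypothetical helpers, and a bundle version of None is assumed here to stand for a non-versioned bundle.

```python
from typing import Iterable, List, Optional


def resolve_bundle_version(
    original: Optional[str],
    latest: Optional[str],
    run_on_latest_version: bool = False,
) -> Optional[str]:
    """Pick the bundle version a cleared or backfilled dagrun would use.

    Hypothetical helper: None is assumed to mean a non-versioned bundle.
    """
    if original is None:
        # Non-versioned bundle: nothing to pin, the latest code is always used.
        return None
    # Default keeps the version the dagrun was created with.
    return latest if run_on_latest_version else original


def missing_tasks(recorded: Iterable[str], current: Iterable[str]) -> List[str]:
    """Task ids recorded on the run that the chosen dag version no longer has."""
    return sorted(set(recorded) - set(current))


if __name__ == "__main__":
    print(resolve_bundle_version("v1", "v2"))
    print(resolve_bundle_version("v1", "v2", run_on_latest_version=True))
    # A renamed task is why keeping the original version is the default.
    print(missing_tasks(["extract", "load"], ["extract", "load_data"]))
```

The second helper illustrates the stated rationale for the default: if a task was renamed in the latest version, a rerun against that version would reference a task id that no longer exists.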