Skip to content

Conversation

@schustmi
Copy link
Contributor

@schustmi schustmi commented Dec 3, 2025

Describe changes

This PR adds an index attribute for each pipeline run, which specifies the index of the run within its pipeline.

Additionally, the pipeline attribute of the PipelineRunRequest was removed as the information is already included in the snapshot that is part of the request.

Limitations:

  • If a run fails to be created (could be because a run with the same name exists or other reasons), the run number will still be bumped and that index will simply be unused
  • When running a pipeline client-side, the following happens: We first create a pipeline run in the database, and then ask the orchestrator to submit the pipeline. If this submission fails, we currently delete the pipeline run. This will also cause a gap in the indexes of pipeline runs.

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

@github-actions github-actions bot added internal To filter out internal PRs and issues enhancement New feature or request labels Dec 3, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Dec 3, 2025

Documentation Link Check Results

Absolute links check passed
Relative links check passed
Last checked: 2025-12-08 09:16:27 UTC

@schustmi schustmi force-pushed the feature/pipeline-run-ordering branch 2 times, most recently from cf5dea6 to b0c1aa3 Compare December 3, 2025 10:33
@schustmi schustmi added the release-notes Release notes will be attached and used publicly for this PR. label Dec 3, 2025
@schustmi schustmi force-pushed the feature/pipeline-run-ordering branch 6 times, most recently from a9cb52f to e6c8b05 Compare December 4, 2025 03:10
@schustmi schustmi force-pushed the feature/pipeline-run-ordering branch 3 times, most recently from 1fa5cf3 to 8788126 Compare December 4, 2025 09:55
@schustmi schustmi changed the title Implement unique numbers for pipeline runs Pipeline run index attribute Dec 4, 2025
@schustmi schustmi changed the title Pipeline run index attribute Add pipeline run index attribute Dec 4, 2025
@schustmi schustmi force-pushed the feature/pipeline-run-ordering branch from 8788126 to cdc3723 Compare December 4, 2025 10:26
@bcdurak bcdurak linked an issue Dec 4, 2025 that may be closed by this pull request
1 task
index_within_pipeline = 0
run_updates: list[dict[str, Any]] = []
run_counts: dict[str, int] = {}
for row in result:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally I would advocate to execute data population for optional values in a post-deployment or post-migration step. Let's think about the worst case scenario in terms of execution time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the index should be non-optional and the column type is changed at the end of the DB migration to reflect that, which is why we need some values for each existing row. I guess we could populate it with a placeholder and then later on replace by real values, not sure it's worth the effort here though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah let's no change the migration logic, let's just estimate the execution time for the backfill (worst case scenario). The general rule is migrations should complete quickly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally I would advocate to execute data population for optional values in a post-deployment or post-migration step

@Json-Andriopoulos I'm curious, why is it not acceptable to do these sort of changes in the alembic migration script ? what are the pros/cons ?

session.execute(
update(PipelineSchema)
.where(col(PipelineSchema.id) == pipeline_id)
.values(run_count=col(PipelineSchema.run_count) + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good one! I think this should be ok with the default isolation level in MySQL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I think so! Also in my tests it seemed to work just fine. I also didn't add a unique constraint on pipeline_id, index, so even if this for some reason fails to generate unique values, we will not fail to create a run but instead only have a duplicate index.

@schustmi schustmi force-pushed the feature/pipeline-run-ordering branch from 042c2f0 to 5fb6fcd Compare December 5, 2025 04:12
Copy link
Contributor

@Json-Andriopoulos Json-Andriopoulos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@schustmi schustmi added the run-slow-ci Tag that is used to trigger the slow-ci label Dec 5, 2025
@schustmi schustmi force-pushed the feature/pipeline-run-ordering branch from 5fb6fcd to 3394503 Compare December 8, 2025 08:18
@schustmi schustmi merged commit c7b1dd7 into develop Dec 8, 2025
77 of 84 checks passed
@schustmi schustmi deleted the feature/pipeline-run-ordering branch December 8, 2025 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request internal To filter out internal PRs and issues release-notes Release notes will be attached and used publicly for this PR. run-slow-ci Tag that is used to trigger the slow-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add run numbers to pipeline runs to make them easier to see

5 participants