Fix CI/CD Pipeline Inconsistencies #13
Merged
lalvarezt
merged 6 commits into
main
from
claude/fix-cicd-inconsistencies-011CUrHwbNj9K2r2jwNMkdmh
Nov 6, 2025
This commit addresses several critical issues in the CI/CD workflows:
1. Fix broken baseline artifact storage:
- benchmark.yml now downloads baseline from update-baseline.yml workflow
- Previously tried to download from itself after baseline upload was removed
2. Standardize Rust toolchain versions to stable:
- CI workflow: Changed test and rustfmt jobs from nightly to stable
- CD workflow: Changed deb publish job from nightly to stable
- Ensures consistent builds and testing across all jobs
These changes restore benchmark comparisons and ensure consistent CI/CD behavior.
Add new workflow that allows repository owners to trigger benchmark
comparisons between any two refs via PR comments.
Features:
- Command syntax: /bench <ref1> <ref2>
- Owner-only security: Only repo owner can trigger
- Works on PR comments only (not regular issues)
- Compares any two commits, branches, or tags
- Posts detailed comparison report to PR
- Interactive reactions (👀 → 🚀 or 😕)
Use cases:
- Compare feature branch vs stable release
- Validate optimizations between specific commits
- Test performance across version boundaries
- Ad-hoc performance debugging
Documentation:
- Updated scripts/README.md with complete workflow documentation
- Added security model and usage examples
- Documented all three benchmark workflows
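The gating rules listed above (PR comments only, owner only, `/bench` prefix) can be sketched as a single predicate over the `issue_comment` webhook payload. This is an illustrative Python sketch, not the workflow's actual code; the function name `should_run_bench` is hypothetical, and treating a bare `/bench` as a command (so that a later parse step can reply with usage help) is an assumption.

```python
from typing import Any, Mapping


def should_run_bench(payload: Mapping[str, Any]) -> bool:
    """Return True only for a repo owner's /bench comment on a pull request."""
    issue = payload.get("issue", {})
    comment = payload.get("comment", {})
    repo = payload.get("repository", {})

    # issue_comment events fire for plain issues and PRs alike;
    # only PR-backed issues carry a "pull_request" key.
    is_pr = "pull_request" in issue

    # Accept "/bench" followed by arguments, or a bare "/bench"
    # (assumption: the parse step then answers with usage help).
    body = comment.get("body", "").strip()
    is_command = body == "/bench" or body.startswith("/bench ")

    # Owner-only: the comment author must be the repository owner.
    author = comment.get("user", {}).get("login")
    owner = repo.get("owner", {}).get("login")
    is_owner = author is not None and author == owner

    return is_pr and is_command and is_owner
```

Keeping the check as a pure function of the payload makes the security model easy to test without triggering a real workflow run.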
Removed automatic benchmarking workflows and consolidated to a single
on-demand command-based system with enhanced flexibility.
Changes:
- Removed benchmark.yml (automatic benchmarks on every PR/push)
- Removed update-baseline.yml (manual baseline updates)
- Enhanced bench-command.yml with optional parameters:
* /bench <ref1> <ref2> [iterations] [sizes]
* Defaults: iterations=100, sizes=1000,5000,10000
* Examples: /bench main HEAD
/bench v0.12.0 v0.13.0 100 1000,5000,10000
New features:
- Graceful handling when benchmark tool doesn't exist in refs
- Posts helpful error messages with commit info
- Validates all parameters before running
- Supports custom iterations and sizes per comparison
- Build logs included in artifacts for debugging
Benefits:
- No automatic runs = lower CI costs
- Owner controls when benchmarks run
- Flexible parameters per comparison
- Clear error messages for invalid refs
- Only one workflow to maintain
Documentation:
- Completely rewrote scripts/README.md
- Added comprehensive usage examples
- Documented all error cases
- Added use case scenarios
- Simplified troubleshooting guide
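The extended command syntax and its defaults can be sketched as a small parser. This is a minimal Python sketch under stated assumptions, not the code in bench-command.yml: the names `BenchRequest` and `parse_bench_command` are hypothetical, and the exact validation rules (positive integers only, two to four arguments) are assumed rather than taken from the workflow.

```python
import re
from dataclasses import dataclass
from typing import List

# Defaults as stated in the commit message.
DEFAULT_ITERATIONS = 100
DEFAULT_SIZES = [1000, 5000, 10000]

USAGE = "usage: /bench <ref1> <ref2> [iterations] [sizes]"
_POSITIVE_INT = re.compile(r"[1-9][0-9]*")


@dataclass
class BenchRequest:
    ref1: str
    ref2: str
    iterations: int
    sizes: List[int]


def _positive_int(token: str, what: str) -> int:
    """Parse a strictly positive decimal integer or raise ValueError."""
    if not _POSITIVE_INT.fullmatch(token):
        raise ValueError(f"invalid {what}: {token!r}. {USAGE}")
    return int(token)


def parse_bench_command(body: str) -> BenchRequest:
    """Parse '/bench <ref1> <ref2> [iterations] [sizes]' into a request."""
    parts = body.strip().split()
    if not parts or parts[0] != "/bench":
        raise ValueError(f"not a /bench command. {USAGE}")
    args = parts[1:]
    if not 2 <= len(args) <= 4:
        raise ValueError(USAGE)

    iterations = DEFAULT_ITERATIONS
    if len(args) >= 3:
        iterations = _positive_int(args[2], "iterations")

    sizes = list(DEFAULT_SIZES)
    if len(args) == 4:
        # Sizes arrive as one comma-separated token, e.g. "1000,5000".
        sizes = [_positive_int(t, "size") for t in args[3].split(",")]

    return BenchRequest(args[0], args[1], iterations, sizes)
```

For example, `parse_bench_command("/bench main HEAD")` fills in both defaults, while a malformed command raises `ValueError` carrying the usage string, which a caller could post back to the PR.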
Address critical issues to make the workflow less flaky and more reliable.
Fixes:
1. Permissions
- Added issues: write and pull-requests: write to check-permission job
- Allows posting comments and reactions without permission errors
2. Concurrency control
- Added concurrency group: bench-${{ issue.number }}
- Prevents overlapping benchmark runs on the same PR
- Uses cancel-in-progress: false to queue instead of cancel
3. Ref fetching
- Added targeted git fetch before validation
- Fetches refs, tags, and heads explicitly
- Prevents "ref not found" errors for remote branches/tags
4. Shell robustness
- Added set -euo pipefail to all bash scripts
- Ensures early failure on any error
- Prevents silent failures and undefined variables
5. Early parse feedback
- Added continue-on-error to parse step
- Posts immediate error message on invalid format
- Shows usage examples with all parameters
- Prevents workflow from proceeding with bad params
Benefits:
- More reliable ref resolution
- Clear error messages for invalid input
- No concurrent runs causing conflicts
- Proper permissions for all operations
- Fail-fast behavior prevents wasted CI time
Critical fixes for YAML parsing and workflow reliability:
1. YAML Syntax Errors (CRITICAL)
- Fixed template literals with markdown causing YAML parse errors
- Converted all multi-line template literals to array.join() pattern
- Lines starting with ** were interpreted as YAML alias markers
- Affected: parse error message, acknowledgment, results, error handler
2. Graceful Error Handling
- Removed process.exit(78) from "Post missing tool error" step
- Non-zero exit triggers handle-error job unnecessarily
- Existing if guards properly skip subsequent steps
- Workflow now exits gracefully without false failures
3. Concurrency Behavior
- Changed cancel-in-progress from false to true
- Previous: Multiple /bench commands queue and run sequentially
- Now: Newer /bench commands cancel older pending runs
- Saves CI time and provides faster feedback
All changes maintain functionality while fixing syntax and improving UX.
Implement the intended benchmark logic to ensure correct comparisons:
1. Automatic commit ordering
- Determines which ref is older (baseline) vs newer (current)
- Uses git commit timestamps for ordering
- Ensures baseline is always the older commit regardless of argument order
- Example: /bench v0.13.0 main → baseline=v0.13.0, current=main (if main is newer)
2. Same-commit detection
- Checks if both refs resolve to the same SHA
- Exits early with helpful message if refs are identical
- Prevents wasteful benchmark runs on identical code
- Example: /bench main origin/main → detects same commit, posts message
3. Updated all steps to use determined ordering
- Renamed steps: check_ref1_tool → check_baseline_tool, check_ref2_tool → check_current_tool
- Updated benchmark steps to use baseline_ref and current_ref
- Updated artifact names: benchmark_ref1.json → benchmark_baseline.json
- Updated build logs: build_ref1.log → build_baseline.log
- Results always show: Baseline (older) vs Current (newer)
4. Improved error messages
- Missing tool errors now indicate "baseline (older)" vs "current (newer)"
- Same-commit message explains refs are identical
- Clear distinction between user input and determined ordering
Benefits:
- Consistent baseline/current semantics regardless of argument order
- No wasted CI time on identical commits
- Clear, predictable results in every scenario
- Better user experience with informative messages
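The ordering and same-commit rules described above can be sketched in Python as follows. This is an illustration under assumptions, not the workflow's implementation: `ResolvedRef`, `resolve_ref`, and `order_refs` are hypothetical names, the committer date (`%ct`) is assumed to be the "commit timestamp" in question, and on equal timestamps the first argument is arbitrarily treated as the baseline.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResolvedRef:
    name: str       # ref exactly as the user typed it
    sha: str        # full commit SHA the ref resolves to
    timestamp: int  # committer date in Unix seconds (assumed ordering key)


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command in repo_dir and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def resolve_ref(name: str, repo_dir: str = ".") -> ResolvedRef:
    """Resolve a branch, tag, or SHA to its commit and committer timestamp."""
    # "^{commit}" peels annotated tags down to the commit they point at.
    sha = _git(repo_dir, "rev-parse", "--verify", f"{name}^{{commit}}")
    timestamp = int(_git(repo_dir, "show", "-s", "--format=%ct", sha))
    return ResolvedRef(name, sha, timestamp)


def order_refs(
    a: ResolvedRef, b: ResolvedRef
) -> Optional[Tuple[ResolvedRef, ResolvedRef]]:
    """Return (baseline, current) with the older commit first.

    Returns None when both refs resolve to the same commit, so the
    caller can post a message and skip the benchmark run.
    """
    if a.sha == b.sha:
        return None
    # Tie-break (assumption): equal timestamps keep the argument order.
    return (a, b) if a.timestamp <= b.timestamp else (b, a)
```

A caller would run `order_refs(resolve_ref(ref1), resolve_ref(ref2))` after fetching. Note that timestamps reflect when a commit was created, not ancestry, so a rebased or cherry-picked commit can sort as newer than the branch it was based on.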
lalvarezt
added a commit
that referenced
this pull request
Nov 6, 2025
* fix(ci): resolve CI/CD inconsistencies and broken baseline storage
This commit addresses several critical issues in the CI/CD workflows:
1. Fix broken baseline artifact storage:
- benchmark.yml now downloads baseline from update-baseline.yml
workflow
- Previously tried to download from itself after baseline upload was
removed
2. Standardize Rust toolchain versions to stable:
- CI workflow: Changed test and rustfmt jobs from nightly to stable
- CD workflow: Changed deb publish job from nightly to stable
- Ensures consistent builds and testing across all jobs
These changes restore benchmark comparisons and ensure consistent CI/CD
behavior.
* feat(ci): add /bench command for on-demand PR benchmark comparisons
Add new workflow that allows repository owners to trigger benchmark
comparisons between any two refs via PR comments.
Features:
- Command syntax: /bench <ref1> <ref2>
- Owner-only security: Only repo owner can trigger
- Works on PR comments only (not regular issues)
- Compares any two commits, branches, or tags
- Posts detailed comparison report to PR
- Interactive reactions (👀 → 🚀 or 😕)
Use cases:
- Compare feature branch vs stable release
- Validate optimizations between specific commits
- Test performance across version boundaries
- Ad-hoc performance debugging
Documentation:
- Updated scripts/README.md with complete workflow documentation
- Added security model and usage examples
- Documented all three benchmark workflows
* refactor(ci): simplify benchmark system to single /bench command
Removed automatic benchmarking workflows and consolidated to a single
on-demand command-based system with enhanced flexibility.
Changes:
- Removed benchmark.yml (automatic benchmarks on every PR/push)
- Removed update-baseline.yml (manual baseline updates)
- Enhanced bench-command.yml with optional parameters:
* /bench <ref1> <ref2> [iterations] [sizes]
* Defaults: iterations=100, sizes=1000,5000,10000
* Examples: /bench main HEAD
/bench v0.12.0 v0.13.0 100 1000,5000,10000
New features:
- Graceful handling when benchmark tool doesn't exist in refs
- Posts helpful error messages with commit info
- Validates all parameters before running
- Supports custom iterations and sizes per comparison
- Build logs included in artifacts for debugging
Benefits:
- No automatic runs = lower CI costs
- Owner controls when benchmarks run
- Flexible parameters per comparison
- Clear error messages for invalid refs
- Only one workflow to maintain
Documentation:
- Completely rewrote scripts/README.md
- Added comprehensive usage examples
- Documented all error cases
- Added use case scenarios
- Simplified troubleshooting guide
* fix(ci): improve bench workflow robustness and reliability
Address critical issues to make the workflow less flaky and more
reliable.
Fixes:
1. Permissions
- Added issues: write and pull-requests: write to check-permission
job
- Allows posting comments and reactions without permission errors
2. Concurrency control
- Added concurrency group: bench-${{ issue.number }}
- Prevents overlapping benchmark runs on the same PR
- Uses cancel-in-progress: false to queue instead of cancel
3. Ref fetching
- Added targeted git fetch before validation
- Fetches refs, tags, and heads explicitly
- Prevents "ref not found" errors for remote branches/tags
4. Shell robustness
- Added set -euo pipefail to all bash scripts
- Ensures early failure on any error
- Prevents silent failures and undefined variables
5. Early parse feedback
- Added continue-on-error to parse step
- Posts immediate error message on invalid format
- Shows usage examples with all parameters
- Prevents workflow from proceeding with bad params
Benefits:
- More reliable ref resolution
- Clear error messages for invalid input
- No concurrent runs causing conflicts
- Proper permissions for all operations
- Fail-fast behavior prevents wasted CI time
* fix(ci): fix YAML syntax errors and improve workflow behavior
Critical fixes for YAML parsing and workflow reliability:
1. YAML Syntax Errors (CRITICAL)
- Fixed template literals with markdown causing YAML parse errors
- Converted all multi-line template literals to array.join() pattern
- Lines starting with ** were interpreted as YAML alias markers
- Affected: parse error message, acknowledgment, results, error
handler
2. Graceful Error Handling
- Removed process.exit(78) from "Post missing tool error" step
- Non-zero exit triggers handle-error job unnecessarily
- Existing if guards properly skip subsequent steps
- Workflow now exits gracefully without false failures
3. Concurrency Behavior
- Changed cancel-in-progress from false to true
- Previous: Multiple /bench commands queue and run sequentially
- Now: Newer /bench commands cancel older pending runs
- Saves CI time and provides faster feedback
All changes maintain functionality while fixing syntax and improving UX.
* feat(ci): add intelligent commit ordering and same-commit detection
Implement the intended benchmark logic to ensure correct comparisons:
1. Automatic commit ordering
- Determines which ref is older (baseline) vs newer (current)
- Uses git commit timestamps for ordering
- Ensures baseline is always the older commit regardless of argument
order
- Example: /bench v0.13.0 main → baseline=v0.13.0, current=main (if
main is newer)
2. Same-commit detection
- Checks if both refs resolve to the same SHA
- Exits early with helpful message if refs are identical
- Prevents wasteful benchmark runs on identical code
- Example: /bench main origin/main → detects same commit, posts
message
3. Updated all steps to use determined ordering
- Renamed steps: check_ref1_tool → check_baseline_tool,
check_ref2_tool → check_current_tool
- Updated benchmark steps to use baseline_ref and current_ref
- Updated artifact names: benchmark_ref1.json →
benchmark_baseline.json
- Updated build logs: build_ref1.log → build_baseline.log
- Results always show: Baseline (older) vs Current (newer)
4. Improved error messages
- Missing tool errors now indicate "baseline (older)" vs "current
(newer)"
- Same-commit message explains refs are identical
- Clear distinction between user input and determined ordering
Benefits:
- Consistent baseline/current semantics regardless of argument order
- No wasted CI time on identical commits
- Clear, predictable results in every scenario
- Better user experience with informative messages
---------
Co-authored-by: Claude <noreply@anthropic.com>
lalvarezt
added a commit
that referenced
this pull request
Nov 9, 2025
No description provided.