Conversation


@drazisil-codecov drazisil-codecov commented Dec 10, 2025

Description

Centralizes retry handling in BaseCodecovTask.retry() and fixes infinite retry bugs in bundle analysis tasks.

Changes:

  • Override BaseCodecovTask.retry() to handle max_retries checks, logging, metrics, and exponential backoff
  • Remove max_retries parameter from LockManager.locked(); delegate retry limits to tasks
  • Remove safe_retry() and _retry() helper methods
  • Add contextual retry logging (repoid, commitid, report_type) nested under context key
  • Fix infinite retry bugs in BundleAnalysisProcessorTask and BundleAnalysisNotifyTask
  • Refactor tests to remove mocks of internal retry() logic and test real behavior

Impact:

  • Prevents infinite retries when max_retries is exceeded
  • Standardizes retry behavior across all tasks
  • Improves observability with structured retry logs
  • Simplifies retry logic by removing redundant helpers

Note

Unifies and hardens retry behavior across worker tasks while simplifying lock handling and improving observability.

  • Centralized retries: Override BaseCodecovTask.retry() to enforce max_retries, apply exponential backoff, update attempts header, emit metrics, and add structured context in logs; adds default max_retries from config and _emit_queue_metrics() helper
  • Lock manager simplification: Remove max_retries from LockManager.locked(); it now raises LockRetry(countdown) and logs release in finally
  • Task updates: Migrate bundle_analysis_processor, bundle_analysis_notify, upload, upload_finisher, preprocess_upload, manual_trigger, test_results_finisher, and test_analytics_notifier to use self.request.retries and the new retry(); replace custom/safe retry paths and explicit max-retry checks; handle MaxRetriesExceededError where needed
  • Tests: Refactor to exercise real retry behavior (no internal retry mocks), add coverage for capped backoff, queue-time metrics, and max-retries-exceeded cases to prevent infinite retries
  • Minor: Improve DB error analysis handling, tidy logging/messages, and small API/typing cleanups
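The centralized pattern described above can be sketched with simplified stand-ins (this is not the actual Celery/`BaseCodecovTask` code; the default value and backoff constants are assumptions for illustration):

```python
class MaxRetriesExceededError(Exception):
    """Stand-in for celery.exceptions.MaxRetriesExceededError."""


TASK_MAX_RETRIES_DEFAULT = 5  # assumed config default


class BaseTaskSketch:
    max_retries = TASK_MAX_RETRIES_DEFAULT

    def __init__(self, retries=0):
        self.current_retries = retries  # Celery exposes this as self.request.retries

    def retry(self, countdown=None, max_retries=None):
        limit = self.max_retries if max_retries is None else max_retries
        if limit is not None and self.current_retries >= limit:
            # centralized cap: no task can loop forever once the limit is hit
            raise MaxRetriesExceededError(f"exceeded {limit} retries")
        if countdown is None:
            # exponential backoff: 30s, 60s, 120s, ... capped at one hour
            countdown = min(30 * (2 ** self.current_retries), 3600)
        return countdown  # the real override re-enqueues the task with this delay
```

Tasks then call `self.retry()` without reimplementing the cap, and a task that has hit its limit gets `MaxRetriesExceededError` instead of retrying forever.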

Written by Cursor Bugbot for commit e1947a7.

…essor

- Added logic to prevent infinite retries when the maximum retry limit is exceeded.
- Introduced error logging to capture details when max retries are reached.
- Enhanced unit tests to verify behavior for both exceeding and not exceeding max retries.

sentry bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 90.29126% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.90%. Comparing base (aca0c54) to head (e1947a7).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
apps/worker/tasks/base.py 90.00% 6 Missing ⚠️
apps/worker/tasks/test_results_finisher.py 20.00% 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main     #608    +/-   ##
========================================
  Coverage   93.89%   93.90%            
========================================
  Files        1286     1285     -1     
  Lines       46807    46685   -122     
  Branches     1517     1512     -5     
========================================
- Hits        43951    43841   -110     
+ Misses       2547     2535    -12     
  Partials      309      309            
Flag Coverage Δ
workerintegration 58.57% <52.42%> (+0.31%) ⬆️
workerunit 91.34% <90.29%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.


codecov-notifications bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 90.29126% with 10 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
apps/worker/tasks/base.py 90.00% 6 Missing ⚠️
apps/worker/tasks/test_results_finisher.py 20.00% 4 Missing ⚠️


- Updated the LockManager's locked method to eliminate the max_retries parameter, simplifying retry handling.
- Adjusted related unit tests to reflect the removal of max_retries, ensuring proper logging and behavior for retry scenarios.
- Enhanced error handling to ensure retries are managed through the task's retry method instead.
@drazisil-codecov drazisil-codecov changed the title feat(worker): Implement max retries handling for bundle analysis processor feat(worker): Centralize retry logic in BaseCodecovTask and fix infinite retry bugs Dec 10, 2025
- Introduced a sentinel object to differentiate between "not provided" and "explicitly None" for max_retries.
- Updated the retry method to accommodate unlimited retries when max_retries is set to None, aligning with Celery's behavior.
- Simplified logic for determining the effective max_retries value during task retries.
@drazisil-codecov drazisil-codecov marked this pull request as ready for review December 10, 2025 15:23
current_retries = self.request.retries if hasattr(self, "request") else 0
task_max_retries = max_retries if max_retries is not None else self.max_retries

request = getattr(self, "request", None)
Contributor:

can this just return out if self.request is None? The next line is going to freak out trying to call retries on None

Contributor Author:

Good catch! Fixed by adding an early check for request is None and handling it gracefully. The code now safely handles the case where request might not exist, using default values instead of potentially accessing attributes on None.
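The guard described in that reply might look like the following sketch (a hypothetical helper, not the actual PR code; in Celery, `task.request` can be absent or empty in eager/test contexts):

```python
def resolve_retry_state(task):
    """Safely read retry state; `task.request` may be missing entirely."""
    request = getattr(task, "request", None)
    if request is None:
        # No request context: fall back to defaults rather than crash on None
        return 0, {}
    return getattr(request, "retries", 0), getattr(request, "kwargs", None) or {}
```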

Comment on lines 250 to 253
# Resolve max_retries for checking/logging
# If max_retries is not provided, use self.max_retries from task class
# If max_retries=None is explicitly provided, means unlimited retries (matches Celery behavior)
# self.max_retries is inherited from BaseCodecovTask (TASK_MAX_RETRIES_DEFAULT) or overridden by task
Contributor:

can we eliminate some of the comments in this file, it's getting hard to read the code, either aggregate the important chunks at the top of the function definition or just remove

Contributor Author:

Good point! Reduced excessive inline comments and consolidated the important information. The code is now much more readable while still maintaining clarity on the key logic.

log = logging.getLogger("worker")

# Sentinel object to distinguish "not provided" from "explicitly None" for max_retries
_NOT_PROVIDED = object()
Contributor:

this is a little bit weird to make a comparison to object(). I'd prefer if this were a singleton class or some other primitive

Contributor Author:

Good suggestion! Replaced object() with a singleton class _NotProvided which is clearer and more maintainable. This makes the sentinel pattern more explicit and easier to understand.

# If max_retries is not provided, use self.max_retries from task class
# If max_retries=None is explicitly provided, means unlimited retries (matches Celery behavior)
# self.max_retries is inherited from BaseCodecovTask (TASK_MAX_RETRIES_DEFAULT) or overridden by task
if max_retries is _NOT_PROVIDED:
Contributor:

I feel like the whole point of this is to override a subclasses max_retries, seems like we should just rely on the subclass having the correct max set?

Contributor Author:

Good question! The ability to override max_retries is actually needed in practice. For example, upload_processor.py uses error-specific retry limits (self.retry(max_retries=error.max_retries, ...)) where different errors may have different retry limits than the task default. This provides flexibility for dynamic retry behavior based on error conditions while still defaulting to the task's max_retries when not specified.

Contributor:

thanks AI, can we rename max_retries then to something like max_retries_override

Comment on lines 261 to 263
request_kwargs = {}
if request and hasattr(request, "kwargs"):
request_kwargs = request.kwargs if request.kwargs is not None else {}
Contributor:

does this need to not just be request.kwargs? seems like we don't need to copy the data

Contributor Author:

Good catch! Simplified kwargs handling - we now use request.kwargs directly when available instead of creating unnecessary copies. The code is cleaner and more efficient.

if request and hasattr(request, "kwargs"):
request_kwargs = request.kwargs if request.kwargs is not None else {}
# Note: kwargs parameter is Celery's special parameter for task kwargs on retry
retry_kwargs = kwargs if kwargs is not None else {}
Contributor:

can we just default kwargs in the function def to {}

Contributor Author:

Thanks for the suggestion! However, using a mutable default (kwargs={}) is a Python anti-pattern. I kept kwargs=None as the default and handle it in the body, which is safer and follows Python best practices. The code now properly handles both None and empty dict cases.
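The anti-pattern that reply avoids is easy to demonstrate with a toy example (not PR code): a mutable default is evaluated once at `def` time and shared across every call.

```python
def bad_append(item, bucket=[]):  # anti-pattern: one list shared across calls
    bucket.append(item)
    return bucket


def good_append(item, bucket=None):  # safe: a fresh list per call
    bucket = [] if bucket is None else bucket
    bucket.append(item)
    return bucket
```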

Comment on lines 313 to 320
# Pass max_retries to Celery:
# - If not provided, use self.max_retries (inherited from BaseCodecovTask or overridden)
# - If explicitly provided (including None), pass as-is (None = unlimited in Celery)
if max_retries is _NOT_PROVIDED:
# Not provided, use task's max_retries (inherited from BaseCodecovTask or overridden)
celery_max_retries = self.max_retries
else:
# Explicitly provided, pass as-is (None means unlimited in Celery, matches superclass behavior)
Contributor:

same, this is hard to read with so many comments

Contributor Author:

Agreed! Removed excessive comments and simplified the code. The important information is now in the docstring, and the implementation is cleaner and easier to read.

f"Task {self.name} exceeded max retries",
extra={
"task_name": self.name,
"context": context if context else None,
Contributor:

does this need to be its own var instead of just explicitly putting the variables that we pull from kwargs

Contributor Author:

To clarify, this is the UploadContext. It's being added to the kwargs with this PR so the base class retry logic has access to it for logging. These logs match the structure of the other logs.

Comment on lines 286 to 289
if UploadFlow.has_begun():
UploadFlow.log(UploadFlow.UNCAUGHT_RETRY_EXCEPTION)
if TestResultsFlow.has_begun():
TestResultsFlow.log(TestResultsFlow.UNCAUGHT_RETRY_EXCEPTION)
Contributor:

this is too specific for a base class to have

Contributor Author:

Excellent point! Removed UploadFlow and TestResultsFlow logging from the base class as they are too specific. These flows can handle their own logging in their specific task implementations if needed. The base class should remain generic.

- Changed the logging of context fields to use a nested dictionary for improved clarity.
- Updated error and warning logs to include the new context structure, enhancing the detail of logged information during task retries and max retries exceeded scenarios.
Comment on lines 503 to 513

def test_bundle_analysis_process_upload_rate_limit_error(
caplog,
dbsession,
mocker,
mock_storage,
):
"""Test that rate_limit_error (retryable) causes retry with correct countdown"""
storage_path = (
"v1/repos/testing/ed1bdd67-8fd2-4cdb-ac9e-39b99e4a3892/bundle_report.sqlite"
)

This comment was marked as outdated.

Contributor Author:

Good catch! This bug was already fixed in the implementation. The code initializes result: ProcessingResult | None = None before the try block (line 161 in apps/worker/tasks/bundle_analysis_processor.py) and checks if result is not None: in the finally block (line 218) before accessing result.bundle_report. This ensures that if an exception occurs before result is assigned, the finally block safely skips cleanup without raising a NameError.
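The initialize-before-try pattern that reply describes can be sketched generically (a hypothetical helper, not the actual `bundle_analysis_processor` code):

```python
def process_with_cleanup(work, cleanup):
    """Initialize `result` before the try so the finally block never hits NameError."""
    result = None
    try:
        result = work()
        return result
    finally:
        if result is not None:  # skip cleanup if work() raised before assignment
            cleanup(result)
```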

"""Sentinel class to distinguish 'not provided' from 'explicitly None' for max_retries"""


_NOT_PROVIDED = _NotProvided()
Contributor Author:

@thomasrockhu-codecov I don't understand why a named class is not the same as an unnamed class. This is different from how I do singletons in JavaScript. Also, while the comment says "singleton", is that really what we are doing here? The idea is to check for "set vs not-set", because Celery treats None differently from "not set".

Contributor:

hmmm... I guess this is kind of weird too huh? can we make _NOT_PROVIDED = -1? I am assuming that retries should never be negative

unrelated, here's a reasonable writeup on how to make singletons in Python https://www.geeksforgeeks.org/python/singleton-pattern-in-python-a-complete-guide/

Contributor Author
@drazisil-codecov drazisil-codecov left a comment

Overall Assessment

This is a solid PR that centralizes retry logic and fixes important infinite retry bugs. The approach is clean and well-tested. Here are some suggestions for improvement:

Strengths

✅ Centralizes retry logic in one place
✅ Fixes infinite retry bugs
✅ Good test coverage (95.35%)
✅ Structured logging with context
✅ Removes redundant helper methods

Suggestions for Improvement

1. Missing Test Coverage (2 lines in base.py)

The coverage report indicates 2 lines are missing coverage. Based on the code, these are likely:

  • Lines 254-258: The if request is None: edge case branch
  • Line 262: The hasattr(request, "kwargs") check when request.kwargs is falsy

Recommendation: Add tests for these edge cases to reach 100% coverage.

2. Redundant Check in request.kwargs Handling

request_kwargs = (
    request.kwargs if hasattr(request, "kwargs") and request.kwargs else {}
)

The and request.kwargs check is redundant since hasattr only checks existence. If request.kwargs exists but is None, this would still use {}, which is correct, but the check could be simplified.

Recommendation: Consider simplifying to:

request_kwargs = getattr(request, "kwargs", None) or {}

3. Inconsistent Retry Check in Bundle Analysis

In bundle_analysis_processor.py line 182, there's still a manual self.request.retries < self.max_retries check before calling self.retry(). This is now redundant since self.retry() handles max_retries internally.

Recommendation: Remove the manual check to be consistent with the centralized approach:

if result.error and result.error.is_retryable:
    log.warning(...)
    self.retry(countdown=30 * (2**self.request.retries))

4. Code Duplication in retry() Method

The super().retry() calls at lines 316-330 are nearly identical, differing only by the kwargs parameter.

Recommendation: Simplify to:

retry_kwargs_dict = {"countdown": countdown, "exc": exc, "max_retries": celery_max_retries}
if kwargs:
    retry_kwargs_dict["kwargs"] = kwargs
retry_kwargs_dict.update(other_kwargs)
return super().retry(**retry_kwargs_dict)

5. Documentation Enhancement

The docstring for retry() could be clearer about the sentinel pattern usage.

Recommendation: Add a note explaining why _NOT_PROVIDED is used:

"""
Override Celery's retry() to automatically check max_retries and track metrics.

The max_retries parameter uses a sentinel pattern (_NOT_PROVIDED) to distinguish:
- Not provided: Uses self.max_retries from the task class
- Explicitly None: Unlimited retries (matches Celery behavior)
- Explicit integer: Override the task's max_retries for this retry
"""

6. Potential Edge Case: Empty String Values

The context extraction checks is not None, but doesn't handle empty strings. If commitid="", it would still be included in context.

Recommendation: Consider filtering out empty strings:

if all_kwargs.get("commitid") not in (None, ""):
    context["commitid"] = all_kwargs.get("commitid")

7. Lock Manager Documentation

The docstring for LockManager.locked() mentions that tasks should handle max_retries via self.retry(), but it might be helpful to add an example.

Recommendation: Add a brief example in the docstring showing the pattern:

"""
Raises:
    LockRetry: If lock cannot be acquired, with countdown for retry scheduling.
              The calling task should handle max_retries checking via self.retry().
              
Example:
    try:
        with lock_manager.locked(LockType.UPLOAD, retry_num=self.request.retries):
            # do work
    except LockRetry as e:
        self.retry(countdown=e.countdown)
"""

Minor Issues

  • Line 184 in bundle_analysis_processor.py: log.warn should be log.warning (warn is deprecated but still works)

Questions

  1. Should the request is None case ever happen in production? If not, should it raise an exception instead of silently using defaults?
  2. Is there a reason to keep the manual max_retries check in bundle_analysis_processor.py, or should it be removed for consistency?

Overall, this is a well-executed refactoring that improves code maintainability and fixes critical bugs. The suggestions above are mostly minor improvements and edge case handling.

Contributor
@thomasrockhu-codecov thomasrockhu-codecov left a comment

Please address the comments, but this is fine

log.info(
"Releasing lock",
extra={
"repoid": self.repoid,
Contributor:

nit: alphabetize


"""
request = getattr(self, "request", None)
if request is None:
# If no request, can't retry - this shouldn't happen in normal operation
Contributor:

do we actually want this behavior? I think if there's no request we should throw an error and return

else:
current_retries = request.retries if hasattr(request, "retries") else 0
request_kwargs = (
request.kwargs if hasattr(request, "kwargs") and request.kwargs else {}
Contributor:

same is true here, do we actually want to continue if kwargs isn't in the request? We can probably group this with above


if max_retries is _NOT_PROVIDED:
task_max_retries = self.max_retries
else:
task_max_retries = max_retries
Contributor:

maybe just task_max_retries = self.max_retries if max_retries is NOT_PROVIDED else max_retries or similar

Comment on lines 274 to 280
context = {}
if all_kwargs.get("commitid") is not None:
context["commitid"] = all_kwargs.get("commitid")
if all_kwargs.get("repoid") is not None:
context["repoid"] = all_kwargs.get("repoid")
if all_kwargs.get("report_type") is not None:
context["report_type"] = all_kwargs.get("report_type")
Contributor:

This can be cleaner, since we only care about those 3 attributes, just loop through them
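That suggestion amounts to something like this sketch (the key names come from the diff above; the helper itself is hypothetical):

```python
CONTEXT_KEYS = ("commitid", "repoid", "report_type")


def build_context(all_kwargs):
    # One loop instead of three near-identical if-blocks
    return {k: all_kwargs[k] for k in CONTEXT_KEYS if all_kwargs.get(k) is not None}
```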

},
)

celery_max_retries = (
Contributor:

does this need to be different than task_max_retries

- Enhanced error handling for MaxRetriesExceededError in BundleAnalysisNotifyTask, TestResultsFinisherTask, and UploadFinisherTask.
- Updated retry logic to ensure consistent behavior when max retries are exceeded.
- Refactored import statements for clarity and organization.
- Added a utility function to retrieve request headers from the Celery request object.
TASK_TIME_IN_QUEUE = Histogram(
"worker_tasks_timers_time_in_queue_seconds",
"Time in seconds spent waiting in the queue before being run",
"Time in {TODO} spent waiting in the queue before being run",

TODO placeholder accidentally committed in metric description

The metric description contains a {TODO} placeholder that was accidentally committed. The original text "Time in seconds spent waiting in the queue before being run" was modified to "Time in {TODO} spent waiting in the queue before being run". This will appear in production monitoring dashboards and Prometheus metric descriptions, making the metric confusing for operators.


if max_retries_override != _NOT_PROVIDED:
task_max_retries = max_retries_override
elif "max_retries" in other_kwargs:
task_max_retries = other_kwargs.pop("max_retries")

Double-pop causes wrong max_retries passed to Celery

When max_retries is passed via **other_kwargs (e.g., self.retry(max_retries=5, countdown=10)), it's popped from other_kwargs at line 316 to set task_max_retries. Later at line 373, the code checks "max_retries" in other_kwargs again, but since it was already popped, the check is False and falls through to celery_max_retries = self.max_retries. This means tasks using the old max_retries=X pattern will have the BaseCodecovTask check use X but Celery's retry will use self.max_retries (possibly a different value), causing inconsistent retry behavior. Tasks like http_request.py, manual_trigger.py, and test_analytics_notifier.py use this pattern.
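The reported double-pop can be reproduced with a toy version of the flow (constants and the fallback default are assumptions, not the actual worker code):

```python
_NOT_PROVIDED = object()
TASK_DEFAULT = 5


def buggy_resolve(max_retries_override=_NOT_PROVIDED, **other_kwargs):
    """Mirrors the reported flow: the second lookup never sees 'max_retries'."""
    if max_retries_override is not _NOT_PROVIDED:
        task_max_retries = max_retries_override
    elif "max_retries" in other_kwargs:
        task_max_retries = other_kwargs.pop("max_retries")  # first (and only) pop
    else:
        task_max_retries = TASK_DEFAULT
    # later: the key was already popped, so this check is always False
    # and the limit passed to Celery silently falls back to the task default
    celery_max_retries = (
        other_kwargs["max_retries"] if "max_retries" in other_kwargs else TASK_DEFAULT
    )
    return task_max_retries, celery_max_retries
```

Calling `buggy_resolve(max_retries=2)` shows the divergence: the pre-check uses 2 while the value handed to Celery is the default.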


error=Errors.INTERNAL_RETRYING,
)
self.retry(countdown=retry.countdown)
except MaxRetriesExceededError:

Missing INTERNAL_OUT_OF_RETRIES breadcrumb when max retries exceeded

When max retries are exceeded, the code sends Errors.INTERNAL_RETRYING breadcrumb before attempting self.retry(), then catches MaxRetriesExceededError and returns silently. The old code explicitly checked max retries first and sent Errors.INTERNAL_OUT_OF_RETRIES breadcrumb when exceeded. Now INTERNAL_OUT_OF_RETRIES is never sent, and INTERNAL_RETRYING is incorrectly sent even when the task is about to give up. This affects observability for systems monitoring upload finisher retry exhaustion.
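One way to preserve both breadcrumbs is to check the cap before deciding which one to emit — a sketch only (breadcrumb names are taken from the comment above; the task shape is a stand-in, not the worker's `Errors` API):

```python
def retry_or_give_up(task, send_breadcrumb):
    """Check the cap first so the breadcrumb matches the actual outcome."""
    if task.retries >= task.max_retries:
        send_breadcrumb("INTERNAL_OUT_OF_RETRIES")  # retries exhausted: give up
        return False
    send_breadcrumb("INTERNAL_RETRYING")  # a retry really is being scheduled
    task.retries += 1
    return True
```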


- Changed the description of TASK_TIME_IN_QUEUE for clarity.
- Updated retry handling in BundleAnalysisNotifyTask and TestResultsFinisherTask to use self.request.retries instead of self.attempts for consistency.
- Ensured safe access to the request object in BaseCodecovTask's _emit_queue_metrics method.
- Removed custom MaxRetriesExceededError class and updated import statements for clarity.
- Streamlined retry logic to utilize self.request.retries consistently across the task.
- Enhanced error handling for LockRetry exceptions, ensuring proper retry behavior without exceeding max retries.
- Updated unit tests to reflect changes in retry logic and error handling.
- Updated retry logic to use task_max_retries directly for Celery's max_retries parameter.
- Enhanced request object safety checks to ensure proper handling when attributes are missing.
error=Errors.INTERNAL_RETRYING,
)
self.retry(countdown=retry.countdown)
except MaxRetriesExceededError:

Missing INTERNAL_OUT_OF_RETRIES breadcrumb when max retries exceeded

The refactored code logs Errors.INTERNAL_RETRYING breadcrumb before calling self.retry(), then silently returns if MaxRetriesExceededError is caught. The old code first checked if max attempts were exceeded and logged Errors.INTERNAL_OUT_OF_RETRIES before returning. Now when max retries are exceeded, the wrong error (INTERNAL_RETRYING) is logged and INTERNAL_OUT_OF_RETRIES is never recorded, degrading observability and debugging capability for retry exhaustion scenarios. This pattern appears in both run_impl and _handle_finisher_lock methods.


now = datetime.now()
delta = now - enqueued_time

queue_name = request.get("delivery_info", {}).get("routing_key", None)

Missing delivery_info type check may cause AttributeError

The new _emit_queue_metrics method chains .get() calls without checking if delivery_info is a dict. The old code used isinstance(delivery_info, dict) to guard against non-dict values. If delivery_info is explicitly set to None or another non-dict type, request.get("delivery_info", {}).get("routing_key", None) will raise an AttributeError since the default {} only applies when the key is missing, not when it exists with a non-dict value.
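A minimal guard for that case (a hypothetical helper; the real `_emit_queue_metrics` has more context around it):

```python
def routing_key(request_dict):
    """Guard against delivery_info being present but not a dict."""
    delivery_info = request_dict.get("delivery_info")
    if not isinstance(delivery_info, dict):
        # a bare .get("delivery_info", {}).get(...) chain would raise
        # AttributeError here, since the default only applies when the
        # key is missing, not when its value is None
        return None
    return delivery_info.get("routing_key")
```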


max_retries=self.max_retries,
countdown=30 * (2**self.request.retries),
)
self.retry(countdown=30 * (2**self.request.retries))

Upload state not updated when max retries exceeded

When a retryable error occurs and max retries are exceeded, self.retry() at line 190 raises MaxRetriesExceededError, causing result.update_upload(carriedforward=carriedforward) at line 191 to never execute. The old code explicitly handled this scenario by calling update_upload() to set the upload state to "error" before returning. Now the upload remains in its current state (likely "started" or "processing") instead of being marked as "error", leaving the database in an inconsistent state that misrepresents the actual processing outcome.
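One fix shape for this is to catch the exhaustion and still record the error state — a sketch under assumptions (the `update_upload` callable and `"error"` state stand in for the real upload model):

```python
class MaxRetriesExceededError(Exception):
    """Stand-in for celery.exceptions.MaxRetriesExceededError."""


def retry_or_mark_error(do_retry, update_upload):
    """If the retry itself fails because retries are exhausted,
    still persist the error state so the DB stays consistent."""
    try:
        do_retry()
    except MaxRetriesExceededError:
        update_upload("error")  # mark the upload before giving up
        return False
    return True
```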

