fix(sentry apps): Self-heal/regenerate broken ServiceHooks #96712

Open · wants to merge 5 commits into master
Conversation

Christinarlong (Contributor)

Finally got around to doing this... In the past, some Sentry Apps were updated but their ServiceHooks (read: webhooks) weren't updated properly for various reasons, leaving people with broken webhooks. That also caused these Sentry errors to surface to us:

  1. https://sentry.sentry.io/issues/6682915846/?project=1&query=is%3Aunresolved%20missing_servicehook&referrer=issue-stream
  2. https://sentry.sentry.io/issues/6678511610/?project=1&query=operation_type%3Asend_webhook%20event_type%3Aissue.unresolved&referrer=issue-stream

This PR adds a task that gradually regenerates the broken webhooks by re-running the create_or_update_service_hooks_for_installation function. That function should have run during the initial webhook update but, for various reasons (timeouts, errors, etc.), didn't complete.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jul 29, 2025
Comment on lines 810 to 812
def regenerate_service_hook_for_installation(
installation_id: int, servicehook_events: list[str] | None
) -> None:
Christinarlong (Contributor, Author)

This function would be used whenever we encounter the SentryAppWebhookFailureReason.MISSING_SERVICEHOOK or SentryAppWebhookFailureReason.EVENT_NOT_IN_SERVCEHOOK failure cases.

In both cases we should normally have an installation with a working ServiceHook, since we use the Sentry app's event list as the source of truth for all installations.

This logic was put into a separate task because I didn't want a logically distinct operation to potentially pollute the results/info of another task. It also lets us gather better data on when this operation is run.


codecov bot commented Jul 29, 2025

Codecov Report

❌ Patch coverage is 37.50000% with 10 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines                       Patch %   Lines
src/sentry/sentry_apps/tasks/sentry_apps.py    33.33%    10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #96712      +/-   ##
==========================================
- Coverage   80.63%   80.15%   -0.49%     
==========================================
  Files        8475     8483       +8     
  Lines      373193   379104    +5911     
  Branches    24205    24205              
==========================================
+ Hits       300929   303857    +2928     
- Misses      71891    74874    +2983     
  Partials      373      373              

@Christinarlong Christinarlong marked this pull request as ready for review July 30, 2025 00:06
@Christinarlong Christinarlong requested review from a team as code owners July 30, 2025 00:06
cursor[bot]

This comment was marked as outdated.

assert installation is not None, "Installation must exist to regenerate service hooks"
app_events = installation.sentry_app.events

if servicehook_events is None or set(servicehook_events) != set(app_events):
Christinarlong (Contributor, Author)

Thinking about this more, I'm not sure we should even have these checks; I think we can assume the caller has already validated that the ServiceHook is in a bad state.

Member

i agree with your intuition: i think this task should only be called if this if statement is true.

@iamrajjoshi (Member) left a comment

can we add a metric to the task so we can monitor how often it is triggered, and create a redash dashboard/query to show us the burndown rate of the broken servicehooks?

@@ -799,6 +799,38 @@ def send_webhooks(installation: RpcSentryAppInstallation, event: str, **kwargs:
)


@instrumented_task(
"sentry.sentry_apps.tasks.sentry_apps.regenerate_service_hook_for_installation",
taskworker_config=TaskworkerConfig(
Member

just making sure we don't have to add anything extra from TASK_OPTIONS in the taskworker config

Christinarlong (Contributor, Author)

Er, I'm not sure if we need to add anything else. Oh, I did bork the namespace though; I highkey just copied this from the other tasks.

installation_id: int, servicehook_events: list[str] | None
) -> None:
installation = app_service.installation_by_id(id=installation_id)
assert installation is not None, "Installation must exist to regenerate service hooks"
Member

nit: maybe we raise some integrity exception and add more context, i.e. installation_id, etc.
Another idea is to add a logger.info with this context (it becomes a breadcrumb in the Sentry error).
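A minimal sketch of that suggestion, replacing the bare assert with a logged, context-rich exception. The exception class, function name, and log key are all hypothetical, and the lookup is injected so the sketch stays self-contained:

```python
import logging

logger = logging.getLogger("sentry.sentry_apps.tasks")


class MissingInstallationError(Exception):
    """Hypothetical exception raised instead of a bare assert."""


def get_installation_or_raise(installation_id: int, lookup) -> object:
    # lookup stands in for app_service.installation_by_id.
    installation = lookup(installation_id)
    if installation is None:
        # logger.info leaves a breadcrumb carrying the id, so the resulting
        # Sentry error has the context the reviewer asks for.
        logger.info(
            "regenerate_service_hook.missing_installation",
            extra={"installation_id": installation_id},
        )
        raise MissingInstallationError(f"No installation found for id={installation_id}")
    return installation
```

Unlike an assert, this survives python -O, and the installation_id travels with both the breadcrumb and the exception message.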

@Christinarlong (Contributor, Author)

> can we add a metric to the task so we can monitor how often it is triggered, and create a redash dashboard/query to show us the burndown rate of the broken servicehooks?

I think we can piggyback off the general task metric to see how often the task is triggered.

As for a stat/visualization of the burndown:

  1. A counter that shows the number of installations whose ServiceHook events don't match those of their Sentry app counterpart - https://redash.getsentry.net/queries/9297/source#12084

Also, I was thinking we could track this via the open Sentry issues in the PR description, since if those go down we can correlate that with burning down the broken ServiceHooks.
