Skip to content

Incidents may get stuck open #935

@elfjes

Description

@elfjes

We have a couple of incidents that the source system cannot close (ie END event). When sending an END event, the event is happily accepted, and nothing happens. Upon further investigation, I tracked it to the following code in EventViewSet.perform_create

        try:
            self.validate_event_type_for_incident(event_type, incident)
        except serializers.ValidationError as e:
            # Allow any event from source systems (as long as the user is allowed to post the event type),
            # even if the posted type is invalid for the incident - like if an `INCIDENT_END` event
            # is sent after the incident has been manually closed
            if not user.is_source_system:
                raise e

In case of an error (ie The incident already has a related event of type 'END'.) the end event is created, but not applied. The associated exception is swallowed. It would be really nice if instead of just dropping the exception, at least a message would be logged.

The incidents that can't be ENDed currently have an end_time == infinity and should be able to be close. Apparently something went wrong during the first END event, although I haven't figured out what yet. Unfortunately our logs don't go back long enough to see what happened the first time the incident was supposed to be ENDed so I can't pinpoint what was the issue there

I think that sending an END event now should still close the incident. Maybe it's better to look at the current end time, and end the incident if end_time==infinity, regardless of whether an END event was previously created. (Perhaps the same holds true for START events, but I'm not sure). Now we've ended up (pun intended) in a situation where an incident cannot be closed by the source system.

So, my suggestions

  • log a message when an event is accepted but ignored
  • update the logic on when an end event is excecuted based on its current end_time

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Status

🏗 In progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions