oteps/4333-recording-exceptions-on-logs.md
Lines changed: 89 additions & 79 deletions
Original file line number
Diff line number
Diff line change
@@ -12,11 +12,11 @@ Exceptions recorded on logs have the following advantages over span events:
12
12
- they can have different severity levels to reflect how critical the exception is
13
13
- they are already reported natively by many frameworks and libraries
14
14
15
-
Recording exception on logs is essential for troubleshooting. But regardless of how they are recorded, they could be noisy:
15
+
Recording exceptions is essential for troubleshooting. Regardless of how exceptions are recorded, they could be noisy:
16
16
- distributed applications experience transient errors at a rate proportional to their scale, and exceptions in logs could be misleading -
17
17
individual occurrences of transient errors are not necessarily indicative of a problem.
18
18
- exception stack traces can be huge. The corresponding attribute value can frequently reach several KB, resulting in high costs
19
-
associated with ingesting and storing such logs. It's also common to log exceptions multiple times while they bubble up
19
+
associated with ingesting and storing them. It's also common to log exceptions multiple times while they bubble up
20
20
leading to duplication and aggravating the verbosity problem.
21
21
22
22
In this OTEP, we'll provide guidance around recording exceptions that minimizes duplication, allows reducing noise through configuration and
@@ -29,37 +29,29 @@ starting point, but they are encouraged to adjust it to their needs.
29
29
30
30
This guidance boils down to the following:
31
31
32
-
- we should record full exception details including stack traces only for unhandled exceptions (by default).
33
-
- we should log error details and context when the error happens. These records should not include
34
-
exception stack traces unless this exception is unhandled.
35
-
- we should avoid logging the same error multiple times as it propagates up through the stack.
36
-
- we should log errors with appropriate severity ranging from `Trace` to `Fatal`.
32
+
Instrumentations should record exception information (along with other context) on the log record and
33
+
use an appropriate severity: only unhandled exceptions should be recorded as `Error` or higher. They
34
+
should strive to report each exception once.
37
35
38
-
> [!NOTE]
39
-
>
40
-
> Based on this guidance non-native instrumentations should record exceptions in top-level instrumentations only (#2 in [Details](#details))
41
-
42
-
> [!Important]
43
-
>
44
-
> OTel should provide APIs like `setException` when creating log record that will record only necessary information depending
45
-
> on the configuration and log severity. See [API changes](#api-changes) for the details.
36
+
Instrumentation should provide the whole exception instance to OTel (instead of individual attributes)
37
+
and the OTel SDK should, based on user configuration, decide which information to record. As a default,
38
+
this OTEP proposes to record exception stack traces on logs with `Error` or higher severity.
46
39
47
40
### Details
48
41
49
-
1. Exceptions should be recorded as[logs](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/exceptions/exceptions-logs.md)
42
+
1. Exceptions should be recorded on [logs](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/exceptions/exceptions-logs.md)
50
43
or [log-based events](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/events.md)
51
44
52
45
2. Instrumentations for incoming requests, message processing, background job execution, or others that wrap user code and usually create local root spans, should record logs
53
-
for unhandled exceptions with `Error` severity and [`exception.escaped`](https://github.com/open-telemetry/semantic-conventions/blob/v1.29.0/docs/attributes-registry/exception.md) flag set to `true`.
46
+
for unhandled exceptions with `Error` severity.
54
47
55
-
<!-- TODO: do we need an `exception.unhandled` attribute instead of `exception.escaped`? -->
56
48
Some runtimes and frameworks provide a global exception handler that can be used to record exception logs. Priority should be given to the instrumentation point where the operation context is available.
57
49
58
-
3. It's recommended to record exception stack traces only for unhandled exceptions in cases outlined in #2 above.
50
+
3. Native instrumentations should record a log describing an error and the context it happened in
51
+
when this error is detected (or where the most context is available).
59
52
60
-
4. Native instrumentations should record log describing an error and the context it happened in
61
-
when this error is detected. Corresponding log record should not contain exception stack
62
-
traces (if an exception was thrown/caught) unless such exceptions usually remain unhandled.
53
+
4. It's not recommended to record the same error repeatedly as it propagates up the call stack or
54
+
attach the same exception instance to multiple log records.
63
55
64
56
5. An error should be logged with appropriate severity depending on the available context.
65
57
@@ -68,15 +60,25 @@ This guidance boils down to the following:
68
60
- Unhandled exceptions that don't result in application shutdown should be recorded with severity `Error`
69
61
- Errors that result in application shutdown should be recorded with severity `Fatal`
70
62
71
-
6. Instrumentations should not log errors or exceptions that are handled or
72
-
are propagated as is, except ones handled in global exception handlers (see #2 below)
63
+
6. When recording exceptions on logs, user applications and instrumentations are encouraged to add attributes
64
+
to describe the context that the exception was thrown in.
65
+
They are also encouraged to define their own error events and enrich them with exception details.
73
66
74
-
If a new exception is created based on the original one or a new details about the error become available,
75
-
instrumentation may record another error (without stack trace)
67
+
7. The OTel SDK should record stack traces on exceptions with severity `Error` or higher and should allow users to
68
+
change the threshold.
76
69
77
-
7. When recording exception on logs, user applications and instrumentations are encouraged to put additional attributes
78
-
to describe the context that the exception was thrown in.
79
-
They are also encouraged to define their own error events and enrich them with `exception.*` attributes.
70
+
See [logback exception config](https://logback.qos.ch/manual/layouts.html#ex) for an example of configuration that
71
+
records stack trace conditionally.
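
One way to follow item 4 above (avoid attaching the same exception instance to multiple log records) is to remember which instances have already been reported. The following is only a minimal sketch: `ReportedExceptions` and `markReported` are hypothetical names introduced for illustration and are not part of any OTel API or of this OTEP.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

/** Hypothetical helper (not an OTel API) that remembers already-reported exception instances. */
class ReportedExceptions {
    // Weak keys let reported exceptions be garbage collected once nothing else references them.
    // Throwable itself does not override equals/hashCode, so membership is by instance
    // unless a subclass overrides them.
    private final Set<Throwable> reported =
        Collections.synchronizedSet(
            Collections.newSetFromMap(new WeakHashMap<Throwable, Boolean>()));

    /** Returns true only the first time a given exception instance is passed in. */
    boolean markReported(Throwable t) {
        return reported.add(t);
    }

    public static void main(String[] args) {
        ReportedExceptions tracker = new ReportedExceptions();
        RuntimeException failure = new RuntimeException("connection reset");

        // Inner layer: most context is available here, so this layer records the exception.
        if (tracker.markReported(failure)) {
            System.out.println("inner layer: recording exception on a log record");
        }
        // Outer layer: same instance bubbling up, so it is not attached to a log again.
        if (!tracker.markReported(failure)) {
            System.out.println("outer layer: already reported, skipping exception details");
        }
    }
}
```

This sketch does not detect a new exception that wraps the original one; an instrumentation that needs that would have to walk the cause chain as well.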
72
+
73
+
74
+
> [!NOTE]
75
+
>
76
+
> Based on this guidance, non-native instrumentations should record exceptions in top-level instrumentations only (#2 in [Details](#details))
77
+
78
+
> [!Important]
79
+
>
80
+
> OTel should provide an API like `setException` for use when creating a log record; it will record only the necessary information depending
81
+
> on the configuration and log severity. See [API changes](#api-changes) for the details.
80
82
81
83
## API changes
82
84
@@ -85,8 +87,8 @@ Library may write logs providing exception instance through a log bridge and not
85
87
86
88
Some vendors/apps may also want to record all the exception details.
87
89
88
-
OTel Logs API should provide additional methods that enrich log record with exception details such as
89
-
`setException(exception)`(`setUnhandledException`, etc), similar to [RecordException](../specification/trace/api.md?plain=1#L682)
90
+
The OTel Logs API should provide methods that enrich log records with exception details, such as
91
+
`setException(exception)` similar to [RecordException](../specification/trace/api.md?plain=1#L682)
90
92
method on span.
91
93
92
94
OTel SDK should implement such methods and set exception attributes based on configuration
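
To make the split of responsibilities concrete, here is a minimal sketch of how an SDK-side builder might decide which exception details to record. It is not the OTel Logs API: `ExceptionLogSketch`, `LogRecordBuilder`, the `Severity` enum, and the `stackTraceThreshold` setting are assumptions made for illustration. Only the `exception.type`, `exception.message`, and `exception.stacktrace` attribute names come from the exception semantic conventions.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of setException behavior; not the real OTel API or SDK. */
class ExceptionLogSketch {
    /** Simplified severity levels, ordered from least to most severe. */
    enum Severity { TRACE, DEBUG, INFO, WARN, ERROR, FATAL }

    /** Hypothetical builder: the caller passes the whole exception, the builder picks what to keep. */
    static final class LogRecordBuilder {
        private final Severity stackTraceThreshold; // assumed to be user-configurable
        private Severity severity = Severity.INFO;
        private Throwable exception;

        LogRecordBuilder(Severity stackTraceThreshold) {
            this.stackTraceThreshold = stackTraceThreshold;
        }

        LogRecordBuilder setSeverity(Severity severity) {
            this.severity = severity;
            return this;
        }

        LogRecordBuilder setException(Throwable exception) {
            this.exception = exception;
            return this;
        }

        /** Returns the exception attributes that would be attached to the log record. */
        Map<String, String> build() {
            Map<String, String> attributes = new LinkedHashMap<>();
            if (exception == null) {
                return attributes;
            }
            attributes.put("exception.type", exception.getClass().getName());
            if (exception.getMessage() != null) {
                attributes.put("exception.message", exception.getMessage());
            }
            // The stack trace is recorded only at or above the configured threshold.
            if (severity.compareTo(stackTraceThreshold) >= 0) {
                StringWriter out = new StringWriter();
                exception.printStackTrace(new PrintWriter(out, true));
                attributes.put("exception.stacktrace", out.toString());
            }
            return attributes;
        }
    }

    public static void main(String[] args) {
        Throwable failure = new IllegalStateException("boom");
        Map<String, String> warn = new LogRecordBuilder(Severity.ERROR)
            .setSeverity(Severity.WARN).setException(failure).build();
        Map<String, String> error = new LogRecordBuilder(Severity.ERROR)
            .setSeverity(Severity.ERROR).setException(failure).build();
        System.out.println("WARN record attributes: " + warn.keySet());
        System.out.println("ERROR record attributes: " + error.keySet());
    }
}
```

Because the instrumentation hands over the exception instance rather than individual attributes, the threshold can be changed through configuration without touching the call sites.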
@@ -108,27 +110,22 @@ try {
108
110
// we don't record exception here, but may record a log record without exception info