TBS: Add reporting to improve troubleshooting for common issues #18084

@isaacaflores2

Description

Background

Several common issues prevent tail-based sampling (TBS) from performing as expected. Additional reporting and observability would let users troubleshoot these issues on their own. Common issues:

  • storage limit reached
  • traces missing a root transaction

Potential Solutions

Metrics

  • Expose a metric to track TBS-related errors. This can be a simple counter with an error label, which would cover the "storage limit reached" error.

  • Expose a metric that observes sampling decisions. The idea is to surface a metric that shows users when traces are missing a root transaction. This could be a counter that tracks each time a transaction group is sampled, or a metric that tracks valid unsampled traces that have a root transaction.

    • The exact metric may depend on the current TBS implementation. We should explore possible solutions as part of this issue.
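
To make the two metric ideas above concrete, here is a minimal sketch of what the counters could look like. It is not the existing TBS implementation: the `TBSMetrics` type, its methods, and the label values (`storage_limit_reached`, `sampled`, `unsampled`, `missing_root`) are all hypothetical names chosen for illustration, and it assumes the processor can tell when a trace is discarded without ever having seen a root transaction. A real change would presumably hook into whatever metrics library the server already uses rather than an in-memory map.

```go
package main

import "sync"

// Hypothetical label values; none of these are existing apm-server metric names.
const (
	ErrStorageLimitReached = "storage_limit_reached"
	DecisionSampled        = "sampled"
	DecisionUnsampled      = "unsampled"
	DecisionMissingRoot    = "missing_root"
)

// TBSMetrics is a hypothetical in-memory stand-in for two labeled counters:
// one for TBS errors and one for sampling decisions.
type TBSMetrics struct {
	mu        sync.Mutex
	errors    map[string]int64 // keyed by error label
	decisions map[string]int64 // keyed by decision label
}

// NewTBSMetrics returns an empty set of counters.
func NewTBSMetrics() *TBSMetrics {
	return &TBSMetrics{
		errors:    make(map[string]int64),
		decisions: make(map[string]int64),
	}
}

// RecordError increments the error counter for the given label,
// for example ErrStorageLimitReached.
func (m *TBSMetrics) RecordError(label string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.errors[label]++
}

// RecordDecision counts a sampling decision made for a trace
// whose root transaction was seen.
func (m *TBSMetrics) RecordDecision(sampled bool) {
	label := DecisionUnsampled
	if sampled {
		label = DecisionSampled
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.decisions[label]++
}

// RecordMissingRoot counts a trace that was discarded without a root
// transaction ever arriving (assumes the processor can detect this case).
func (m *TBSMetrics) RecordMissingRoot() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.decisions[DecisionMissingRoot]++
}

// ErrorCount returns the current value of the error counter for a label.
func (m *TBSMetrics) ErrorCount(label string) int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.errors[label]
}

// DecisionCount returns the current value of the decision counter for a label.
func (m *TBSMetrics) DecisionCount(label string) int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.decisions[label]
}
```

With counters shaped like this, a user could compare `missing_root` against `sampled` plus `unsampled` to see whether traces are arriving without a root transaction, and alert on any non-zero `storage_limit_reached` count. Whether a single decision counter or separate metrics works better is one of the things to explore as part of this issue.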
