Skip to content

Initialize hosting trace with OTEL tags for sampling #62090

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Aug 19, 2025

Conversation

rkargMsft
Copy link
Contributor

@rkargMsft rkargMsft commented May 23, 2025

Feature for adding tags at Activity creation time

  • You've read the Contributor Guide and Code of Conduct.
  • You've included unit or integration tests for your change, where applicable.
  • You've included inline docs for your change, where applicable.
  • There's an open issue for the PR that you are making. If you'd like to propose a new feature or change, please open an issue to discuss the change or find an existing issue.

Summary of the changes

Adding server.address and server.port tags at Activity creation. This is to support using those tags when making sampling decisions as only tags provided in the CreateActivity/StartActivity are available at that time.

Description

OpenTelemetry spec also says that http.request.method SHOULD be added but not including that in this PR due to the extra logic needed to compute that value.

Fixes #50488

@github-actions github-actions bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label May 23, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 23, 2025
@BrennanConroy BrennanConroy requested a review from JamesNK May 23, 2025 19:20
@rkargMsft
Copy link
Contributor Author

It may make sense to actually have the ActivityCreationTags property be IEnumerable<KeyValuePair<string, object?>>? which is what ActivityCreator takes as a parameter and then let the implementation decide what to return.

public static Activity? CreateFromRemote(
ActivitySource activitySource,
DistributedContextPropagator propagator,
object distributedContextCarrier,
DistributedContextPropagator.PropagatorGetterCallback propagatorGetter,
string activityName,
ActivityKind kind,
IEnumerable<KeyValuePair<string, object?>>? tags,
IEnumerable<ActivityLink>? links,
bool diagnosticsOrLoggingEnabled)

@JamesNK
Copy link
Member

JamesNK commented May 26, 2025

What's the scenario where you want to set activity tags on creation? This seems like kind of niche feature and I'd prefer to see whether the problem can be solved with existing APIs.

For example, there is a feature to get the current request activity, IHttpActivityFeature, and there are many events during the corse of a HTTP request. Could you use one of those existing events, get the activity using IHttpActivityFeature and then set the tags?

@JamesNK
Copy link
Member

JamesNK commented May 26, 2025

This is to support using those tags when making sampling decisions as only tags provided in the CreateActivity/StartActivity are available at that time.

Is the scenario that you set a feature with tags earlier in the request. Tags could be different depending on the request.

An activity is created very early in a request so the number of places to set it would be limited. Can you give an example of this being used? Also, what does the activity source do with the tags?

@rkargMsft
Copy link
Contributor Author

I'm specifically using YARP as a reverse proxy for a large number of services. There are calls for many different host names. I want to make tracing sampling decision for those calls and would like to have separate buckets for each hostname. For example, in a given time window, I want a minimum number of traces to always be sampled and then start doing probabilistic sampling after that.

As it is now, I can only get the base operation name (just HTTP <METHOD> at the time of sampling) and no tags. This means that if I have one very high throughput service and one very low throughput service then they're all in the same bucket for sampling (there's not host name or other data in tags yet to differentiate requests) and probabilistically, the low throughput service basically never gets sampled.

The only way to get tags to be available at the time of making the sampling decision is for them to be passed in to the CreateActivity or StartActivity call and the existing code hard codes that to tags:null.

The flow is:

  • Create/Start Activity
    • Sampling logic determines if an Activity is returned or null based on only the parameters passed in on Create/Start
  • If non-null (because sampling decided to keep the span) then additional tags or modifications can be made

I don't know of any other way to get information available to the sampler than to pass the tags in on Create/StartActivity.

This is extra work for every request, even those that end up not sampled, which is why it does nothing by default and would require a user to specifically opt in with their own implementation for specifically the data points they need in their particular scenario. For example, the vast majority of our services would not use this functionality; only our YARP usage would define an implementation to just add the host/server.address tag.

@rkargMsft
Copy link
Contributor Author

The OpenTelemetry spec has this SHOULD requirement for tags available at span creation time:

The following attributes can be important for making sampling decisions and SHOULD be provided at span creation time (if provided at all):

http.request.method
server.address
server.port
url.full

It's not free to add those so making it possible but explicitly opt-in seems like a good change.

@JamesNK
Copy link
Member

JamesNK commented May 27, 2025

Right now tracing in ASP.NET Core doesn't follow the OTEL standard. When we do then those tags should be set immediately. It sounds like server address and port are what you want to use when making sampling decisions, right?

The performance cost shouldn't be a concern. We would only call ActivitySource if someone is listening to the activity, so people who care about perf and aren't listening won't be impacted.

@rkargMsft
Copy link
Contributor Author

For our specific use case we would only use server.address.

Performance impact of this change would be initializing the feature if it's registered and doing the feature lookup if there's a listener:

Listener Feature Registered Change
No No -
Yes No Feature lookup
No Yes Feature initialization logic (implementation specific)
Yes Yes Feature initialization logic (implementation specific) + Feature lookup

@dotnet-policy-service dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Jun 3, 2025
@rkargMsft
Copy link
Contributor Author

/azp run

Copy link

Commenter does not have sufficient privileges for PR 62090 in repo dotnet/aspnetcore

@rkargMsft
Copy link
Contributor Author

What are next steps on this to keep it from going stale?

@rkargMsft
Copy link
Contributor Author

@JamesNK This continues to be a problem for us, preventing us from getting useful trace sampling on our YARP load balancer. I don't see any workaround we can do without this change or something else that updates how the Activity is created internally.

Is there anything else that can help with this this considered for the .NET 10 timeframe?

@ReubenBond
Copy link
Member

Summarizing my understanding:

  • Applications need trace sampling to reduce the trace volume to a manageable level,
  • The sampling decision often relies on information which should be available in tags,
  • ASP.NET Core sets tags using SetTags, after span creation,
  • ... but sampling occurs at span creation time,
  • ... so no tags are available to make a sampling decision on,
  • This PR changes that by adding a new feature which lets middleware propagate tags to spans at creation time

Does that sound correct, @rkargMsft?

It sounds like a compelling scenario. This is for a service which high enough traffic volume that simple probabilistic sampling is causing high-volume endpoints from drowning out the traffic of low-volume endpoints to the point where the latter do not get sampled.

One follow-on question is, how should tags set at creation time interact with tags set after creation time (i.e, what is already there)?

@rkargMsft
Copy link
Contributor Author

That sounds like an accurate summary.

The existing tags added after Activity creation are made with AddTag

activity.AddTag(tag.Key, tag.Value);

so if there's a Feature registered that sets a tag (like server.address) then that will not get overwritten after sampling. It will remain with whatever value the Feature populated it with.

That seems like reasonable behavior as long as it's documented.

@rkargMsft
Copy link
Contributor Author

An other option to consider could be hard coding an opinionated set of tags at Activity creation (like server.address and server.port) and always add those at Activity creation time.

The downsides of that are

  • this cost gets paid by everyone, regardless of whether they use tags for sampling or not. This gets paid for every Activity, even the ones that eventually get sampled out. It's unclear what a good way to opt-out would be that is better than this proposed PR would offer (with this PR being strictly opt-in)
  • this set of tags may not satisfy all use cases so an extension point may still be needed anyway

@JamesNK
Copy link
Member

JamesNK commented Jun 25, 2025

I don't want to add a new feature. I'd prefer that we followed what the standard proposes and makes the here about including some tags at creation time.

The problem is there is a lot of the standard we don't follow around traces, and I doubt we'll have time in .NET 10 to switch to following it.

@rkargMsft
Copy link
Contributor Author

If the cost of setting those tags (which should be minimal) is acceptable then those could get added.
PR updated to reflect the below proposal.

Would an incremental step be adding just:

  • server.address
  • server.port
    to the tags at creation time? Those are both properties already available from the HttpContext that's passed in as an argument to HostingApplicationDiagnostics.StartActivity.

Basically replace the Feature implementation in this PR with this:

        var tagsForCreation = new TagList();
        if (_activitySource.HasListeners() && _context.Request.Host.HasValue)
        {
            var (host, port) = _context.Request.Host.HostAndPort;
            tags.Add("server.address", host);
            if (port is not null)
            {
                tags.Add("server.port", port);
            }
            else if (string.Equals("https", httpContext.Request.Scheme, StringComparison.OrdinalIgnoreCase))
            {
                tags.Add("server.port", 443);
            }
            else if (string.Equals("http", httpContext.Request.Scheme, StringComparison.OrdinalIgnoreCase))
            {
                tags.Add("server.port", 80);
            }
        }

Setting the http.request.method would require calling HostingMetrics.ResolveHttpMethod which is currently private.

url.full is only defined in the HTTP Client span tags so that wouldn't apply here.

If just host/port get added then it shouldn't be a breaking change later when the full OTel compliance gets implemented (since it would just be adding the method tag with the resolved value).

There's also the opportunity to optimize the .Host and .Port access on HostString as there isn't a way to get both of those items in one call even though the internal implementation parses both of those string when you ask for either. There's an existing GetHostAndPort usage in HostMatcherPolicy that could benefit from this as well as the current use case. A (string host, int? port) HostString.HostAndPort property would remove redundant searching for splitting the host/port.

@rkargMsft
Copy link
Contributor Author

The internal HostAndPort property could be moved to a separate PR if desired.

@rkargMsft
Copy link
Contributor Author

Is this direction acceptable?
If full OTel compliance is the eventual goal then this is an incremental step in that direction that wouldn't need to be changed in the future and also enables our use case (tracing sampling requests through YARP).

@BrennanConroy
Copy link
Member

This seems like a good step forward. James is more aware of the OTel space so I'll leave the final call to @JamesNK.

@rkargMsft
Copy link
Contributor Author

What's the next step to bring this to a resolution?

@JamesNK JamesNK force-pushed the http-activity-tags branch from 7c5e970 to 9a45fbf Compare August 15, 2025 09:17
@JamesNK JamesNK changed the title Feature for setting activity tags on creation Initialize hosting trace with OTEL tags for sampling Aug 15, 2025
@JamesNK JamesNK added area-hosting Includes Hosting and removed area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun labels Aug 15, 2025
@JamesNK
Copy link
Member

JamesNK commented Aug 15, 2025

@rkargMsft I took a look at your PR, how it interacts with OTEL libraries, and how it aligns with our long term goal to set OTEL attributes on this activity by default. End result: I think populating these OTEL attributes on activity creation is a good approach.

I don't think it will cause interop problems with OTEL libraries that add tags to the activity, but some more testing during RC period is required. So we might need to revert this (or default it to off and enable with app context switch) if problems come up.

I've made some improvements to your PR. Adding more recommended created attributes, small perf optimizations, adding tests, adding an opt out switch incase this creates problems for anyone.

Are you ok with these changes?

Comment on lines +29 to +39
private static readonly FrozenDictionary<string, string> KnownHttpMethods = FrozenDictionary.ToFrozenDictionary([
KeyValuePair.Create(HttpMethods.Connect, HttpMethods.Connect),
KeyValuePair.Create(HttpMethods.Delete, HttpMethods.Delete),
KeyValuePair.Create(HttpMethods.Get, HttpMethods.Get),
KeyValuePair.Create(HttpMethods.Head, HttpMethods.Head),
KeyValuePair.Create(HttpMethods.Options, HttpMethods.Options),
KeyValuePair.Create(HttpMethods.Patch, HttpMethods.Patch),
KeyValuePair.Create(HttpMethods.Post, HttpMethods.Post),
KeyValuePair.Create(HttpMethods.Put, HttpMethods.Put),
KeyValuePair.Create(HttpMethods.Trace, HttpMethods.Trace)
], StringComparer.OrdinalIgnoreCase);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add QUERY given #63260 went in yesterday?

Copy link
Member

@JamesNK JamesNK Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is based on the known values list at https://opentelemetry.io/docs/specs/semconv/registry/attributes/http

image

You're welcome to create an issue with OTEL folks to add QUERY to the standard. Then we'll add it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just took at quick look at the table at the top and because it didn't have everything, just assumed that it was more open:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkargMsft
Copy link
Contributor Author

This looks good. We can also run the RCs and provide feedback.

@JamesNK
Copy link
Member

JamesNK commented Aug 15, 2025

In case this change causes problems or we decide that it's undesirable to partially support OTEL spec, @rkargMsft what do you think of defaulting it to off and using the AppContext switch to turn it on?

i.e. in .NET 10 by default there are no creation tags and to get them you would add AppContext.SetSwitch("Microsoft.AspNetCore.Hosting.SuppressActivityOpenTelemetryData", false) to your ASP.NET Core app startup code. Later in .NET 11 we would fully support OTEL HTTP spec. It would be enabled by default and you could remove the switch. Would that be acceptable in your app(s)?

@rkargMsft
Copy link
Contributor Author

That should work just fine.

@JamesNK JamesNK merged commit 49bd4c1 into dotnet:main Aug 19, 2025
29 checks passed
@JamesNK
Copy link
Member

JamesNK commented Aug 19, 2025

/backport to release/10.0-rc1

Copy link
Contributor

Started backporting to release/10.0-rc1: https://github.com/dotnet/aspnetcore/actions/runs/17083435926

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-hosting Includes Hosting community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow additional Tags for activity creation
6 participants