-
Notifications
You must be signed in to change notification settings - Fork 925
OTEP: Extending attributes to support complex values #4485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Complex attributes: first round of feedback and clean up
Clarify metrics and cleanup
Simplify, move options to the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, LGTM. There are still some things that would be good to have clarified.
Co-authored-by: Trask Stalnaker <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left couple of small comments around Exemplars wording. looks good otherwise.
instrumentation scope, or metric attribute where equality may be | ||
extensively used to identify tracer/meter/logger or the time series. | ||
|
||
Equality of `KeyValueList`s MUST be unaffected by the ordering of their `KeyValue` pairs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems odd. I'm assuming KeyValueList
represents a heterogeneous array as defined in the glossary. Why wouldn't order be used in computation of equality? What is the difference between the data model of a heterogeneous array and a map at that point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back at the proto definitions, I likely am mistaken. The KeyValueList
represents a map not an heterogeneous array.
Can we either add proto definitions to the glossary or only use definitions defined here in the OTEP to avoid this type of confusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to
When
AnyValue
contains one or more lists of key-value pairs:
- the equality of
AnyValue
instances MUST NOT be affected by the ordering of
the key-value pairs within the list.- the equality behavior is unspecified if duplicate keys are present.
Still working on the added itAnyValue
definition
to updating comments like [this one](https://github.com/open-telemetry/opentelemetry-proto/blob/be5d58470429d0255ffdd49491f0815a3a63d6ef/opentelemetry/proto/trace/v1/trace.proto#L209-L213) | ||
and adding changelog record. | ||
|
||
## Trade-offs and mitigations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What seems to be lacking is a considered alternative ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hdost could you elaborate? what's the alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point here is that the solution seems to be leading the problem here. I am sure that the intentions are good, but I'd argue that adding this first class support for these complex attributes risks increasing the burden of comprehensability as well as introducing inefficiencies unnecessarily to the 99% case.
We're now making attributes not as universal creating caveats to when you should use certain types of attributes.
My alternative would be don't add this, and if desired add extensions to improve their support without altering the standard interface.
The prescient thing here is that there is no mentioned alternative meaning to any reader that there really hasn't been any alternatives considered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hdost the motivation is mentioned in the Why section.
The data we want to express is sometimes complex and flattening is not an option. Complex attributes are already in the spec, proto and are allowed on the logs API. Users already put text blobs serialized in all possible (sometimes invalid) ways.
If you can propose a reasonable alternative that allows to express complex data, I'd be happy to mention it here and consider changing the direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to your point Logs already carry and allow for this burden. Logs already have associated trace identifiers, so why then is it not sufficient to have users Log with these attributes? What is the advantage vs using logging?
span links, and as descriptive entity attributes. | ||
|
||
OTel API MAY support setting complex attributes on metrics, resources, | ||
instrumentation scope, span events, and as identifying entity attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the last week Spec call, we still have some concerns on this line
(along with "Exposing multiple types of attribute sets is NOT RECOMMENDED, such as having "ExtendedAttributes" in addition to "Attributes".).
@dyladan please correct me if I'm wrong, but the summary is that:
- we don't have a real use-case for complex attributes on entities (less important)
- it feels wrong to allow it and immediately tell users not to use it (more important)
I wanted to discuss a couple of points on the spec call tomorrow:
While we don't have a use-case in semantic conventions or instrumentations, it does not mean that end users don't have it or won't have it.
There are at least a few hypothetical use-cases where complex and structured data naturally appears out there in the wild and can be useful on metrics/entities
- Cloud resource information such as VM metadata
- container image manifests
- identity - some of service principal info can easily appear on entities / metrics and can be fairly complex
- rarely occurring metrics where cardinality is not a issue - e.g. GitHub PR info is full of nested objects
Complex data can appear on any signal and is discouraged in the same way on each of them. We're adding it to spans not because it's awesome, but because we need a way to record complex data that appears out there in the wild.
Complex data, especially if it's small, does not create much overhead over the same properties recorded as flat ones, but the bonus is that data on telemetry could match something well known defined outside of the telemetry.
…ents to implementations that allow using complex attrs on metrics, resources, entities, instr scope
Related to #4468, #4414
This OTEP suggests a path forward on how to support complex attributes. It proposes to:
Having experimental phase for complex attributes limits the impact of this change and gives backends a graceful period to update.