[chore] Non-normative guidance for status metrics #2472

braydonk · 2025-07-03T15:40:08Z

Changes

This PR adds non-normative guidance for designing status metrics. This is a common metric pattern that has confused contributors in the past, so this document exists to clear up some misconceptions and outline a unified design plan.

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
[N/A] Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
[N/A] schema-next.yaml updated with changes to existing conventions.

joaopgrassi

Thank you for putting this together!

joaopgrassi · 2025-07-04T06:59:23Z

docs/non-normative/status-metrics.md

+
+### Instrument
+
+The metric is instrumented as an `UpDownCounter` rather than a `Gauge`. This is


We should probably link to the recommendation on how to record consistent UpDownCounters: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/metrics.md#consistent-updowncounter-timeseries
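
For readers following the thread, here is a minimal sketch of what recording a status metric with an `UpDownCounter` could look like. It does not use the OpenTelemetry SDK: `UpDownCounterSketch` is a hypothetical stand-in that only keeps a running sum per attribute set, and the metric name `example.status`, the state values, and the helper names are assumptions made for illustration. It also assumes the pattern where every state value gets its own time series, holding 1 for the current state and 0 for the others.

```python
from collections import defaultdict


class UpDownCounterSketch:
    """Hypothetical stand-in for an UpDownCounter: one running sum per attribute set."""

    def __init__(self, name):
        self.name = name
        self.series = defaultdict(int)

    def add(self, delta, attributes):
        # Attribute sets identify a time series, so use them as the key.
        key = tuple(sorted(attributes.items()))
        self.series[key] += delta


# Assumed set of state values, for illustration only.
STATES = ("pending", "running", "stopped")


def init_states(counter, current):
    # Touch every state so each time series exists, even when its value is 0.
    for state in STATES:
        counter.add(1 if state == current else 0, {"example.state": state})


def transition(counter, old, new):
    # Leaving a state subtracts 1 from it; entering a state adds 1 to it.
    counter.add(-1, {"example.state": old})
    counter.add(1, {"example.state": new})


def snapshot(counter):
    # Map each state value to its current sum.
    return {dict(key)["example.state"]: value for key, value in counter.series.items()}


counter = UpDownCounterSketch("example.status")
init_states(counter, "pending")
transition(counter, "pending", "running")
print(snapshot(counter))  # {'pending': 0, 'running': 1, 'stopped': 0}
```

Because each transition is a paired `-1`/`+1`, the sums across all states for one entity stay at 1, which is what makes the values summable across entities.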

joaopgrassi · 2025-07-04T07:08:14Z

docs/non-normative/status-metrics.md

+be in Kubernetes, where the word "phase" equally represents both and there
+is no acceptable alternative. In this case, using a metric name suffix is
+recommended to avoid the naming clash, i.e. naming the attribute `k8s.phase` and
+the metric `k8s.phase.current`.


Looking at the existing conventions, I see some metrics whose name matches one of their attributes, which already goes against the recommendation here. For example:

semantic-conventions/model/k8s/metrics.yaml

Line 79 in 83b5544

metric_name: k8s.container.status.state

,

semantic-conventions/model/k8s/metrics.yaml

Line 92 in 83b5544

- id: metric.k8s.container.status.reason

Could those somehow be adapted so they don't clash, as proposed here? It would be unfortunate to publish this guidance while we ourselves don't follow it, unless we can give reasons.

I agree! We have several metrics of this type in K8s already:

semantic-conventions/model/k8s/metrics.yaml

Line 159 in 83b5544

- id: metric.k8s.node.condition.status

semantic-conventions/model/k8s/metrics.yaml

Line 604 in 83b5544

- id: metric.k8s.namespace.phase

semantic-conventions/model/k8s/metrics.yaml

Line 79 in 83b5544

metric_name: k8s.container.status.state

semantic-conventions/model/k8s/metrics.yaml

Line 92 in 83b5544

- id: metric.k8s.container.status.reason

I wonder if we should not enforce the wording here but only the modeling instead. Maybe using .state can just come as a generic suggestion rather than "stricter" guidance?

I can loosen the guidance here. But isn't it pretty problematic to have an attribute that is the exact same name as a metric? They are un-clashed in tooling by their Weaver identifiers, but couldn't this cause problems or at least confusion for people interpreting the conventions?

I haven't seen this confuse people so far (in PR reviews). But if people in general think it's an issue, I'd be fine changing it.

I was confused when reviewing some of the new k8s metrics that include a required attribute that exactly matches the metric name. I prefer the guidance suggested in the PR.

I'll propose it also makes translating to/from OTLP just a tiny bit harder.

When converting from the highly structured OTLP format to other, non-OTLP formats, being able to distinguish between a metric's name and the attributes of its datapoints is helpful. In OTLP we have the structure to distinguish between a metric's name and a datapoint attribute name, but in other formats we do not have that structure.

Specifically, having the values match makes it hard to use the metric's name as a key and the datapoint attribute name as a key in a flat structure.

OTLP of course allows metric names and datapoint attributes to match, and we would never change that, but in our semantic conventions I like requiring metric name and datapoint attribute name to be different. I think it provides the most flexibility for to/from translations without putting a big restriction on ourselves.
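
To make the flat-structure concern concrete, here is a small sketch. The `flatten` helper is hypothetical and deliberately naive; it is not how any particular exporter works. The attribute names and the `"active"` value are assumptions for illustration; `k8s.phase` and `k8s.phase.current` are the names suggested in the proposed guidance.

```python
def flatten(metric_name, value, attributes):
    """Naive flattening: the metric name and attribute names share one key space."""
    flat = {metric_name: value}
    for key, attr_value in attributes.items():
        # On a name clash this silently overwrites the metric's value.
        flat[key] = attr_value
    return flat


# Metric name and attribute name are identical: the datapoint value is lost.
clashing = flatten("k8s.namespace.phase", 1, {"k8s.namespace.phase": "active"})

# Distinct names: both the value and the attribute survive.
distinct = flatten("k8s.phase.current", 1, {"k8s.phase": "active"})

print(clashing)  # {'k8s.namespace.phase': 'active'}
print(distinct)  # {'k8s.phase.current': 1, 'k8s.phase': 'active'}
```

A real translator could work around the clash with prefixes or nesting, but keeping the names distinct in the conventions avoids needing the workaround at all.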

I have opened an issue to get clarification around this in general here: #2476

joaopgrassi · 2025-07-04T07:11:00Z

This can also close this old issue #199. CC @riverar

lmolkova · 2025-07-04T18:14:07Z

docs/non-normative/status-metrics.md

+aliases: [status-metrics]
+--->
+
+# State Metrics


wdyt about moving it to https://github.com/open-telemetry/semantic-conventions/tree/main/docs/non-normative/how-to-write-conventions ?

I believe @jsuereth wanted to consolidate all the guidance there and maybe we'll move it altogether to a different place (some of the meta-guidance is or could be normative)

lmolkova · 2025-07-04T18:15:11Z

docs/non-normative/status-metrics.md

+aliases: [status-metrics]
+--->
+
+# State Metrics


could you please add a link to it from https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/how-to-define-semantic-conventions.md#defining-metrics ?

lmolkova · 2025-07-04T18:15:27Z

docs/non-normative/status-metrics.md

+- [Design](#design)
+  - [Naming](#naming)
+  - [Instrument](#instrument)
+  - [Why not Resource Attributes?](#why-not-resource-attributes)


Suggested change

- [Why not Resource Attributes?](#why-not-resource-attributes)

- [Why not Entity Attributes?](#why-not-resource-attributes)

?

lmolkova · 2025-07-04T18:16:26Z

docs/non-normative/status-metrics.md

+- ref: example.state
+```
+
+### Naming


could you summarize it to one sentence and add it (with link to this doc) to https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/naming.md#metrics ?

lmolkova · 2025-07-04T18:17:02Z

docs/non-normative/status-metrics.md

+It is a common mistake to attach a "state" as a Resource Attribute. This is
+not recommended for two reasons:
+
+* Resource is intended to be immutable, thus adding an attribute like "state"


does it still apply considering entities and descriptive attributes being mutable?

Yes I was also thinking the same when I reviewed but wasn't sure. 🤔

riverar · 2025-07-15T19:34:28Z

This can also close this old issue #199. CC @riverar

Can you elaborate on this? I read the document and didn't really come away with any clarity with regards to my very specific desktop app monitoring scenario. Maybe I missed something?

[chore] Non-normative guidance for status metrics

e2eee01

braydonk requested review from a team as code owners July 3, 2025 15:40

github-project-automation bot added this to Semantic Conventions Triage Jul 3, 2025

github-project-automation bot moved this to Untriaged in Semantic Conventions Triage Jul 3, 2025

generate toc

f17bfde

joaopgrassi reviewed Jul 4, 2025

braydonk mentioned this pull request Jul 4, 2025

Clarify metrics containing an attribute sharing the same name #2476

Open

lmolkova reviewed Jul 4, 2025

ChrsMark mentioned this pull request Jul 10, 2025

Add k8s.pod.phase and k8s.pod.status.reason metrics #2488

Open

3 tasks


TylerHelmuth · Jul 14, 2025 · edited
