Feature Request: Support for node_pool Label in certain server metrics #25933

Open
@econsult-devops

Description

Proposal

Hi,
we are working on scaling the nodes in each node_pool based on blocked job evaluations, i.e. evaluations that cannot be placed because there are insufficient resources.
If scaling is based solely on resource usage thresholds (e.g., memory or CPU usage), it can fail to trigger for jobs with large resource requirements. Such jobs remain unschedulable even though the threshold metrics do not indicate a need to scale out.
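To make the failure mode concrete, here is a minimal sketch with made-up numbers. The `Node` class, the 80% threshold, and the job size are all hypothetical and only illustrate the arithmetic; they are not taken from our cluster or from Nomad itself.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one client node's memory (values in MB)."""
    memory_total_mb: int
    memory_used_mb: int


def pool_utilization(nodes):
    """Aggregate memory utilization across the pool, as a fraction of 1."""
    total = sum(n.memory_total_mb for n in nodes)
    used = sum(n.memory_used_mb for n in nodes)
    return used / total


def fits_on_any_node(nodes, job_memory_mb):
    """True if at least one node has enough free memory for the job."""
    return any(n.memory_total_mb - n.memory_used_mb >= job_memory_mb for n in nodes)


# Illustrative values only: three half-full 8 GB nodes and a 6 GB job.
nodes = [Node(8192, 4096), Node(8192, 4096), Node(8192, 4096)]
SCALE_OUT_THRESHOLD = 0.8  # assumed utilization threshold for scaling out
job_memory_mb = 6144

threshold_triggers = pool_utilization(nodes) >= SCALE_OUT_THRESHOLD
job_fits = fits_on_any_node(nodes, job_memory_mb)

print(f"utilization={pool_utilization(nodes):.2f} "
      f"threshold_triggers={threshold_triggers} job_fits={job_fits}")
```

In this scenario the pool is only half utilized, so a threshold-based policy never scales out, yet no single node has room for the job and its evaluation stays blocked.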

We would need the node_pool label on the following metrics:

nomad.nomad.blocked_evals.cpu
nomad.nomad.blocked_evals.memory

Without the node_pool label, we cannot determine which node_pool needs a scaling operation. We noticed that the node_pool attribute is not currently queried as part of the evaluation data, so supporting this feature may require some additional implementation work.
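As a rough illustration of how we would consume the metrics if the label existed, here is a sketch. Everything about the sample format is an assumption: the dict shape (`name`, `labels`, `value`) is a hypothetical stand-in for whatever the metrics backend returns, and the `node_pool` label is exactly the thing this issue requests, so it does not exist on these metrics today.

```python
from collections import defaultdict

# The two metrics named in this request, mapped to a short resource key.
BLOCKED_METRICS = {
    "nomad.nomad.blocked_evals.cpu": "cpu",
    "nomad.nomad.blocked_evals.memory": "memory",
}


def blocked_resources_by_pool(samples):
    """Sum blocked CPU/memory per node pool.

    `samples` is a hypothetical list of dicts such as
    {"name": ..., "labels": {"node_pool": ...}, "value": ...}.
    Samples without a node_pool label (today's behaviour) are skipped.
    """
    blocked = defaultdict(lambda: {"cpu": 0.0, "memory": 0.0})
    for sample in samples:
        resource = BLOCKED_METRICS.get(sample["name"])
        pool = sample.get("labels", {}).get("node_pool")  # proposed label
        if resource is None or pool is None:
            continue
        blocked[pool][resource] += sample["value"]
    return dict(blocked)


def pools_needing_scale_out(samples):
    """Return the sorted names of pools with any blocked CPU or memory."""
    by_pool = blocked_resources_by_pool(samples)
    return sorted(
        pool for pool, res in by_pool.items()
        if res["cpu"] > 0 or res["memory"] > 0
    )


# Made-up samples for illustration only.
example_samples = [
    {"name": "nomad.nomad.blocked_evals.cpu", "labels": {"node_pool": "gpu"}, "value": 4000},
    {"name": "nomad.nomad.blocked_evals.memory", "labels": {"node_pool": "gpu"}, "value": 16384},
    {"name": "nomad.nomad.blocked_evals.memory", "labels": {"node_pool": "default"}, "value": 0},
    {"name": "nomad.nomad.blocked_evals.cpu", "labels": {}, "value": 500},
]

print(pools_needing_scale_out(example_samples))
```

The last sample has no `node_pool` label, which is the situation we are in today: the blocked CPU is visible, but it cannot be attributed to any pool and so cannot drive a per-pool scaling decision.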

Metadata

Projects

Status

Needs Roadmapping
