scheduler: basic cluster reconciler safety properties for batch jobs #26172
Conversation
// We can never have more placements than the count
if len(result.Place) > tg.Count {
	result.Place = result.Place[:tg.Count]
	result.DesiredTGUpdates[tg.Name].Place = uint64(tg.Count)
}
Test case that discovered this:
- group count 1
- alloc with desired=evict and client=lost (alloc was lost and missing heartbeat)
- alloc with desired=run and client=failed
- we get stop=2, place=2 with both placements being for the failed alloc
This patch is a bit of a sledgehammer for this test case, but it seems like an obvious property to assert as well. TBD
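To make the property concrete, here's a minimal, self-contained sketch of the cap being applied to that scenario. The types and names below are simplified stand-ins for illustration, not the reconciler's real structs:

package main

import "fmt"

// Simplified stand-ins for the reconciler's per-group result; these are
// illustrative names, not Nomad's actual types.
type placement struct{ name string }

type groupResult struct {
	Place        []placement
	DesiredPlace uint64 // mirrors the per-group annotation counter
}

// capPlacements enforces the safety property under discussion:
// a group never gets more placements than its count.
func capPlacements(res *groupResult, count int) {
	if len(res.Place) > count {
		res.Place = res.Place[:count]
		res.DesiredPlace = uint64(count)
	}
}

func main() {
	// The scenario from the test case: count=1, but both the lost alloc and
	// the failed alloc produce a replacement, so we start with place=2.
	res := &groupResult{
		Place:        []placement{{"my-job.web[0]"}, {"my-job.web[0]"}},
		DesiredPlace: 2,
	}
	capPlacements(res, 1)
	fmt.Println(len(res.Place), res.DesiredPlace) // 1 1
}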
The following test shows that this is a problem only in the reconciler and not the scheduler overall:
func TestServiceSched_PlacementOvercount(t *testing.T) {
	ci.Parallel(t)
	h := tests.NewHarness(t)

	lostNode := mock.Node()
	lostNode.Status = structs.NodeStatusDown
	must.NoError(t, h.State.UpsertNode(structs.MsgTypeTestSetup, h.NextIndex(), lostNode))

	node := mock.Node()
	must.NoError(t, h.State.UpsertNode(structs.MsgTypeTestSetup, h.NextIndex(), node))

	job := mock.Job()
	job.TaskGroups[0].Count = 1
	must.NoError(t, h.State.UpsertJob(structs.MsgTypeTestSetup, h.NextIndex(), nil, job))

	lostAlloc := mock.AllocForNode(lostNode)
	lostAlloc.Job = job
	lostAlloc.JobID = job.ID
	lostAlloc.Name = "my-job.web[0]"
	lostAlloc.DesiredStatus = structs.AllocDesiredStatusEvict
	lostAlloc.ClientStatus = structs.AllocClientStatusLost

	failedAlloc := mock.AllocForNode(node)
	failedAlloc.Job = job
	failedAlloc.JobID = job.ID
	failedAlloc.Name = "my-job.web[0]"
	failedAlloc.DesiredStatus = structs.AllocDesiredStatusRun
	failedAlloc.ClientStatus = structs.AllocClientStatusFailed

	allocs := []*structs.Allocation{lostAlloc, failedAlloc}
	must.NoError(t, h.State.UpsertAllocs(structs.MsgTypeTestSetup, h.NextIndex(), allocs))

	eval := &structs.Evaluation{
		Namespace:    structs.DefaultNamespace,
		ID:           uuid.Generate(),
		Priority:     job.Priority,
		TriggeredBy:  structs.EvalTriggerAllocStop,
		JobID:        job.ID,
		Status:       structs.EvalStatusPending,
		AnnotatePlan: true,
	}
	must.NoError(t, h.State.UpsertEvals(
		structs.MsgTypeTestSetup, h.NextIndex(), []*structs.Evaluation{eval}))

	err := h.Process(NewServiceScheduler, eval)
	must.NoError(t, err)

	must.Len(t, 1, h.Plans)
	must.Eq(t, 1, h.Plans[0].Annotations.DesiredTGUpdates["web"].Place)
	must.Eq(t, 1, h.Plans[0].Annotations.DesiredTGUpdates["web"].Stop)
}
However, the check we're doing here isn't correct either, because `result.Place` holds the results from all task groups. We need to check for the specific task group so that we don't drop `result.Place` entries belonging to other task groups. I'll pull this out to its own PR.
> However, the check we're doing here isn't correct either, because `result.Place` is the results from all task groups.

Whoops, this isn't true either! This is `computeGroup`, so `result.Place` here is per-group. It'll get merged later with the results for the other groups.
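For anyone following along, a toy model of that merge step (purely illustrative; the types and function names below are made up and not Nomad's actual internals): each group's `computeGroup` output gets appended into one combined result, so a cap applied inside `computeGroup` can only ever drop that group's own placements.

package main

import "fmt"

// Illustrative per-group result; not Nomad's actual types.
type groupResult struct {
	Place []string // allocation names to place for one task group
	Stop  []string // allocation names to stop for one task group
}

// mergeResults folds per-group results into a single combined result,
// modelling the idea that computeGroup runs once per task group and the
// outputs are combined afterwards.
func mergeResults(perGroup map[string]groupResult) groupResult {
	var merged groupResult
	for _, r := range perGroup {
		merged.Place = append(merged.Place, r.Place...)
		merged.Stop = append(merged.Stop, r.Stop...)
	}
	return merged
}

func main() {
	out := mergeResults(map[string]groupResult{
		"web":   {Place: []string{"my-job.web[0]"}},
		"cache": {Place: []string{"my-job.cache[0]"}, Stop: []string{"my-job.cache[1]"}},
	})
	fmt.Println(len(out.Place), len(out.Stop)) // 2 1
}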
if int(tgUpdates.Place) > tgCount {
	t.Fatal("group placements should never exceed group count")
}
if int(tgUpdates.DestructiveUpdate) > tgCount {
	t.Fatal("destructive updates should never exceed group count")
}
Makes me wonder: shouldn't these also hold for service jobs?
Yeah, I assume there's going to be a whole lot of overlap between the properties we want to test in our two PRs. Once we've got a set we're mostly happy with, we can refactor the tests to pull out the common set.
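If we do end up sharing these, a helper along these lines could hold the common assertions. The package, helper name, and struct below are hypothetical; the real tests would assert against the `structs.DesiredUpdates` counters from the plan annotations:

package reconcilertest

import "testing"

// desiredUpdates is a hypothetical stand-in for the per-group counters the
// assertions care about (the real tests would use structs.DesiredUpdates).
type desiredUpdates struct {
	Place             uint64
	DestructiveUpdate uint64
}

// assertGroupSafety holds the safety properties that should apply to both
// batch and service jobs: neither placements nor destructive updates may
// exceed the group's count.
func assertGroupSafety(t *testing.T, tgUpdates desiredUpdates, tgCount int) {
	t.Helper()
	if int(tgUpdates.Place) > tgCount {
		t.Fatal("group placements should never exceed group count")
	}
	if int(tgUpdates.DestructiveUpdate) > tgCount {
		t.Fatal("destructive updates should never exceed group count")
	}
}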
To help break down the larger property tests we're doing in #26167 and #26172 into more manageable chunks, pull out a property test for just the `reconcileReconnecting` method. This method helpfully already defines its important properties, so we can implement those as test assertions.
Ref: https://hashicorp.atlassian.net/browse/NMD-814
Ref: #26167
Ref: #26172
While working on property testing in #26172 we discovered there are scenarios where the reconciler will produce more than the expected number of placements. Testing of those scenarios at the whole-scheduler level shows that this gets handled correctly downstream of the reconciler, but this makes it harder to reason about reconciler behavior. Cap the number of placements in the reconciler.
Ref: #26172
Property test assertions for the core safety properties of the cluster reconciler, for batch jobs. The changeset includes fixes for bugs found while this was a work in progress, which will get pulled out to their own PRs.
Ref: https://hashicorp.atlassian.net/browse/NMD-814
Ref: #26167
LGTM!
Property test assertions for the core safety properties of the cluster reconciler, for batch jobs.
Ref: https://hashicorp.atlassian.net/browse/NMD-814
Ref: #26167