How to track state in your Kubernetes Operator
State can get very messy in a distributed environment like Kubernetes.
In simple environments, state machines are adequate for defining and tracking state. You quickly understand whether your computation is misbehaving, and can report progress to users or other programs.
But, in the context of an Operator, you can’t just pay attention to your own workload. You also need to be aware of external factors monitored by Kubernetes itself.
Here’s how to think about state when building your own Operator (based on lessons learnt from building our own).
Let’s get into it!
Spec is desired, status is actual
Let’s start with the basics. All Kubernetes objects define `spec` and `status` sub-objects to manage state. This includes custom resources monitored by your own Operator’s custom controller.

Here, `spec` is the specification of the object, where you declare the desired state of the resource. The `status`, on the other hand, is where Kubernetes provides visibility into the actual state of the resource.
For example, our Artillery Operator monitors a custom resource named LoadTest. As above, our LoadTest has a `spec` field specifying the number of load tests to run in parallel. Our custom controller processes our load test. It creates the required parallel workers (Pods) to match the spec’s desired state, thereby updating the actual state.
The key takeaway from this diagram is that observing state is a constant, ongoing activity that ensures the desired state becomes the actual state.
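To make the split concrete, here’s a minimal sketch of what the Go types behind such a custom resource might look like. The field names (such as `Count`) are illustrative, not the Artillery Operator’s actual API:

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// LoadTestSpec declares the desired state. Count is illustrative, standing
// in for "how many load test workers to run in parallel".
type LoadTestSpec struct {
	Count int32 `json:"count"`
}

// LoadTestStatus reports the actual state observed by the controller.
// What exactly belongs in here is the subject of the rest of this article.
type LoadTestStatus struct {
}

// LoadTest wires the two together, like any other Kubernetes object.
type LoadTest struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   LoadTestSpec   `json:"spec,omitempty"`
	Status LoadTestStatus `json:"status,omitempty"`
}
```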
But we need to understand how our custom resource is progressing. So, what exactly should we track in the `status` sub-object?
State machines are rigid
The answer to the previous question is: do not use a single field with state-machine-like values to track state.
Kubernetes advocates for managing state using Conditions in our `status` sub-object. This gives us a less rigid and more ‘open-world’ perspective.

This makes sense. Processes in a distributed system interact in unforeseen ways, creating unforeseen states. It’s hard to preemptively determine what these states will be.
Monitoring aggregate state
Let’s look at the example of a typical custom controller.
This controller creates and manages a few child objects. It is aware of state across all of these objects. And, it constantly examines its ‘aggregate’ state to inform its actual state.
We can define these aggregate states using a state machine. As an example, let’s say we have the following states: `initializing`, `running`, `completed`, `stoppedFailed`.

We’ll create a field called `CurrentState` in our `status` object, and update it with the current state. Our `CurrentState` is now `running`.
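As a rough sketch (the field and value names are purely illustrative), this approach boils down to a single status field drawing from a fixed, pre-defined set of values:

```go
package v1alpha1

// MyResourceStatus is a sketch of the single-field, state-machine style of
// tracking state. CurrentState holds exactly one value from a fixed set.
type MyResourceStatus struct {
	CurrentState string `json:"currentState,omitempty"`
}

// The fixed set of states the controller knows how to report.
const (
	StateInitializing  = "initializing"
	StateRunning       = "running"
	StateCompleted     = "completed"
	StateStoppedFailed = "stoppedFailed"
)
```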
Unforeseen states
... Out of the blue a Kubernetes node running our workload fails!!
This drops a child object into an unforeseen (to us) state. Our custom controller’s ‘aggregate’ state has now shifted. But, we continue to report it as `running` to our users and downstream clients.
We need a fix! We update our custom controller with logic to better examine node failure for a child object. We introduce a new `CurrentState` for this scenario: `awaitingRestart`.
Some downstream clients will find this new controller state useful. We now have to inform them to monitor for this new state. All is well in the world, for now.
... What!? Out of the blue, another unaccounted-for state happens!
Our `CurrentState` can’t explain the issue. So, we update the controller logic, add an adequate state for ... etc., etc.
You can see how this state-machine-based approach quickly becomes rigid.
Conditions
Conditions 👀 provide a richer set of information about a controller’s actual state over time. You can think of them as a controller’s personal changelog.
Conditions are present in all Kubernetes objects. Here’s an example from a Deployment. Have a read, I’ll wait... 👀
Say our typical custom controller, from the last section, had a Deployment child object exhibiting the following conditions:
status:
  availableReplicas: 2
  conditions:
  - lastTransitionTime: 2016-10-04T12:25:39Z
    lastUpdateTime: 2016-10-04T12:25:39Z
    message: Replica set "nginx-deployment-4262182780" is progressing.
    reason: ReplicaSetUpdated
    status: 'True'
    type: Progressing
  - lastTransitionTime: 2016-10-04T12:25:42Z
    lastUpdateTime: 2016-10-04T12:25:42Z
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: 'True'
    type: Available
It could use the following logic to figure out its own actual state:
- Go through the Deployment’s `.status.conditions`.
- Find the latest `.status.conditions[..].type` and its `.status.conditions[..].status`.
- Weigh whether the other Deployment `.status.conditions` had any alarming statuses.
Here, it determines the Deployment is still progressing and not yet completed. Then it can add a condition with a ‘Progressing’-like type to its own Conditions list.
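As a sketch, that logic could look something like this in Go. It illustrates the approach rather than any particular controller’s code:

```go
package controller

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// latestCondition returns the Deployment condition with the most recent
// transition time, or nil if no conditions have been recorded yet.
func latestCondition(deploy *appsv1.Deployment) *appsv1.DeploymentCondition {
	var latest *appsv1.DeploymentCondition
	for i := range deploy.Status.Conditions {
		c := &deploy.Status.Conditions[i]
		if latest == nil || c.LastTransitionTime.After(latest.LastTransitionTime.Time) {
			latest = c
		}
	}
	return latest
}

// hasAlarmingConditions weighs the remaining conditions, e.g. a replica
// failure or a loss of minimum availability.
func hasAlarmingConditions(deploy *appsv1.Deployment) bool {
	for _, c := range deploy.Status.Conditions {
		switch c.Type {
		case appsv1.DeploymentReplicaFailure:
			if c.Status == corev1.ConditionTrue {
				return true
			}
		case appsv1.DeploymentAvailable:
			if c.Status == corev1.ConditionFalse {
				return true
			}
		}
	}
	return false
}
```

From the latest condition’s type and status, plus any alarming conditions, the custom controller can decide which condition to record on its own resource.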
Creating your own Conditions
So, how do we create conditions for our own custom controller?
We’ve just seen an example of a Deployment’s Conditions, specifically how open-ended they are. E.g. `type: Progressing` may mean either progressing or complete.
As a general rule of thumb:
- We have a small set of types (`.status.conditions[..].type`) that explain general behaviour.
- Unforeseen states can easily be signalled using the `Unknown` value for `.status.conditions[..].status`.
- Failure for a condition type (`.status.conditions[..].type`) can be explained using `False` as the `.status.conditions[..].status` value, with an informative `.status.conditions[..].reason`.
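In Go, the apimachinery condition helpers make these rules easy to apply. Below is a sketch of recording an unforeseen state with an `Unknown` status; the condition type and reason names are made up for illustration:

```go
package controller

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markProgressUnknown records that we can no longer tell whether the
// workload is progressing, instead of inventing a brand new state.
// The type and reason strings here are illustrative.
func markProgressUnknown(conditions *[]metav1.Condition) {
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:    "Progressing",
		Status:  metav1.ConditionUnknown,
		Reason:  "NodeFailure", // an informative reason for the Unknown status
		Message: "a worker node stopped reporting; progress cannot be determined",
	})
}
```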
Let’s check out an example and see how these tips play out.
Conditions by example
If you recall, we used the Artillery Operator earlier to explore the `spec` and `status` sub-objects. Let’s get into the details of how it makes use of Conditions to manage state.
The Artillery Operator uses a LoadTest custom resource to spec out a load test run. The spec includes the number of workers to run in parallel. This information is used to generate a Kubernetes Job and Pods to run the actual load test workloads.
The Job is a key component here. Tracking our aggregate state relies on the progress of the Job.
We also observed that:
- There are three statuses we care about based on a Job’s `status`... Namely, `LoadTestInactive`, `LoadTestActive` and `LoadTestCompleted`.
- We should add extra fields to `status`... These fields will give us a fine-grained view of how many workers are active, succeeded or failed:
// The number of actively running LoadTest worker pods.
Active int32 `json:"active,omitempty"`
// The number of LoadTest worker pods which reached phase Succeeded.
Succeeded int32 `json:"succeeded,omitempty"`
// The number of LoadTest worker pods which reached phase Failed.
Failed int32 `json:"failed,omitempty"`
Using our Conditions rule of thumb, two Condition types are required:
- `Progressing`, with `True`, `False` or `Unknown` as a value.
- `Completed`, with `True`, `False` or `Unknown` as a value.
These easily explained the progressing and completed states. But, how do we explain that a Load Test has failed?
In our distributed load testing domain, a load test always completes. And, it will only flag failed workers. Restarting a failed worker/Pod messes with the load test metrics, so we avoid it at all costs.
So, for the controller, the `Completed` condition set to `true` was more than enough. Failed workers were flagged in the `status` field (using a `failed` field to track the failed count).
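Here’s a sketch of how that might come together, copying the Job’s counts into the status and setting the `Completed` condition once the Job finishes. The condition types, reasons and helper names are illustrative, not the operator’s exact code:

```go
package controller

import (
	batchv1 "k8s.io/api/batch/v1"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// LoadTestStatus mirrors the fields discussed above, plus a Conditions list.
type LoadTestStatus struct {
	Active     int32              `json:"active,omitempty"`
	Succeeded  int32              `json:"succeeded,omitempty"`
	Failed     int32              `json:"failed,omitempty"`
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

// updateStatusFromJob aggregates the underlying Job's status into the
// LoadTest's status. Failed workers are only counted, never restarted.
func updateStatusFromJob(status *LoadTestStatus, job *batchv1.Job) {
	// Copy the worker counts straight from the Job.
	status.Active = job.Status.Active
	status.Succeeded = job.Status.Succeeded
	status.Failed = job.Status.Failed

	if job.Status.CompletionTime != nil {
		// A load test always completes, even if some workers failed.
		meta.SetStatusCondition(&status.Conditions, metav1.Condition{
			Type:    "Completed",
			Status:  metav1.ConditionTrue,
			Reason:  "LoadTestCompleted",
			Message: "all load test workers have finished",
		})
	} else {
		meta.SetStatusCondition(&status.Conditions, metav1.Condition{
			Type:    "Progressing",
			Status:  metav1.ConditionTrue,
			Reason:  "LoadTestActive",
			Message: "load test workers are still running",
		})
	}
}
```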
Other implementations may treat the failure condition differently (e.g. Deployment uses the Progressing Condition type with value false).
Our Conditions ensure users can track a Load Test’s observed status to infer progress, while helping clients monitor the conditions that matter to them.
Feel free to check the full Conditions implementation. 👀
Conditions all the way down
The more you deal with Kubernetes objects to track any form of state, the more you deal with Conditions. They’re everywhere.
They help create a simple reactive system that is open to change. Hopefully, this article gives you a good understanding of how to get started.