Implementing in-place update hooks
Introduction
The proposal for in-place updates in Cluster API introduced extensions that allow users to execute changes on existing machines without deleting them and creating new ones.
Notably, the Cluster API user experience remains the same as today, regardless of whether the in-place update feature is enabled; e.g. in order to trigger a MachineDeployment rollout, you still have to rotate a template.
Users should care ONLY about the desired state (as of today).
Cluster API is responsible for choosing the best strategy to achieve the desired state, and with the introduction of update extensions, Cluster API is expanding the set of tools that can be used to achieve it.
If external update extensions cannot cover the totality of the desired changes, Cluster API will fall back to its default, immutable rollouts.
Cluster API is also responsible for determining which Machine/MachineSet should be updated, as well as for handling rollout options like MaxSurge/MaxUnavailable. In this regard:
- Machines updating in-place are considered not available, because in-place updates are always considered potentially disruptive.
- For control plane machines, if maxSurge is one, a new machine must be created first; then, as soon as there is “buffer” for in-place, the in-place update can proceed.
- KCP will not use in-place updates if it detects that doing so could impact the health of the control plane.
- For worker machines, if maxUnavailable is zero, a new machine must be created first; then, as soon as there is “buffer” for in-place, the in-place update can proceed.
- When in-place is possible, the system should try to update as many machines as possible in-place. In practice, this means that maxSurge might not be fully used (it is used only to scale up by one if maxUnavailable=0).
- No in-place updates are performed for worker machines when using the OnDelete rollout strategy.
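The availability rule in the list above can be sketched as a small check. This is a simplified model rather than Cluster API's actual implementation: the function name `can_start_in_place` and its parameters are hypothetical, and it assumes only that a Machine being updated in-place counts as unavailable for the duration of the update.

```python
def can_start_in_place(replicas: int, available: int, max_unavailable: int) -> bool:
    """Return True if one more Machine can be taken out of service for an in-place update.

    Hypothetical helper: it models only the rule that a Machine updating
    in-place counts as unavailable and must fit in the maxUnavailable budget.
    """
    min_available = replicas - max_unavailable
    # Updating one Machine in-place makes it unavailable while the update runs.
    return available - 1 >= min_available


# maxUnavailable=0 with exactly the desired replicas: there is no buffer,
# so a new Machine has to be created first.
print(can_start_in_place(replicas=3, available=3, max_unavailable=0))  # False

# After scaling up by one there is buffer, so an in-place update can proceed.
print(can_start_in_place(replicas=3, available=4, max_unavailable=0))  # True
```

With a non-zero maxUnavailable the same check passes without creating a new Machine first, which is why maxSurge might not be fully used.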
Guidelines
All guidelines defined in Implementing Runtime Extensions apply to the implementation of Runtime Extensions for in-place update hooks as well.
In summary, Runtime Extensions are components that should be designed, written and deployed with great caution, given that they can affect the proper functioning of the Cluster API runtime. A poorly implemented Runtime Extension could potentially block updates from happening.
The following recommendations are especially relevant:
Definitions
For additional details about the OpenAPI spec of the in-place update hooks, please download the runtime-sdk-openapi.yaml file and then open it from the Swagger UI.
CanUpdateMachine
This hook is called by KCP when performing the “can update in-place” check for a control plane machine.
Example request:
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineRequest
settings: <Runtime Extension settings>
current:
  machine:
    apiVersion: cluster.x-k8s.io/v1beta2
    kind: Machine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  infrastructureMachine:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  bootstrapConfig:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfig
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
desired:
  machine:
    ...
  infrastructureMachine:
    ...
  bootstrapConfig:
    ...
Note:
- All the objects will have the latest API version known by Cluster API.
- Only spec is provided; status fields are not included.
- In a future release, when registering more than one extension for CanUpdateMachine is supported, the current state will already include changes that can be handled in-place by other runtime extensions.
Example response:
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineResponse
status: Success # or Failure
message: "error message if status == Failure"
machinePatch:
  patchType: JSONPatch
  patch: <JSON-patch>
infrastructureMachinePatch:
  ...
bootstrapConfigPatch:
  ...
Note:
- Extensions should return per-object patches to be applied on current objects to indicate which changes they can handle in-place.
- Only fields in the Machine/InfraMachine/BootstrapConfig spec have to be covered by patches.
- Patches must be in JSONPatch or JSONMergePatch format.
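As an illustration of the notes above, the following sketch shows how an extension might compute a JSONPatch for only the fields it can handle in-place, and how applying that patch to the current object reveals whether all desired changes are covered. The helpers `build_json_patch` and `apply_json_patch`, the version values, and the comparison of patched and desired objects are assumptions made for this example; they are not part of the Runtime SDK, and only top-level spec fields are handled to keep it short.

```python
import copy
import json


def build_json_patch(current_spec: dict, desired_spec: dict, supported: set) -> list:
    """Build RFC 6902 operations for the top-level spec fields in `supported`.

    Hypothetical helper: it only emits add/replace operations and ignores
    fields that were removed in the desired spec.
    """
    ops = []
    for key in sorted(supported):
        if key in desired_spec and current_spec.get(key) != desired_spec[key]:
            op = "replace" if key in current_spec else "add"
            ops.append({"op": op, "path": f"/spec/{key}", "value": desired_spec[key]})
    return ops


def apply_json_patch(obj: dict, ops: list) -> dict:
    """Apply the add/replace operations built above (not a full RFC 6902 implementation)."""
    patched = copy.deepcopy(obj)
    for op in ops:
        # Paths built above always have the form "/spec/<field>".
        _, section, key = op["path"].split("/")
        patched[section][key] = op["value"]
    return patched


# Illustrative Machine specs; the values are made up for this example.
current = {"spec": {"version": "v1.33.0", "failureDomain": "zone-a"}}
desired = {"spec": {"version": "v1.34.0", "failureDomain": "zone-b"}}

# Assume this extension can only change the Kubernetes version in-place.
patch = build_json_patch(current["spec"], desired["spec"], supported={"version"})
print(json.dumps(patch))

# If current plus patches does not equal desired, some change is not covered.
fully_covered = apply_json_patch(current, patch) == desired
print(fully_covered)
```

Here the failureDomain change is not covered by any patch, which corresponds to the case where the update cannot be fully handled in-place and an immutable rollout is used instead.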
CanUpdateMachineSet
This hook is called by the MachineDeployment controller when performing the “can update in-place” check for all the Machines controlled by a MachineSet.
Example request:
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineSetRequest
settings: <Runtime Extension settings>
current:
  machineSet:
    apiVersion: cluster.x-k8s.io/v1beta2
    kind: MachineSet
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  infrastructureMachineTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachineTemplate
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  bootstrapConfigTemplate:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfigTemplate
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
desired:
  machineSet:
    ...
  infrastructureMachineTemplate:
    ...
  bootstrapConfigTemplate:
    ...
Note:
- All the objects will have the latest API version known by Cluster API.
- Only spec is provided; status fields are not included.
- In a future release, when registering more than one extension for CanUpdateMachineSet is supported, the current state will already include changes that can be handled in-place by other runtime extensions.
Example response:
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineSetResponse
status: Success # or Failure
message: "error message if status == Failure"
machineSetPatch:
  patchType: JSONPatch
  patch: <JSON-patch>
infrastructureMachineTemplatePatch:
  ...
bootstrapConfigTemplatePatch:
  ...
Note:
- Extensions should return per-object patches to be applied on current objects to indicate which changes they can handle in-place.
- Only fields in the MachineSet/InfraMachineTemplate/BootstrapConfigTemplate spec have to be covered by patches.
- Patches must be in JSONPatch or JSONMergePatch format.
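For the JSONMergePatch format mentioned above, a patch is a partial document that is merged into the current object (RFC 7386). The sketch below applies such a patch to a MachineSet-like object; the function `merge_patch`, the sample objects, and the version values are illustrative assumptions, not Cluster API code.

```python
import copy


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) to `target` and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # A null value removes the key from the target.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Illustrative MachineSet; only a few spec fields are shown.
current_machine_set = {
    "spec": {
        "replicas": 3,
        "template": {"spec": {"version": "v1.33.0", "clusterName": "test-cluster"}},
    }
}

# Merge patch stating that the version change can be handled in-place.
machine_set_patch = {"spec": {"template": {"spec": {"version": "v1.34.0"}}}}

patched = merge_patch(current_machine_set, machine_set_patch)
print(patched["spec"]["template"]["spec"]["version"])
print(patched["spec"]["replicas"])
```

Compared with JSONPatch, a merge patch is easier to write by hand for nested template fields, but it cannot express changes to individual list elements, since lists are replaced as a whole.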
UpdateMachine
This hook is called by the Machine controller when performing the in-place updates for a Machine.
Example request:
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: UpdateMachineRequest
settings: <Runtime Extension settings>
desired:
  machine:
    apiVersion: cluster.x-k8s.io/v1beta2
    kind: Machine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  infrastructureMachineTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachineTemplate
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  bootstrapConfigTemplate:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfigTemplate
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
Note:
- Only desired is provided (the external updater extension should know the current state of the Machine).
- Only spec is provided; status fields are not included.
Example response:
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: UpdateMachineResponse
status: Success # or Failure
message: "error message if status == Failure"
retryAfterSeconds: 10
Note:
- The status of the update operation is determined by the CommonRetryResponse fields:
  - Status=Success + RetryAfterSeconds > 0: update is in progress
  - Status=Success + RetryAfterSeconds = 0: update completed successfully
  - Status=Failure: update failed
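The mapping above from response fields to the state of the update can be written as a small function. This is a minimal sketch: the function name `update_state` and the returned strings are hypothetical, and only the Status and RetryAfterSeconds semantics come from the notes above.

```python
def update_state(status: str, retry_after_seconds: int = 0) -> str:
    """Interpret an UpdateMachine response as described in the notes above."""
    if status == "Failure":
        return "failed"
    if status == "Success":
        # A positive retryAfterSeconds asks the caller to call the hook again later.
        return "in progress" if retry_after_seconds > 0 else "completed"
    raise ValueError(f"unexpected status: {status!r}")


print(update_state("Success", retry_after_seconds=10))  # in progress
print(update_state("Success", retry_after_seconds=0))   # completed
print(update_state("Failure"))                          # failed
```

An extension following this contract would keep returning Success with a positive retryAfterSeconds while the update is still running, and return Success with retryAfterSeconds set to 0 once it has finished.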