Getting started with the Operator SDK
About the Operator SDK
The Operator Framework is an open source toolkit to manage Kubernetes native applications, called Operators, in an effective, automated, and scalable way. Operators take advantage of Kubernetes' extensibility to deliver the automation advantages of cloud services like provisioning, scaling, and backup and restore, while being able to run anywhere that Kubernetes can run.
Operators make it easy to manage complex, stateful applications on top of Kubernetes. However, writing an Operator today can be difficult because of challenges such as using low-level APIs, writing boilerplate, and a lack of modularity, which leads to duplication.
The Operator SDK is a framework designed to make writing operators easier by providing:
-
High-level APIs and abstractions to write the operational logic more intuitively
-
Tools for scaffolding and code generation to quickly bootstrap a new project
-
Extensions to cover common operator use cases
Operator SDK Workflow
The SDK provides the following workflow to develop a new operator:
-
Create a new Operator project using the SDK command line interface (CLI).
-
Define new resource APIs by adding Custom Resource Definitions (CRDs).
-
Specify resources to watch using the SDK API.
-
Define the Operator reconciling logic in a designated handler and use the SDK API to interact with resources.
-
Use the SDK CLI to build and generate the Operator deployment manifests.
At a high level, an Operator using the SDK processes events for watched resources in a user-defined handler and takes actions to reconcile the state of the application.
Manager
The main program for the Operator is the manager file at cmd/manager/main.go. The manager automatically registers the scheme for all custom resources defined under pkg/apis/ and runs all controllers under pkg/controller/.
The manager can restrict the namespace that all controllers watch for resources:
mgr, err := manager.New(cfg, manager.Options{Namespace: namespace})
By default, this is the namespace that the Operator is running in. To watch all namespaces, you can leave the namespace option empty:
mgr, err := manager.New(cfg, manager.Options{Namespace: ""})
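The namespace rule described above can be modeled in a few lines of standalone Go. This is only a sketch using the standard library, not a call into the manager package: watchesNamespace is a hypothetical helper, and the only behavior it assumes is the one stated in the text, namely that an empty namespace option means every namespace is watched.

```go
package main

import "fmt"

// watchesNamespace is a hypothetical helper that models the rule described
// above: an empty manager namespace means "watch every namespace", and any
// other value restricts the watch to that single namespace.
func watchesNamespace(managerNamespace, objectNamespace string) bool {
	return managerNamespace == "" || managerNamespace == objectNamespace
}

func main() {
	for _, ns := range []string{"default", "kube-system"} {
		fmt.Printf("restricted to default, object in %s: %v\n", ns, watchesNamespace("default", ns))
		fmt.Printf("unrestricted, object in %s: %v\n", ns, watchesNamespace("", ns))
	}
}
```

With the restricted setting, only objects in the default namespace reach the controllers; with the empty setting, objects in both namespaces do.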
Installing the Operator SDK CLI
The Operator SDK has a CLI tool that assists developers in creating, building, and deploying a new Operator project. You can install the SDK CLI on your workstation so you are prepared to start authoring your own Operators.
-
Clone the operator-sdk repository:
$ mkdir -p $GOPATH/src/github.com/operator-framework
$ cd $GOPATH/src/github.com/operator-framework
$ git clone https://github.com/operator-framework/operator-sdk
$ cd operator-sdk
-
Check out the desired release branch:
$ git checkout master
-
Install the SDK CLI tool:
$ make dep
$ make install
This installs the CLI binary operator-sdk at $GOPATH/bin.
-
Verify that the CLI tool was installed correctly:
$ operator-sdk -h
Building a Memcached Operator using the Operator SDK
The Operator SDK makes it easier to build Kubernetes native applications, a process that can require deep, application-specific operational knowledge. The SDK not only lowers that barrier, but it also helps reduce the amount of boilerplate code needed for many common management capabilities, such as metering or monitoring.
This procedure walks through an example of building a simple Memcached Operator using tools and libraries provided by the SDK.
-
Operator SDK CLI installed on the development workstation
-
Operator Lifecycle Manager (OLM) installed on a Kubernetes-based cluster (v1.8 or above to support the apps/v1beta2 API group), for example OKD 4.0
-
Access to the cluster using an account with cluster-admin permissions
-
kubectl v1.11.3+ (can alternatively use oc)
-
Create a new project.
Use the CLI to create a new memcached-operator project:
$ cd $GOPATH/src/github.com/example-inc/
$ operator-sdk new memcached-operator
$ cd memcached-operator
See Appendices to learn about the project directory structure created by the previous commands.
-
Add a new Custom Resource Definition (CRD).
-
Use the CLI to add a new CRD API called Memcached, with APIVersion set to cache.example.com/v1alpha1 and Kind set to Memcached:
$ operator-sdk add api \
    --api-version=cache.example.com/v1alpha1 \
    --kind=Memcached
This scaffolds the Memcached resource API under pkg/apis/cache/v1alpha1/.
-
Modify the spec and status of the Memcached Custom Resource (CR) in the pkg/apis/cache/v1alpha1/memcached_types.go file:
type MemcachedSpec struct {
	// Size is the size of the memcached deployment
	Size int32 `json:"size"`
}
type MemcachedStatus struct {
	// Nodes are the names of the memcached pods
	Nodes []string `json:"nodes"`
}
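To see how the json tags on these two structs determine the field names under spec and status in a CR, the following sketch serializes them with the standard library. The wrapper type memcached is a simplified, hypothetical stand-in for the scaffolded Memcached type, which also carries Kubernetes metadata that is omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MemcachedSpec and MemcachedStatus match the definitions shown above.
type MemcachedSpec struct {
	Size int32 `json:"size"`
}

type MemcachedStatus struct {
	Nodes []string `json:"nodes"`
}

// memcached is a simplified stand-in for the scaffolded Memcached type
// (assumption: the real type also embeds type and object metadata).
type memcached struct {
	Spec   MemcachedSpec   `json:"spec"`
	Status MemcachedStatus `json:"status"`
}

// encode renders the object as JSON using the struct tags.
func encode(m memcached) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

func main() {
	out, err := encode(memcached{
		Spec:   MemcachedSpec{Size: 3},
		Status: MemcachedStatus{Nodes: []string{"pod-a"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"spec":{"size":3},"status":{"nodes":["pod-a"]}}
}
```

The lowercase keys size and nodes in the output are the names a user writes or reads in the CR YAML.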
-
After modifying the *_types.go file, always run the following command to update the generated code for that resource type:
$ operator-sdk generate k8s
-
-
Add a new Controller.
-
Add a new Controller to the project to watch and reconcile the Memcached resource:
$ operator-sdk add controller \
    --api-version=cache.example.com/v1alpha1 \
    --kind=Memcached
This scaffolds a new Controller implementation under pkg/controller/memcached/.
-
For this example, replace the generated controller file pkg/controller/memcached/memcached_controller.go with the example implementation.
The example controller executes the following reconciliation logic for each Memcached CR:
-
Create a Memcached Deployment if it does not exist.
-
Ensure that the Deployment size is the same as specified by the Memcached CR spec.
-
Update the Memcached CR status with the names of the Memcached Pods.
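The three reconciliation steps listed above can be sketched against an in-memory stand-in for the cluster. This is not the example controller itself: the cluster and memcachedCR types and the podNames helper are hypothetical, and the sketch assumes that each pass performs at most one change and then waits to be called again.

```go
package main

import (
	"fmt"
	"sort"
)

// cluster is a hypothetical in-memory stand-in for the API server.
type cluster struct {
	deployments map[string]int32    // Deployment name -> replica count
	pods        map[string][]string // Deployment name -> Pod names
}

// memcachedCR carries only the fields this guide adds to the Memcached CR.
type memcachedCR struct {
	Name  string
	Size  int32    // spec.size
	Nodes []string // status.nodes
}

// reconcile performs at most one of the three steps and reports whether it
// changed anything, so callers can keep invoking it until it returns false.
func reconcile(c *cluster, cr *memcachedCR) bool {
	// Step 1: create the Deployment if it does not exist.
	replicas, found := c.deployments[cr.Name]
	if !found {
		c.deployments[cr.Name] = cr.Size
		return true
	}
	// Step 2: make the Deployment size match the CR spec.
	if replicas != cr.Size {
		c.deployments[cr.Name] = cr.Size
		return true
	}
	// Step 3: record the Pod names in the CR status.
	names := append([]string(nil), c.pods[cr.Name]...)
	sort.Strings(names)
	if !equal(names, cr.Nodes) {
		cr.Nodes = names
		return true
	}
	return false
}

// equal reports whether two string slices hold the same elements in order.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// podNames fabricates Pod names; a real cluster assigns them itself.
func podNames(prefix string, n int32) []string {
	names := make([]string, 0, n)
	for i := int32(0); i < n; i++ {
		names = append(names, fmt.Sprintf("%s-%d", prefix, i))
	}
	return names
}

func main() {
	c := &cluster{deployments: map[string]int32{}, pods: map[string][]string{}}
	cr := &memcachedCR{Name: "example-memcached", Size: 3}
	for reconcile(c, cr) {
		// Pretend the cluster has started Pods to match the replica count.
		c.pods[cr.Name] = podNames(cr.Name, c.deployments[cr.Name])
	}
	fmt.Println(c.deployments[cr.Name], cr.Nodes)
}
```

Running the loop until reconcile reports no change mirrors the idea that the controller keeps acting until the observed state matches the CR.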
The next two sub-steps inspect how the Controller watches resources and how the reconcile loop is triggered. You can skip these steps to go directly to building and running the Operator.
-
-
Inspect the Controller implementation in the pkg/controller/memcached/memcached_controller.go file to see how the Controller watches resources.
The first watch is for the Memcached type as the primary resource. For each Add, Update, or Delete event, the reconcile loop is sent a reconcile Request (a <namespace>:<name> key) for that Memcached object:
err := c.Watch(
	&source.Kind{Type: &cachev1alpha1.Memcached{}},
	&handler.EnqueueRequestForObject{})
The next watch is for Deployments, but the event handler maps each event to a reconcile Request for the owner of the Deployment. In this case, this is the Memcached object for which the Deployment was created. This allows the controller to watch Deployments as a secondary resource:
err = c.Watch(&source.Kind{Type: &appsv1.Deployment{}}, &handler.EnqueueRequestForOwner{
	IsController: true,
	OwnerType:    &cachev1alpha1.Memcached{},
})
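The owner mapping performed by the second watch can be illustrated without the SDK. The sketch below is a standalone model, not the handler's actual implementation: ownerRef, object, request, and requestsForOwner are hypothetical names, and the only behavior assumed is what the text describes, that an event on a Deployment is turned into a Request for the Memcached object that owns it.

```go
package main

import "fmt"

// ownerRef and object are simplified, hypothetical stand-ins for Kubernetes
// owner references and object metadata.
type ownerRef struct {
	Kind       string
	Name       string
	Controller bool
}

type object struct {
	Namespace string
	Name      string
	Owners    []ownerRef
}

// request identifies the primary resource to reconcile.
type request struct {
	Namespace string
	Name      string
}

// requestsForOwner maps an event on a secondary object to requests for its
// owners of the given kind. When controllerOnly is set, only the owner
// marked as controller is considered.
func requestsForOwner(obj object, ownerKind string, controllerOnly bool) []request {
	var reqs []request
	for _, ref := range obj.Owners {
		if ref.Kind != ownerKind {
			continue
		}
		if controllerOnly && !ref.Controller {
			continue
		}
		reqs = append(reqs, request{Namespace: obj.Namespace, Name: ref.Name})
	}
	return reqs
}

func main() {
	deployment := object{
		Namespace: "default",
		Name:      "example-memcached",
		Owners:    []ownerRef{{Kind: "Memcached", Name: "example-memcached", Controller: true}},
	}
	fmt.Println(requestsForOwner(deployment, "Memcached", true))
}
```

A Deployment with no Memcached owner produces no requests, which is why unrelated Deployments in the namespace do not trigger the reconcile loop.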
-
Every Controller has a Reconciler object with a Reconcile() method that implements the reconcile loop. The reconcile loop is passed the Request argument, which is a <namespace>:<name> key used to look up the primary resource object, Memcached, from the cache:
func (r *ReconcileMemcached) Reconcile(request reconcile.Request) (reconcile.Result, error) {
	// Look up the Memcached instance for this reconcile request
	memcached := &cachev1alpha1.Memcached{}
	err := r.client.Get(context.TODO(), request.NamespacedName, memcached)
	...
}
Based on the return value of Reconcile(), the reconcile Request may be requeued and the loop may be triggered again:
// Reconcile successful - don't requeue
return reconcile.Result{}, nil
// Reconcile failed due to error - requeue
return reconcile.Result{}, err
// Requeue for any reason other than error
return reconcile.Result{Requeue: true}, nil
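The three return patterns above can be exercised with a small queue simulation. This is a sketch under stated assumptions rather than the SDK's work queue: result, shouldRequeue, and drain are hypothetical names, the key format is illustrative, and the retry cap exists only to keep the example finite. A real controller work queue also rate-limits retries, which is left out here.

```go
package main

import (
	"errors"
	"fmt"
)

// result mirrors the single field of the reconcile result used in this guide.
type result struct {
	Requeue bool
}

// shouldRequeue models the rule described above: a request is requeued when
// reconciliation returns an error or explicitly asks for a requeue.
func shouldRequeue(res result, err error) bool {
	return err != nil || res.Requeue
}

// drain processes keys in FIFO order, requeueing as decided by shouldRequeue,
// and returns how many times each key was reconciled. maxAttempts is a
// safeguard for this sketch only.
func drain(keys []string, reconcile func(key string, attempt int) (result, error), maxAttempts int) map[string]int {
	attempts := map[string]int{}
	queue := append([]string(nil), keys...)
	for len(queue) > 0 {
		key := queue[0]
		queue = queue[1:]
		attempts[key]++
		res, err := reconcile(key, attempts[key])
		if shouldRequeue(res, err) && attempts[key] < maxAttempts {
			queue = append(queue, key)
		}
	}
	return attempts
}

func main() {
	attempts := drain([]string{"default/example-memcached"}, func(key string, attempt int) (result, error) {
		switch attempt {
		case 1:
			return result{}, errors.New("transient failure") // requeued because of the error
		case 2:
			return result{Requeue: true}, nil // requeued on request
		default:
			return result{}, nil // success, not requeued
		}
	}, 5)
	fmt.Println(attempts)
}
```

In this run the single key is reconciled three times: once failing, once asking for a requeue, and once succeeding.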
-
-
Build and run the Operator.
-
Before running the Operator, the CRD must be registered with the Kubernetes API server:
$ kubectl create \
    -f deploy/crds/cache_v1alpha1_memcached_crd.yaml
-
After registering the CRD, there are two options for running the Operator:
-
As a Deployment inside a Kubernetes cluster
-
As a Go program outside a cluster
Choose one of the following methods.
-
Option A: Running as a Deployment inside the cluster.
-
Build the memcached-operator image:
$ operator-sdk build quay.io/example/memcached-operator:v0.0.1
-
The Deployment manifest is generated at deploy/operator.yaml. Update the Deployment image as follows since the default is just a placeholder:
$ sed -i 's|REPLACE_IMAGE|quay.io/example/memcached-operator:v0.0.1|g' deploy/operator.yaml
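The sed -i form above is GNU sed syntax; BSD sed, as shipped with macOS, expects a backup-suffix argument after -i. As a portable alternative, the same substitution can be sketched in Go. The setImage helper and the manifest fragment are hypothetical; only the REPLACE_IMAGE placeholder and the image name come from the step above.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholder is the token that the generated manifest uses for the image.
const placeholder = "REPLACE_IMAGE"

// setImage replaces every occurrence of the placeholder with the image
// reference, and fails if the placeholder is absent so that a manifest that
// was already edited is not silently accepted.
func setImage(manifest, image string) (string, error) {
	if !strings.Contains(manifest, placeholder) {
		return "", fmt.Errorf("placeholder %q not found", placeholder)
	}
	return strings.ReplaceAll(manifest, placeholder, image), nil
}

func main() {
	// A shortened, hypothetical fragment of deploy/operator.yaml.
	manifest := "containers:\n  - name: memcached-operator\n    image: REPLACE_IMAGE\n"
	out, err := setImage(manifest, "quay.io/example/memcached-operator:v0.0.1")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

To apply this to the real file, the manifest string would be read from and written back to deploy/operator.yaml instead of being defined inline.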
-
Ensure you have an account on quay.io for the next step, or substitute your preferred container registry. On the registry, create a new public image repository named memcached-operator.
-
Push the image to the registry:
$ docker push quay.io/example/memcached-operator:v0.0.1
-
Set up RBAC and deploy memcached-operator:
$ kubectl create -f deploy/role.yaml
$ kubectl create -f deploy/role_binding.yaml
# TODO:
$ kubectl create -f deploy/service_account.yaml
$ kubectl create -f deploy/operator.yaml
-
Verify that memcached-operator is up and running:
$ kubectl get deployment
NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
memcached-operator   1         1         1            1           1m
-
-
Option B: Running locally outside the cluster.
This method is preferred during the development cycle to deploy and test faster.
Run the Operator locally with the default Kubernetes configuration file present at $HOME/.kube/config:
$ operator-sdk up local --namespace=default
2018/09/30 23:10:11 Go Version: go1.10.2
2018/09/30 23:10:11 Go OS/Arch: darwin/amd64
2018/09/30 23:10:11 operator-sdk Version: 0.0.6+git
2018/09/30 23:10:12 Registering Components.
2018/09/30 23:10:12 Starting the Cmd.
You can use a specific kubeconfig by passing the flag --kubeconfig=<path/to/kubeconfig>.
-
-
-
Verify that the Operator can deploy a Memcached application by creating a Memcached CR.
-
Create the example Memcached CR that was generated at deploy/crds/cache_v1alpha1_memcached_cr.yaml:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 3
$ kubectl apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
-
Ensure that memcached-operator creates the Deployment for the CR:
$ kubectl get deployment
NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
memcached-operator   1         1         1            1           2m
example-memcached    3         3         3            3           1m
-
Check the Pods and CR status to confirm the status is updated with the memcached Pod names:
$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
example-memcached-6fd7c98d8-7dqdr     1/1     Running   0          1m
example-memcached-6fd7c98d8-g5k7v     1/1     Running   0          1m
example-memcached-6fd7c98d8-m7vn7     1/1     Running   0          1m
memcached-operator-7cc7cfdf86-vvjqk   1/1     Running   0          2m
$ kubectl get memcached/example-memcached -o yaml
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  clusterName: ""
  creationTimestamp: 2018-03-31T22:51:08Z
  generation: 0
  name: example-memcached
  namespace: default
  resourceVersion: "245453"
  selfLink: /apis/cache.example.com/v1alpha1/namespaces/default/memcacheds/example-memcached
  uid: 0026cc97-3536-11e8-bd83-0800274106a1
spec:
  size: 3
status:
  nodes:
  - example-memcached-6fd7c98d8-7dqdr
  - example-memcached-6fd7c98d8-g5k7v
  - example-memcached-6fd7c98d8-m7vn7
-
-
Verify that the Operator can manage a deployed Memcached application by updating the size of the deployment.
-
Change the spec.size field in the memcached CR from 3 to 4:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 4
-
Apply the change:
$ kubectl apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
-
Confirm that the Operator changes the Deployment size:
$ kubectl get deployment
NAME                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-memcached   4         4         4            4           5m
-
-
Clean up the resources:
$ kubectl delete -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
$ kubectl delete -f deploy/operator.yaml
Managing a Memcached Operator using the Operator Lifecycle Manager
The previous section covered running an Operator manually. The following sections explore the Operator Lifecycle Manager (OLM), which enables a more robust deployment model for Operators running in production environments.
The OLM helps you install, update, and generally manage the lifecycle of all of the Operators (and their associated services) on a Kubernetes cluster. It runs as a Kubernetes extension and lets you use kubectl for all the lifecycle management functions without any additional tools.
-
OLM installed on a Kubernetes-based cluster (v1.8 or above to support the apps/v1beta2 API group), for example OKD 4.0 Preview OLM enabled
-
Memcached Operator built
-
Generate an Operator manifest.
An Operator manifest describes how to display, create, and manage the application, in this case Memcached, as a whole. It is defined by a ClusterServiceVersion (CSV) object and is required for the OLM to function.
For the purpose of this guide, we will continue with a predefined manifest file for the next steps. You can alter the image field within this manifest to reflect the image you built in previous steps, but it is unnecessary. In the future, the Operator SDK CLI will generate an Operator manifest for you, a feature that is planned for the next release of the Operator SDK.
See Building a CSV for the Operator Framework for more information on manually defining a manifest file.
-
Deploy the Operator.
-
Deploy an Operator by applying the Operator’s manifest to the desired namespace in the cluster:
$ curl -Lo memcachedoperator.0.0.1.csv.yaml https://raw.githubusercontent.com/operator-framework/getting-started/master/memcachedoperator.0.0.1.csv.yaml
$ kubectl apply -f memcachedoperator.0.0.1.csv.yaml
$ kubectl get ClusterServiceVersion memcachedoperator.v0.0.1 -o json | jq '.status'
-
After applying this manifest, nothing happens yet, because the cluster does not meet the requirements specified in the manifest. Create the RBAC rules and CustomResourceDefinition for the Memcached type managed by the Operator:
$ kubectl apply -f deploy/rbac.yaml
$ kubectl apply -f deploy/crd.yaml
Because the OLM creates Operators in a particular namespace when a manifest is applied, administrators can leverage the native Kubernetes RBAC permission model to restrict which users are allowed to install Operators.
-
-
Create an application instance.
The Memcached Operator is now running in the memcached namespace. Users interact with Operators via instances of CustomResources; in this case, the resource has the kind Memcached. Native Kubernetes RBAC also applies to CustomResources, providing administrators control over who can interact with each Operator.
Creating instances of Memcached in this namespace will now trigger the Memcached Operator to instantiate pods running the memcached server that are managed by the Operator. The more CustomResources you create, the more unique instances of Memcached are managed by the Memcached Operator running in this namespace.
$ cat <<EOF | kubectl apply -f -
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "memcached-for-wordpress"
spec:
  size: 1
EOF
$ cat <<EOF | kubectl apply -f -
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "memcached-for-drupal"
spec:
  size: 1
EOF
$ kubectl get Memcached
NAME                      AGE
memcached-for-drupal      22s
memcached-for-wordpress   27s
$ kubectl get pods
NAME                                       READY   STATUS    RESTARTS   AGE
memcached-app-operator-66b5777b79-pnsfj    1/1     Running   0          14m
memcached-for-drupal-5476487c46-qbd66      1/1     Running   0          3s
memcached-for-wordpress-65b75fd8c9-7b9x7   1/1     Running   0          8s
-
Update an application.
Manually apply an update to the Operator by creating a new Operator manifest with a replaces field that references the old Operator manifest. The OLM ensures that all resources being managed by the old Operator have their ownership moved to the new Operator without fear of any programs stopping execution. It is up to the Operators themselves to execute any data migrations required to upgrade resources to run under a new version of the Operator.
The following commands demonstrate applying a new Operator manifest file using a new version of the Operator and show that the pods remain executing:
$ curl -Lo memcachedoperator.0.0.2.csv.yaml https://raw.githubusercontent.com/operator-framework/getting-started/master/memcachedoperator.0.0.2.csv.yaml
$ kubectl apply -f memcachedoperator.0.0.2.csv.yaml
$ kubectl get pods
NAME                                       READY   STATUS    RESTARTS   AGE
memcached-app-operator-66b5777b79-pnsfj    1/1     Running   0          3s
memcached-for-drupal-5476487c46-qbd66      1/1     Running   0          14m
memcached-for-wordpress-65b75fd8c9-7b9x7   1/1     Running   0          14m
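The role of the replaces field can be pictured as a chain of links from older manifests to newer ones. The following sketch is a standalone model and not how the OLM is implemented: the csv type and the latest function are hypothetical, and the name memcachedoperator.v0.0.2 is assumed by analogy with memcachedoperator.v0.0.1.

```go
package main

import "fmt"

// csv holds only the two manifest fields relevant to upgrades in this sketch.
type csv struct {
	Name     string
	Replaces string // name of the manifest this one supersedes, if any
}

// latest follows replaces links forward from start and returns the name of
// the newest manifest in the chain. It reports an error if the links loop.
func latest(csvs []csv, start string) (string, error) {
	// next maps a replaced manifest to the manifest that replaces it.
	next := map[string]string{}
	for _, c := range csvs {
		if c.Replaces != "" {
			next[c.Replaces] = c.Name
		}
	}
	seen := map[string]bool{}
	current := start
	for {
		if seen[current] {
			return "", fmt.Errorf("replaces cycle detected at %s", current)
		}
		seen[current] = true
		successor, ok := next[current]
		if !ok {
			return current, nil
		}
		current = successor
	}
}

func main() {
	csvs := []csv{
		{Name: "memcachedoperator.v0.0.1"},
		{Name: "memcachedoperator.v0.0.2", Replaces: "memcachedoperator.v0.0.1"},
	}
	head, err := latest(csvs, "memcachedoperator.v0.0.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(head)
}
```

Starting from the old manifest, the chain ends at the manifest that nothing else replaces, which is the version that should own the managed resources after the update.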
Getting involved
This guide provides an effective demonstration of the value of the Operator Framework for building and managing Operators, but there is much more left out in the interest of brevity. The Operator Framework and its components are open source, so visit each project individually and learn what else you can do.
If you want to discuss your experience, have questions, or want to get involved, join the Operator Framework mailing list.