Orchestrate a Local Cluster with Kubernetes

2020-02-16

On top of CockroachDB's built-in automation, you can use a third-party orchestration system to simplify and automate even more of your operations, from deployment to scaling to overall cluster management.

This page walks you through a simple demonstration, using the open-source Kubernetes orchestration system. Using either the CockroachDB Helm chart or a few configuration files, you'll quickly create a 3-node local cluster. You'll run some SQL commands against the cluster and then simulate node failure, watching how Kubernetes auto-restarts the node without the need for any manual intervention. You'll then scale the cluster with a single command before shutting the cluster down, again with a single command.

{{site.data.alerts.callout_info}} To orchestrate a physically distributed cluster in production, see Orchestrated Deployments. {{site.data.alerts.end}}

Before you begin

Before getting started, it's helpful to review some Kubernetes-specific terminology:

Feature | Description
--------|------------
`minikube` | This is the tool you'll use to run a Kubernetes cluster inside a VM on your local workstation.
pod | A pod is a group of one or more Docker containers. In this tutorial, all pods will run on your local workstation, each containing one Docker container running a single CockroachDB node. You'll start with 3 pods and grow to 4.
StatefulSet | A StatefulSet is a group of pods treated as stateful units, where each pod has distinguishable network identity and always binds back to the same persistent storage on restart. StatefulSets are considered stable as of Kubernetes version 1.9 after reaching beta in version 1.5.
persistent volume | A persistent volume is a piece of storage mounted into a pod. The lifetime of a persistent volume is decoupled from the lifetime of the pod that's using it, ensuring that each CockroachDB node binds back to the same storage on restart.<br><br>When using `minikube`, persistent volumes are external temporary directories that endure until they are manually deleted or until the entire Kubernetes cluster is deleted.
persistent volume claim | When pods are created (one per CockroachDB node), each pod will request a persistent volume claim to “claim” durable storage for its node.
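
Once the cluster is running (after Step 2), you can see all of these objects with standard `kubectl` commands. As a quick orientation (note that resource names are prefixed with `my-release-` if you deploy with the Helm chart):

~~~ shell
# List the CockroachDB pods, the StatefulSet that manages them,
# and the persistent volumes and claims that back their storage:
$ kubectl get pods
$ kubectl get statefulsets
$ kubectl get persistentvolumes
$ kubectl get persistentvolumeclaims
~~~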

Step 1. Start Kubernetes

  1. Follow Kubernetes' documentation to install minikube, the tool used to run Kubernetes locally, for your OS. This includes installing a hypervisor and kubectl, the command-line tool used to manage Kubernetes from your local workstation.

    {{site.data.alerts.callout_info}}Make sure you install minikube version 0.21.0 or later (you can confirm your version as shown after this list). Earlier versions do not include a Kubernetes server that supports the `maxUnavailable` field and `PodDisruptionBudget` resource type used in the CockroachDB StatefulSet configuration.{{site.data.alerts.end}}

  2. Start a local Kubernetes cluster:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ minikube start
    ~~~
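
Before moving on, you can confirm that minikube meets the version minimum noted above and that the single-node Kubernetes cluster is ready. A quick check with standard commands (exact output varies by version):

~~~ shell
# Confirm the installed minikube version (0.21.0 or later required):
$ minikube version

# Verify that the local Kubernetes node reports a Ready status:
$ kubectl get nodes
~~~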
    

Step 2. Start CockroachDB

To start your CockroachDB cluster, you can either use our StatefulSet configuration and related files directly, or you can use the Helm package manager for Kubernetes to simplify the process.

{% include {{ page.version.version }}/orchestration/start-cockroachdb-secure.md %}
{% include {{ page.version.version }}/orchestration/start-cockroachdb-helm-secure.md %}

Step 3. Use the built-in SQL client

To use the built-in SQL client, you need to launch a pod that runs indefinitely with the cockroach binary inside it, get a shell into the pod, and then start the built-in SQL client.

1. From your local workstation, use our [`client-secure.yaml`](https://github.com/cockroachdb/cockroach/blob/master/cloud/kubernetes/client-secure.yaml) file to launch a pod and keep it running indefinitely:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl create -f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/client-secure.yaml
    ~~~

    ~~~
    pod "cockroachdb-client-secure" created
    ~~~

    The pod uses the `root` client certificate created earlier to initialize the cluster, so there's no CSR approval required.

2. Get a shell into the pod and start the CockroachDB [built-in SQL client](use-the-built-in-sql-client.html):

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
    ~~~

    ~~~
    # Welcome to the cockroach SQL interface.
    # All statements must be terminated by a semicolon.
    # To exit: CTRL + D.
    #
    # Server version: CockroachDB CCL v1.1.2 (linux amd64, built 2017/11/02 19:32:03, go1.8.3) (same version as client)
    # Cluster ID: 3292fe08-939f-4638-b8dd-848074611dba
    #
    # Enter \? for a brief introduction.
    #
    root@cockroachdb-public:26257/>
    ~~~

3. Run some basic [CockroachDB SQL statements](learn-cockroachdb-sql.html):

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE DATABASE bank;
    ~~~

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
    ~~~

    {% include copy-clipboard.html %}
    ~~~ sql
    > INSERT INTO bank.accounts VALUES (1, 1000.50);
    ~~~

    {% include copy-clipboard.html %}
    ~~~ sql
    > SELECT * FROM bank.accounts;
    ~~~

    ~~~
    +----+---------+
    | id | balance |
    +----+---------+
    |  1 |  1000.5 |
    +----+---------+
    (1 row)
    ~~~

4. [Create a user with a password](create-user.html#create-a-user-with-a-password):

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE USER roach WITH PASSWORD 'Q7gc8rEdS';
    ~~~

    You will need this username and password to access the Admin UI later.

5. Exit the SQL shell and pod:

    {% include copy-clipboard.html %}
    ~~~ sql
    > \q
    ~~~
1. From your local workstation, use our [`client-secure.yaml`](https://github.com/cockroachdb/cockroach/blob/master/cloud/kubernetes/client-secure.yaml) file to launch a pod and keep it running indefinitely.

    1. Download the file:

        {% include copy-clipboard.html %}
        ~~~ shell
        $ curl -O https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/client-secure.yaml
        ~~~

    2. In the file, change `serviceAccountName: cockroachdb` to `serviceAccountName: my-release-cockroachdb`.

    3. Use the file to launch a pod and keep it running indefinitely:

        {% include copy-clipboard.html %}
        ~~~ shell
        $ kubectl create -f client-secure.yaml
        ~~~

        ~~~
        pod "cockroachdb-client-secure" created
        ~~~

        The pod uses the `root` client certificate created earlier to initialize the cluster, so there's no CSR approval required.

2. Get a shell into the pod and start the CockroachDB [built-in SQL client](use-the-built-in-sql-client.html):

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl exec -it cockroachdb-client-secure -- ./cockroach sql --certs-dir=/cockroach-certs --host=my-release-cockroachdb-public
    ~~~

    ~~~
    # Welcome to the cockroach SQL interface.
    # All statements must be terminated by a semicolon.
    # To exit: CTRL + D.
    #
    # Server version: CockroachDB CCL v1.1.2 (linux amd64, built 2017/11/02 19:32:03, go1.8.3) (same version as client)
    # Cluster ID: 3292fe08-939f-4638-b8dd-848074611dba
    #
    # Enter \? for a brief introduction.
    #
    root@my-release-cockroachdb-public:26257/>
    ~~~

3. Run some basic [CockroachDB SQL statements](learn-cockroachdb-sql.html):

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE DATABASE bank;
    ~~~

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
    ~~~

    {% include copy-clipboard.html %}
    ~~~ sql
    > INSERT INTO bank.accounts VALUES (1, 1000.50);
    ~~~

    {% include copy-clipboard.html %}
    ~~~ sql
    > SELECT * FROM bank.accounts;
    ~~~

    ~~~
    +----+---------+
    | id | balance |
    +----+---------+
    |  1 |  1000.5 |
    +----+---------+
    (1 row)
    ~~~

4. [Create a user with a password](create-user.html#create-a-user-with-a-password):

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE USER roach WITH PASSWORD 'Q7gc8rEdS';
    ~~~

    You will need this username and password to access the Admin UI later.

5. Exit the SQL shell and pod:

    {% include copy-clipboard.html %}
    ~~~ sql
    > \q
    ~~~

{{site.data.alerts.callout_success}} This pod will continue running indefinitely, so any time you need to reopen the built-in SQL client or run any other cockroach client commands (e.g., cockroach node), repeat step 2 using the appropriate cockroach command.

If you'd prefer to delete the pod and recreate it when needed, run kubectl delete pod cockroachdb-client-secure. {{site.data.alerts.end}}
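
For example, to reuse the pod for a client command other than the SQL shell, swap in the command you need (shown here with `cockroach node status`; if you deployed with Helm, use `--host=my-release-cockroachdb-public`):

{% include copy-clipboard.html %}
~~~ shell
# Reuse the long-running client pod to check node status:
$ kubectl exec -it cockroachdb-client-secure -- ./cockroach node status --certs-dir=/cockroach-certs --host=cockroachdb-public
~~~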

Step 4. Access the Admin UI

To access the cluster's Admin UI:

  1. Port-forward from your local machine to one of the pods:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl port-forward cockroachdb-0 8080
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl port-forward my-release-cockroachdb-0 8080
    ~~~

    ~~~
    Forwarding from 127.0.0.1:8080 -> 8080
    ~~~

    {{site.data.alerts.callout_info}}The port-forward command must be run on the same machine as the web browser in which you want to view the Admin UI. If you have been running these commands from a cloud instance or other non-local shell, you will not be able to view the UI without configuring kubectl locally and running the above port-forward command on your local machine.{{site.data.alerts.end}}

  1. Go to https://localhost:8080 and log in with the username and password you created earlier.
  1. In the UI, verify that the cluster is running as expected:
    - Click View nodes list on the right to ensure that all nodes successfully joined the cluster.
    - Click the Databases tab on the left to verify that `bank` is listed.

Step 5. Simulate node failure

Based on the replicas: 3 line in the StatefulSet configuration, Kubernetes ensures that three pods/nodes are running at all times. When a pod/node fails, Kubernetes automatically creates another pod/node with the same network identity and persistent storage.
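
You can confirm this setting on the live StatefulSet before killing anything (use `my-release-cockroachdb` as the name if you deployed with Helm):

{% include copy-clipboard.html %}
~~~ shell
# Show the StatefulSet's desired and current replica counts:
$ kubectl get statefulset cockroachdb
~~~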

To see this in action:

  1. Kill one of the CockroachDB nodes:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl delete pod cockroachdb-2
    ~~~

    ~~~
    pod "cockroachdb-2" deleted
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl delete pod my-release-cockroachdb-2
    ~~~

    ~~~
    pod "my-release-cockroachdb-2" deleted
    ~~~
  2. In the Admin UI, the Cluster Overview will soon show one node as Suspect. As Kubernetes auto-restarts the node, watch how the node once again becomes healthy.

  3. Back in the terminal, verify that the pod was automatically restarted:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl get pod cockroachdb-2
    ~~~

    ~~~
    NAME            READY     STATUS    RESTARTS   AGE
    cockroachdb-2   1/1       Running   0          12s
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl get pod my-release-cockroachdb-2
    ~~~

    ~~~
    NAME                       READY     STATUS    RESTARTS   AGE
    my-release-cockroachdb-2   1/1       Running   0          44s
    ~~~
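
To watch the delete-and-recreate cycle as it happens, you can leave a watch running in a second terminal before deleting the pod (a standard kubectl flag; press CTRL + C to stop it):

{% include copy-clipboard.html %}
~~~ shell
# Stream pod status changes as Kubernetes replaces the deleted pod:
$ kubectl get pods --watch
~~~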

Step 6. Add nodes

  1. Use the kubectl scale command to add a pod for another CockroachDB node:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl scale statefulset cockroachdb --replicas=4
    ~~~

    ~~~
    statefulset "cockroachdb" scaled
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl scale statefulset my-release-cockroachdb --replicas=4
    ~~~

    ~~~
    statefulset "my-release-cockroachdb" scaled
    ~~~
  2. Verify that the pod for a fourth node, `cockroachdb-3` (or `my-release-cockroachdb-3` if you deployed with Helm), was added successfully:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl get pods
    ~~~

    ~~~
    NAME                      READY     STATUS    RESTARTS   AGE
    cockroachdb-0             1/1       Running   0          28m
    cockroachdb-1             1/1       Running   0          27m
    cockroachdb-2             1/1       Running   0          10m
    cockroachdb-3             1/1       Running   0          5s
    example-545f866f5-2gsrs   1/1       Running   0          25m
    ~~~

    ~~~
    NAME                       READY     STATUS    RESTARTS   AGE
    my-release-cockroachdb-0   1/1       Running   0          28m
    my-release-cockroachdb-1   1/1       Running   0          27m
    my-release-cockroachdb-2   1/1       Running   0          10m
    my-release-cockroachdb-3   1/1       Running   0          5s
    example-545f866f5-2gsrs    1/1       Running   0          25m
    ~~~

Step 7. Remove nodes

To safely remove a node from your cluster, you must first decommission the node and only then adjust the --replicas value of your StatefulSet configuration to permanently remove it. This sequence is important because the decommissioning process lets a node finish in-flight requests, rejects any new requests, and transfers all range replicas and range leases off the node.

{{site.data.alerts.callout_danger}} If you remove nodes without first telling CockroachDB to decommission them, you may cause data or even cluster unavailability. For more details about how this works and what to consider before removing nodes, see Decommission Nodes. {{site.data.alerts.end}}

  1. Get a shell into the cockroachdb-client-secure pod you created earlier and use the cockroach node status command to get the internal IDs of nodes:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl exec -it cockroachdb-client-secure -- ./cockroach node status --certs-dir=/cockroach-certs --host=cockroachdb-public
    ~~~

    ~~~
      id |                          address                          | build  |            started_at            |            updated_at            | is_available | is_live
    +----+-----------------------------------------------------------+--------+----------------------------------+----------------------------------+--------------+---------+
       1 | cockroachdb-0.cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true         | true
       2 | cockroachdb-2.cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true         | true
       3 | cockroachdb-1.cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true         | true
       4 | cockroachdb-3.cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true         | true
    (4 rows)
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl exec -it cockroachdb-client-secure -- ./cockroach node status --certs-dir=/cockroach-certs --host=my-release-cockroachdb-public
    ~~~

    ~~~
      id |                                     address                                      | build  |            started_at            |            updated_at            | is_available | is_live
    +----+----------------------------------------------------------------------------------+--------+----------------------------------+----------------------------------+--------------+---------+
       1 | my-release-cockroachdb-0.my-release-cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true         | true
       2 | my-release-cockroachdb-2.my-release-cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true         | true
       3 | my-release-cockroachdb-1.my-release-cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true         | true
       4 | my-release-cockroachdb-3.my-release-cockroachdb.default.svc.cluster.local:26257 | v2.1.1 | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true         | true
    (4 rows)
    ~~~


  2. Note the ID of the node with the highest number in its address (in this case, node `4`, whose address includes `cockroachdb-3`) and use the `cockroach node decommission` command to decommission it:

    {{site.data.alerts.callout_info}} It's important to decommission the node with the highest number in its address because, when you reduce the `--replicas` count, Kubernetes will remove the pod for that node. {{site.data.alerts.end}}

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl exec -it cockroachdb-client-secure -- ./cockroach node decommission 4 --certs-dir=/cockroach-certs --host=cockroachdb-public
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl exec -it cockroachdb-client-secure -- ./cockroach node decommission 4 --certs-dir=/cockroach-certs --host=my-release-cockroachdb-public
    ~~~

    You'll then see the decommissioning status print to stderr as it changes:

    ~~~
     id | is_live | replicas | is_decommissioning | is_draining
    +---+---------+----------+--------------------+-------------+
      4 |  true   |       73 |        true        |    false
    (1 row)
    ~~~

    Once the node has been fully decommissioned and stopped, you'll see a confirmation:

    ~~~
     id | is_live | replicas | is_decommissioning | is_draining
    +---+---------+----------+--------------------+-------------+
      4 |  true   |        0 |        true        |    false
    (1 row)

    No more data reported on target nodes. Please verify cluster health before removing the nodes.
    ~~~
  3. Once the node has been decommissioned, use the kubectl scale command to remove a pod from your StatefulSet:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl scale statefulset cockroachdb --replicas=3
    ~~~

    ~~~
    statefulset "cockroachdb" scaled
    ~~~

    {% include copy-clipboard.html %}
    ~~~ shell
    $ kubectl scale statefulset my-release-cockroachdb --replicas=3
    ~~~

    ~~~
    statefulset "my-release-cockroachdb" scaled
    ~~~
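
    As in Step 6, you can confirm that Kubernetes removed the highest-numbered pod (`cockroachdb-3`, or `my-release-cockroachdb-3` for Helm):

    {% include copy-clipboard.html %}
    ~~~ shell
    # Confirm that only three CockroachDB pods remain:
    $ kubectl get pods
    ~~~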

Step 8. Stop the cluster

  - If you plan to restart the cluster, use the `minikube stop` command. This shuts down the minikube virtual machine but preserves all the resources you created:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ minikube stop
    ~~~

    ~~~
    Stopping local Kubernetes cluster...
    Machine stopped.
    ~~~

    You can restore the cluster to its previous state with `minikube start`.

  - If you do not plan to restart the cluster, use the `minikube delete` command. This shuts down and deletes the minikube virtual machine and all the resources you created, including persistent volumes:

    {% include copy-clipboard.html %}
    ~~~ shell
    $ minikube delete
    ~~~

    ~~~
    Deleting local Kubernetes cluster...
    Machine deleted.
    ~~~

    {{site.data.alerts.callout_success}}To retain logs, copy them from each pod's stderr before deleting the cluster and all its resources. To access a pod's standard error stream, run kubectl logs <podname>.{{site.data.alerts.end}}

See also

Explore other core CockroachDB benefits and features.

You might also want to learn how to orchestrate a production deployment of CockroachDB with Kubernetes.