Persistent Storage Using Ceph RADOS Block Device (RBD)
OKD clusters can be provisioned with persistent storage using Ceph RBD.
Persistent volumes (PVs) and persistent volume claims (PVCs) can share volumes across a single project. While the Ceph RBD-specific information contained in a PV definition could also be defined directly in a pod definition, doing so does not create the volume as a distinct cluster resource, making the volume more susceptible to conflicts.
Project and namespace are used interchangeably throughout this document. See Projects and Users for details on the relationship.
High-availability of storage in the infrastructure is left to the underlying storage provider.
To provision Ceph volumes, the following are required:
An existing storage device in your underlying infrastructure.
The Ceph key to be used in an OKD secret object.
The Ceph image name.
The file system type on top of the block storage (e.g., ext4).
ceph-common installed on each schedulable OKD node in your cluster:
# yum install ceph-common
Define the authorization key in a secret configuration, which is then converted to base64 for use by OKD.
In order to use Ceph storage to back a persistent volume, the secret must be created in the same project as the PVC and pod. The secret cannot simply be in the default project.
Run ceph auth get-key on a Ceph MON node to display the key value for the Ceph user (the PV example in this topic uses the admin user):
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: QVFBOFF2SlZheUJQRVJBQWgvS2cwT1laQUhPQno3akZwekxxdGc9PQ==
type: kubernetes.io/rbd
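As a minimal sketch of the encoding step, the following shell commands base64-encode a raw key and write the secret definition to a file. The raw key shown here is simply the decoded form of the example value above, not a live credential; in practice it would come from ceph auth get-key on a MON node. The variable names are illustrative assumptions.

```shell
# Raw Ceph key as printed by `ceph auth get-key`. This sample value is the
# decoded form of the example key in this topic, not a real credential.
raw_key='AQA8QvJVayBPERAAh/Kg0OYZAHOBz7jFpzLqtg=='

# Encode without a trailing newline so the secret holds the exact key bytes.
encoded_key=$(printf '%s' "$raw_key" | base64)

# Write the secret definition with the encoded key substituted in.
cat > ceph-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: ${encoded_key}
type: kubernetes.io/rbd
EOF
```

Using printf rather than echo avoids encoding a trailing newline as part of the key.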
Save the secret definition to a file, for example ceph-secret.yaml, then create the secret:
$ oc create -f ceph-secret.yaml
Verify that the secret was created:
# oc get secret ceph-secret
NAME          TYPE                DATA      AGE
ceph-secret   kubernetes.io/rbd   1         23d
Developers request Ceph RBD storage by referencing either a PVC or the Ceph RBD volume plug-in directly in the volumes section of a pod specification. A PVC exists only in the user's namespace and can be referenced only by pods within that same namespace. Any attempt to access a PV from a different namespace causes the pod to fail.
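For illustration, a pod that requests the storage through a PVC might be written out as follows. This is a sketch under stated assumptions: the pod name, container name, image, and mount path are hypothetical placeholders, and ceph-claim is the claim name used in the PVC example in this topic.

```shell
# Write a pod definition that mounts the claim by name.
# The pod name, container name, image, and mount path are hypothetical.
cat > ceph-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pod
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest
      volumeMounts:
        - name: ceph-vol
          mountPath: /mnt/ceph
  volumes:
    - name: ceph-vol
      persistentVolumeClaim:
        claimName: ceph-claim
EOF
```

The file can then be created with oc create -f ceph-pod.yaml in the same project as the claim.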
Define the PV in an object definition before creating it in OKD:
Example 1. Persistent Volume Object Definition Using Ceph RBD
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-pv (1)
spec:
  capacity:
    storage: 2Gi (2)
  accessModes:
    - ReadWriteOnce (3)
  rbd: (4)
    monitors: (5)
      - 192.168.122.133:6789
    pool: rbd
    image: ceph-image
    user: admin
    secretRef:
      name: ceph-secret (6)
    fsType: ext4 (7)
    readOnly: false
  persistentVolumeReclaimPolicy: Retain
1 The name of the PV that is referenced in pod definitions or displayed in command output (for example, oc get pv).
2 The amount of storage allocated to this volume.
3 accessModes are used as labels to match a PV and a PVC. They currently do not define any form of access control. All block storage is defined to be single user (non-shared storage).
4 The volume type being used, in this case the rbd plug-in.
5 An array of Ceph monitor IP addresses and ports.
6 The Ceph secret used to create a secure connection from OKD to the Ceph server.
7 The file system type mounted on the Ceph RBD block device.
Changing the value of the fsType parameter after the volume has been formatted and provisioned can result in data loss and pod failure.
Save your definition to a file, for example ceph-pv.yaml, and create the PV:
# oc create -f ceph-pv.yaml
Verify that the persistent volume was created:
# oc get pv
NAME      LABELS    CAPACITY     ACCESSMODES   STATUS      CLAIM     REASON    AGE
ceph-pv   <none>    2147483648   RWO           Available                       2s
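The CAPACITY column in this output is reported in bytes. A quick shell check of the conversion, assuming binary units (1Gi = 1024 * 1024 * 1024 bytes) and an illustrative variable name:

```shell
# 2Gi expressed in bytes: 2 * 1024 * 1024 * 1024
capacity_bytes=$((2 * 1024 * 1024 * 1024))
echo "$capacity_bytes"   # prints 2147483648
```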
Create a PVC that will bind to the new PV:
Example 2. PVC Object Definition
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-claim
spec:
  accessModes: (1)
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi (2)
1 accessModes do not enforce access rights, but instead act as labels to match a PV to a PVC.
2 This claim looks for PVs offering 2Gi or greater capacity.
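The two matching criteria described here can be modeled with a small shell function. This is a deliberately simplified sketch, not the actual binding logic in OKD, which considers more than these two fields; the function name and its arguments are hypothetical, and sizes are given in bytes.

```shell
# Simplified model: a PV satisfies a claim when the access mode label
# matches and the PV offers at least the requested capacity (in bytes).
pv_satisfies_claim() {
  pv_mode=$1
  pv_bytes=$2
  claim_mode=$3
  claim_bytes=$4
  [ "$pv_mode" = "$claim_mode" ] && [ "$pv_bytes" -ge "$claim_bytes" ]
}

# A 2Gi ReadWriteOnce PV against a 2Gi ReadWriteOnce claim.
if pv_satisfies_claim ReadWriteOnce 2147483648 ReadWriteOnce 2147483648; then
  match_result="match"
else
  match_result="no match"
fi
echo "$match_result"
```

Under this model, a 1Gi PV would not satisfy the same 2Gi claim, because its capacity is below the request.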
Save the definition to a file, for example ceph-claim.yaml, and create the PVC:
# oc create -f ceph-claim.yaml
See the full Volume Security topic before implementing Ceph RBD volumes.
A significant difference between shared volumes (NFS and GlusterFS) and block volumes (Ceph RBD, iSCSI, and most cloud storage) is that the user and group IDs defined in the pod definition or container image are applied to the target physical storage. This is referred to as managing ownership of the block device. For example, if the Ceph RBD mount has its owner set to 123 and its group ID set to 567, and the pod defines its runAsUser as 222 and its fsGroup as 7777, then the ownership of the Ceph RBD physical mount is changed to 222:7777.
Even if the user and group IDs are not defined in the pod specification, the resulting pod may have defaults defined for these IDs based on its matching SCC, or its project. See the full Volume Security topic which covers storage aspects of SCCs and defaults in greater detail.
A pod defines the group ownership of a Ceph RBD volume using the fsGroup stanza under the pod's securityContext definition:
spec:
  containers:
    - name:
    ...
  securityContext: (1)
    fsGroup: 7777 (2)
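1 The securityContext that carries fsGroup is defined at the pod level, not under a specific container.
2 All containers in the pod will have the same fsGroup ID.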