Configuring the cluster auto-scaler in AWS
About the OKD auto-scaler
The auto-scaler in OKD repeatedly checks to see how many pods are pending node allocation. If pods are pending allocation and the auto-scaler has not met its maximum capacity, then new nodes are continuously provisioned to accommodate the current demand. When demand drops and fewer nodes are required, the auto-scaler removes unused nodes. After you install the auto-scaler, its behavior is automatic. You only need to add the desired number of replicas to the deployment.
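The scale-up check described above amounts to counting pods stuck in Pending. A minimal sketch of that check, run here against captured sample output rather than a live cluster (the pod names and the awk filter are illustrative, not part of the auto-scaler itself):

```shell
# Count pods that are pending node allocation. The sample text stands in for
# real `oc get pods --all-namespaces` output; on a live cluster you would
# pipe the command itself into the same filter.
sample='NAMESPACE   NAME      READY   STATUS    RESTARTS   AGE
demo        app-1     1/1     Running   0          5m
demo        app-2     0/1     Pending   0          1m
demo        app-3     0/1     Pending   0          1m'

pending=$(printf '%s\n' "$sample" | awk '$4 == "Pending" { n++ } END { print n + 0 }')
echo "pending pods: $pending"

# A non-zero count is the condition that makes the auto-scaler add nodes,
# capacity permitting.
if [ "$pending" -gt 0 ]; then
  echo "scale-up needed"
fi
```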
In the current version of OKD, you can deploy the auto-scaler only on Amazon Web Services (AWS). The auto-scaler uses some standard AWS objects to manage your cluster size, including Auto Scaling groups and Launch Configurations.
The auto-scaler uses the following assets:

- Auto Scaling groups
  An Auto Scaling group is a logical representation of a set of machines. You configure an Auto Scaling group with a minimum number of instances to run, the maximum number of instances that can run, and your desired number of instances to run. An Auto Scaling group starts by launching enough instances to meet your desired capacity. You can configure an Auto Scaling group to start with zero instances.
- Launch Configurations
  A Launch Configuration is a template that an Auto Scaling group uses to launch instances. When you create a Launch Configuration, you specify information such as:
  - The ID of the Amazon Machine Image (AMI) to use as the base image
  - The instance type, such as m4.large
  - A key pair
  - One or more security groups
  - The subnets to apply the Launch Configuration to
- OKD primed images
  When the Auto Scaling group provisions a new instance, the image that it launches must have OKD already prepared. The Auto Scaling group uses this image to both automatically bootstrap the node and enroll it in the cluster without any manual intervention.
Creating a primed image
You can use Ansible playbooks to automatically create a primed image for the auto-scaler to use. You must provide attributes from your existing Amazon Web Services (AWS) cluster.
If you already have a primed image, you can use it instead of creating a new one.
On the host that you used to create your OKD cluster, create a primed image:

- Create a new Ansible inventory file on your local host:

    [OSEv3:children]
    masters
    nodes
    etcd

    [OSEv3:vars]
    openshift_deployment_type=origin
    ansible_ssh_user=ec2-user
    openshift_clusterid=mycluster
    ansible_become=yes

    [masters]
    [etcd]
    [nodes]
- Create a provisioning file, build-ami-provisioning-vars.yaml, on your local host:

    openshift_deployment_type: origin
    openshift_aws_clusterid: mycluster (1)
    openshift_aws_region: us-east-1 (2)
    openshift_aws_create_vpc: false (3)
    openshift_aws_vpc_name: production (4)
    openshift_aws_subnet_az: us-east-1d (5)
    openshift_aws_create_security_groups: false (6)
    openshift_aws_ssh_key_name: production-ssh-key (7)
    openshift_aws_base_ami: ami-12345678 (8)
    openshift_aws_create_s3: False (9)
    openshift_aws_build_ami_group: default (10)
    openshift_aws_vpc: (11)
      name: "{{ openshift_aws_vpc_name }}"
      cidr: 172.18.0.0/16
      subnets:
        us-east-1:
        - cidr: 172.18.0.0/20
          az: "us-east-1d"
    container_runtime_docker_storage_type: overlay2 (12)
    container_runtime_docker_storage_setup_device: /dev/xvdb (13)

  (1) Provide the name of the existing cluster.
  (2) Provide the region the existing cluster is currently running in.
  (3) Specify False to disable the creation of a VPC.
  (4) Provide the existing VPC name that the cluster is running in.
  (5) Provide the name of a subnet the existing cluster is running in.
  (6) Specify False to disable the creation of security groups.
  (7) Provide the AWS key name to use for SSH access.
  (8) Provide the AMI image ID to use as the base image for the primed image. See Red Hat® Cloud Access.
  (9) Specify False to disable the creation of an S3 bucket.
  (10) Provide the security group name.
  (11) Provide the VPC subnets the existing cluster is running in.
  (12) Specify overlay2 as the Docker storage type.
  (13) Specify the mount point for LVM and the /var/lib/docker directory.
- Run the build_ami.yml playbook to generate a primed image:

    # ansible-playbook -i </path/to/inventory/file> \
        ~/openshift-ansible/playbooks/aws/openshift-cluster/build_ami.yml \
        -e @build-ami-provisioning-vars.yaml
After the playbook runs, you see a new image ID, or AMI, in its output. You specify the AMI that it generated when you create the Launch Configuration.
Creating the launch configuration and Auto Scaling group
Before you deploy the cluster auto-scaler, you must create an Amazon Web Services (AWS) launch configuration and Auto Scaling group that reference a primed image. You must configure the launch configuration so that the new node automatically joins the existing cluster when it starts.
- Install an OKD cluster in AWS.
- Create a primed image.
- If you deployed the EFK stack in your cluster, set the node label logging-infra-fluentd=true.
- Create the bootstrap.kubeconfig file by generating it from a master node:

    $ ssh master "sudo oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper" > ~/bootstrap.kubeconfig
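Before embedding the generated kubeconfig in cloud-init, a quick structural check can catch an empty or truncated file. A sketch using a minimal stand-in file, since the real one is produced on the master (the field list checked here is an assumption about what a service-account kubeconfig contains):

```shell
# Minimal stand-in for the generated bootstrap.kubeconfig; the real file is
# produced by `oc serviceaccounts create-kubeconfig` on the master.
cat <<'EOF' > bootstrap.kubeconfig.sample
apiVersion: v1
kind: Config
clusters: []
contexts: []
current-context: node-bootstrapper
users:
- name: node-bootstrapper
  user:
    token: sample-token
EOF

# Check for the fields a usable kubeconfig needs before shipping it to nodes.
for field in 'kind: Config' 'current-context:' 'token:'; do
  grep -q "$field" bootstrap.kubeconfig.sample || { echo "missing: $field"; exit 1; }
done
echo "kubeconfig structure looks sane"
```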
- Create the user-data.txt cloud-init file from the bootstrap.kubeconfig file:

    $ cat <<EOF > user-data.txt
    #cloud-config
    write_files:
    - path: /root/openshift_bootstrap/openshift_settings.yaml
      owner: 'root:root'
      permissions: '0640'
      content: |
        openshift_node_config_name: node-config-compute
    - path: /etc/origin/node/bootstrap.kubeconfig
      owner: 'root:root'
      permissions: '0640'
      encoding: b64
      content: |
          $(base64 ~/bootstrap.kubeconfig | sed '2,$s/^/      /')
    runcmd:
    - [ ansible-playbook, /root/openshift_bootstrap/bootstrap.yml]
    - [ systemctl, restart, systemd-hostnamed]
    - [ systemctl, restart, NetworkManager]
    - [ systemctl, enable, origin-node]
    - [ systemctl, start, origin-node]
    EOF
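The `sed '2,$s/^/      /'` expression in the step above is what keeps the embedded base64 valid YAML: every line after the first is indented to stay inside the `content: |` block scalar. A small demonstration on stand-in data:

```shell
# Indent every line after the first by six spaces, matching the indentation
# of the `content: |` block in the cloud-init template. The input here is
# stand-in text; in the real step it is base64 output.
indented=$(printf 'line1\nline2\nline3\n' | sed '2,$s/^/      /')
echo "$indented"
```

The first line is left unindented because it continues directly after the YAML key on the template side; only the continuation lines need the extra leading spaces.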
- Upload a launch configuration template to an AWS S3 bucket.
- Create the launch configuration by using the AWS CLI:

    $ aws autoscaling create-launch-configuration \
        --launch-configuration-name mycluster-LC \ (1)
        --region us-east-1 \ (2)
        --image-id ami-987654321 \ (3)
        --instance-type m4.large \ (4)
        --security-groups sg-12345678 \ (5)
        --template-url https://s3-.amazonaws.com/.../yourtemplate.json \ (6)
        --key-name production-key (7)

  (1) Specify a launch configuration name.
  (2) Specify the region to launch the image in.
  (3) Specify the primed image AMI that you created.
  (4) Specify the type of instance to launch.
  (5) Specify the security groups to attach to the launched image.
  (6) Specify the launch configuration template that you uploaded.
  (7) Specify the SSH key-pair name.

  If your template is fewer than 16 KB before you encode it, you can provide it with the AWS CLI by substituting --user-data for --template-url.
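The 16 KB cut-off in the note above can be checked up front. A sketch that decides between the two flags based on the unencoded file size (yourtemplate.json is a placeholder name; a generated throwaway file stands in for the real template here):

```shell
# Decide between --user-data and --template-url based on the unencoded size.
# A generated 2 KB file stands in for the real template.
template=yourtemplate.json
head -c 2048 /dev/zero > "$template"

size=$(wc -c < "$template")
limit=$((16 * 1024))   # 16 KB, per the note above
if [ "$size" -lt "$limit" ]; then
  echo "inline: pass the file with --user-data"
else
  echo "too large: upload to S3 and use --template-url"
fi
```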
Create the Auto Scaling group by using the AWS CLI:
$ aws autoscaling create-auto-scaling-group \ --auto-scaling-group-name mycluster-ASG \ (1) --launch-configuration-name mycluster-LC \ (2) --min-size 0 \ (3) --max-size 6 \ (4) --vpc-zone-identifier subnet-12345678 \ (5) --tags ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=Name,Value=mycluster-ASG-node,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/mycluster,Value=true,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/node-role.kubernetes.io/compute,Value=true,PropagateAtLaunch=true (6)
1 Specify the name of the Auto Scaling group, which you use when you deploy the auto-scaler deployment 2 Specify the name of the Launch Configuration that you created. 3 Specify the minimum number of nodes that the auto-scaler maintains. 4 Specify the maximum number of nodes the scale group can expand to. 5 Specify the VPC subnet-id, which is the same subnet that the cluster uses. 6 Specify this string to ensure that Auto Scaling group tags are propagated to the nodes when they launch.
Deploying the auto-scaler components on your cluster
After you create the Launch Configuration and Auto Scaling group, you can deploy the auto-scaler components onto the cluster.
- Install an OKD cluster in AWS.
- Create a primed image.
- Create a launch configuration and Auto Scaling group that reference the primed image.
To deploy the auto-scaler:
- Update your cluster to run the auto-scaler:
  - Add the following parameter to the inventory file that you used to create the cluster, by default /etc/ansible/hosts:

      openshift_master_bootstrap_auto_approve=true
  - To obtain the auto-scaler components, change to the playbook directory and run the playbook again:

      $ cd /usr/share/ansible/openshift-ansible
      $ ansible-playbook -i </path/to/inventory/file> \
          playbooks/deploy_cluster.yml
  - Confirm that the bootstrap-autoapprover pod is running:

      $ oc get pods --all-namespaces | grep bootstrap-autoapprover
      NAMESPACE         NAME                       READY     STATUS    RESTARTS   AGE
      openshift-infra   bootstrap-autoapprover-0   1/1       Running   0
- Create a namespace for the auto-scaler:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: cluster-autoscaler
      annotations:
        openshift.io/node-selector: ""
    EOF
- Create a service account for the auto-scaler:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        k8s-addon: cluster-autoscaler.addons.k8s.io
        k8s-app: cluster-autoscaler
      name: cluster-autoscaler
      namespace: cluster-autoscaler
    EOF
- Create a cluster role to grant the required permissions to the service account:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: v1
    kind: ClusterRole
    metadata:
      name: cluster-autoscaler
    rules:
    - apiGroups: (1)
      - ""
      resources:
      - pods/eviction
      verbs:
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - persistentvolumeclaims
      - persistentvolumes
      - pods
      - replicationcontrollers
      - services
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - get
      - list
      - watch
      - patch
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - nodes
      verbs:
      - get
      - list
      - watch
      - patch
      - update
      attributeRestrictions: null
    - apiGroups:
      - extensions
      - apps
      resources:
      - daemonsets
      - replicasets
      - statefulsets
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    - apiGroups:
      - policy
      resources:
      - poddisruptionbudgets
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    EOF

  (1) If the cluster-autoscaler object exists, ensure that the pods/eviction rule exists with the verb create.
- Create a role for the deployment auto-scaler:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: v1
    kind: Role
    metadata:
      name: cluster-autoscaler
    rules:
    - apiGroups:
      - ""
      resources:
      - configmaps
      resourceNames:
      - cluster-autoscaler
      - cluster-autoscaler-status
      verbs:
      - create
      - get
      - patch
      - update
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - configmaps
      verbs:
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - create
      attributeRestrictions: null
    EOF
- Create a creds file to store AWS credentials for the auto-scaler:

    cat <<EOF > creds
    [default]
    aws_access_key_id = your-aws-access-key-id
    aws_secret_access_key = your-aws-secret-access-key
    EOF

  The auto-scaler uses these credentials to launch new instances.
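The creds file follows the AWS shared-credentials INI layout that the SDK inside the auto-scaler reads. A quick well-formedness check on the file from the step above (the key values are the same placeholders):

```shell
# Recreate the creds file from the step above (placeholder values) and
# verify both required keys appear under the [default] profile.
cat <<EOF > creds
[default]
aws_access_key_id = your-aws-access-key-id
aws_secret_access_key = your-aws-secret-access-key
EOF

for key in aws_access_key_id aws_secret_access_key; do
  grep -q "^$key" creds || { echo "missing $key"; exit 1; }
done
grep -q '^\[default\]' creds || { echo "missing [default] profile"; exit 1; }
echo "creds file looks well-formed"
```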
- Create a secret that contains the AWS credentials:

    $ oc create secret -n cluster-autoscaler generic autoscaler-credentials --from-file=creds

  The auto-scaler uses this secret to launch instances within AWS.
- Grant the cluster-autoscaler and cluster-reader roles to the cluster-autoscaler service account that you created:

    $ oc adm policy add-cluster-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
    $ oc adm policy add-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler --role-namespace cluster-autoscaler -n cluster-autoscaler
    $ oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
- Deploy the cluster auto-scaler:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: cluster-autoscaler
      name: cluster-autoscaler
      namespace: cluster-autoscaler
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: cluster-autoscaler
          role: infra
      template:
        metadata:
          labels:
            app: cluster-autoscaler
            role: infra
        spec:
          containers:
          - args:
            - /bin/cluster-autoscaler
            - --alsologtostderr
            - --v=4
            - --skip-nodes-with-local-storage=False
            - --leader-elect-resource-lock=configmaps
            - --namespace=cluster-autoscaler
            - --cloud-provider=aws
            - --nodes=0:6:mycluster-ASG
            env:
            - name: AWS_REGION
              value: us-east-1
            - name: AWS_SHARED_CREDENTIALS_FILE
              value: /var/run/secrets/aws-creds/creds
            image: docker.io/openshift/origin-cluster-autoscaler:v3.11.0
            name: autoscaler
            volumeMounts:
            - mountPath: /var/run/secrets/aws-creds
              name: aws-creds
              readOnly: true
          dnsPolicy: ClusterFirst
          nodeSelector:
            node-role.kubernetes.io/infra: "true"
          serviceAccountName: cluster-autoscaler
          terminationGracePeriodSeconds: 30
          volumes:
          - name: aws-creds
            secret:
              defaultMode: 420
              secretName: autoscaler-credentials
    EOF
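The --nodes=0:6:mycluster-ASG argument in the deployment is the link back to AWS: its min, max, and name fields must agree with the --min-size, --max-size, and name you gave the Auto Scaling group. A sketch of pulling those fields apart for comparison:

```shell
# Split the auto-scaler's --nodes flag into its min:max:name fields so they
# can be checked against the Auto Scaling group settings.
nodes_flag='--nodes=0:6:mycluster-ASG'

value=${nodes_flag#--nodes=}   # strip the flag prefix
min=${value%%:*}               # first field: minimum node count
rest=${value#*:}
max=${rest%%:*}                # second field: maximum node count
asg=${rest#*:}                 # third field: Auto Scaling group name

echo "min=$min max=$max asg=$asg"
# These should match the earlier step: --min-size 0, --max-size 6, mycluster-ASG.
```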
Testing the auto-scaler
After you add the auto-scaler to your Amazon Web Services (AWS) cluster, you can confirm that the auto-scaler works by deploying more pods than the current nodes can run.
- You added the auto-scaler to your OKD cluster that runs on AWS.
- Create the scale-up.yaml file that contains the deployment configuration to test auto-scaling:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: scale-up
      labels:
        app: scale-up
    spec:
      replicas: 20 (1)
      selector:
        matchLabels:
          app: scale-up
      template:
        metadata:
          labels:
            app: scale-up
        spec:
          containers:
          - name: origin-base
            image: openshift/origin-base
            resources:
              requests:
                memory: 2Gi
            command:
            - /bin/sh
            - "-c"
            - "echo 'this should be in the logs' && sleep 86400"
          terminationGracePeriodSeconds: 0

  (1) This deployment specifies 20 replicas, but the initial size of the cluster cannot run all of the pods without first increasing the number of compute nodes.
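The scheduling pressure comes from the memory requests: 20 replicas at 2 GiB each must exceed what the initial nodes can allocate. A rough sketch of that arithmetic, assuming three initial compute nodes of 8 GiB each (the node count and per-node memory are illustrative assumptions, loosely matching the m4.large instance type used earlier; real allocatable memory is somewhat lower):

```shell
# Rough capacity check: total requested memory vs. available node memory.
# Node count (3) and per-node memory (8 GiB) are assumptions for illustration.
replicas=20
request_gib=2   # from the 2Gi memory request in scale-up.yaml
nodes=3
node_gib=8

needed=$((replicas * request_gib))
available=$((nodes * node_gib))
echo "needed=${needed}Gi available=${available}Gi"
if [ "$needed" -gt "$available" ]; then
  echo "some pods will stay Pending until the auto-scaler adds nodes"
fi
```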
- Create a namespace for the deployment:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: autoscaler-demo
    EOF
- Deploy the configuration:

    $ oc apply -n autoscaler-demo -f scale-up.yaml
- View the pods in your namespace:
  - View the running pods in your namespace:

      $ oc get pods -n autoscaler-demo | grep Running
      cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
      scale-up-79684ff956-45sbg             1/1       Running   0          31s
      scale-up-79684ff956-4kzjv             1/1       Running   0          31s
      scale-up-79684ff956-859d2             1/1       Running   0          31s
      scale-up-79684ff956-h47gv             1/1       Running   0          31s
      scale-up-79684ff956-htjth             1/1       Running   0          31s
      scale-up-79684ff956-m996k             1/1       Running   0          31s
      scale-up-79684ff956-pvvrm             1/1       Running   0          31s
      scale-up-79684ff956-qs9pp             1/1       Running   0          31s
      scale-up-79684ff956-zwdpr             1/1       Running   0          31s
  - View the pending pods in your namespace:

      $ oc get pods -n autoscaler-demo | grep Pending
      scale-up-79684ff956-5jdnj             0/1       Pending   0          40s
      scale-up-79684ff956-794d6             0/1       Pending   0          40s
      scale-up-79684ff956-7rlm2             0/1       Pending   0          40s
      scale-up-79684ff956-9m2jc             0/1       Pending   0          40s
      scale-up-79684ff956-9m5fn             0/1       Pending   0          40s
      scale-up-79684ff956-fr62m             0/1       Pending   0          40s
      scale-up-79684ff956-q255w             0/1       Pending   0          40s
      scale-up-79684ff956-qc2cn             0/1       Pending   0          40s
      scale-up-79684ff956-qjn7z             0/1       Pending   0          40s
      scale-up-79684ff956-tdmqt             0/1       Pending   0          40s
      scale-up-79684ff956-xnjhw             0/1       Pending   0          40s

    These pending pods cannot run until the cluster auto-scaler automatically provisions new compute nodes to run them on. It can take several minutes for the nodes to reach a Ready state in the cluster.
- After several minutes, check the list of nodes to see if new nodes are ready:

    $ oc get nodes
    NAME                            STATUS    ROLES     AGE       VERSION
    ip-172-31-49-172.ec2.internal   Ready     infra     1d        v1.11.0+d4cacc0
    ip-172-31-53-217.ec2.internal   Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-55-89.ec2.internal    Ready     compute   9h        v1.11.0+d4cacc0
    ip-172-31-56-21.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-56-71.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-63-234.ec2.internal   Ready     master    1d        v1.11.0+d4cacc0
- When more nodes are ready, view the running pods in your namespace again:

    $ oc get pods -n autoscaler-demo
    NAME                                  READY     STATUS    RESTARTS   AGE
    cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
    scale-up-79684ff956-45sbg             1/1       Running   0          8m
    scale-up-79684ff956-4kzjv             1/1       Running   0          8m
    scale-up-79684ff956-5jdnj             1/1       Running   0          8m
    scale-up-79684ff956-794d6             1/1       Running   0          8m
    scale-up-79684ff956-7rlm2             1/1       Running   0          8m
    scale-up-79684ff956-859d2             1/1       Running   0          8m
    scale-up-79684ff956-9m2jc             1/1       Running   0          8m
    scale-up-79684ff956-9m5fn             1/1       Running   0          8m
    scale-up-79684ff956-fr62m             1/1       Running   0          8m
    scale-up-79684ff956-h47gv             1/1       Running   0          8m
    scale-up-79684ff956-htjth             1/1       Running   0          8m
    scale-up-79684ff956-m996k             1/1       Running   0          8m
    scale-up-79684ff956-pvvrm             1/1       Running   0          8m
    scale-up-79684ff956-q255w             1/1       Running   0          8m
    scale-up-79684ff956-qc2cn             1/1       Running   0          8m
    scale-up-79684ff956-qjn7z             1/1       Running   0          8m
    scale-up-79684ff956-qs9pp             1/1       Running   0          8m
    scale-up-79684ff956-tdmqt             1/1       Running   0          8m
    scale-up-79684ff956-xnjhw             1/1       Running   0          8m
    scale-up-79684ff956-zwdpr             1/1       Running   0          8m
    ...