Start Google Cloud GKE Cluster with GPUs for KubeRay
Contents
Start Google Cloud GKE Cluster with GPUs for KubeRay#
See https://cloud.google.com/kubernetes-engine/docs/how-to/gpus for full details, or continue reading for a quick start.
Step 1: Create a Kubernetes cluster on GKE#
Run this command and all following commands on your local machine or on the Google Cloud Shell. If running from your local machine, you need to install the Google Cloud SDK. The following command creates a Kubernetes cluster named kuberay-gpu-cluster with 1 CPU node in the us-west1-b zone. This example uses the e2-standard-4 machine type, which has 4 vCPUs and 16 GB RAM.
gcloud container clusters create kuberay-gpu-cluster \
--num-nodes=1 --min-nodes 0 --max-nodes 1 --enable-autoscaling \
--zone=us-west1-b --machine-type e2-standard-4
Note: You can also create a cluster from the Google Cloud Console.
Step 2: Create a GPU node pool#
Run the following command to create a GPU node pool for Ray GPU workers. You can also create it from the Google Cloud Console: https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#console
gcloud container node-pools create gpu-node-pool \
--accelerator type=nvidia-l4-vws,count=1,gpu-driver-version=default \
--zone us-west1-b \
--cluster kuberay-gpu-cluster \
--num-nodes 1 \
--min-nodes 0 \
--max-nodes 1 \
--enable-autoscaling \
--machine-type g2-standard-4 \
The --accelerator flag specifies the type and number of GPUs for each node in the node pool. This example uses the NVIDIA L4 GPU. The machine type g2-standard-4 has 1 GPU, 24 GB GPU Memory, 4 vCPUs and 16 GB RAM.
… note::
GKE automatically installs the GPU drivers for you. For more details, see [GKE documentation](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#create-gpu-pool-auto-drivers).
… note::
GKE automatically configures taints and tolerations so that only GPU pods are scheduled on GPU nodes. For more details, see [GKE documentation](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#create)
Step 3: Configure kubectl to connect to the cluster#
Run the following command to download Google Cloud credentials and configure the Kubernetes CLI to use them.
gcloud container clusters get-credentials kuberay-gpu-cluster --zone us-west1-b
For more details, see GKE documentation.