KubeRay simplifies managing Ray clusters on Kubernetes by introducing three key Custom Resource Definitions (CRDs): RayCluster, RayJob, and RayService. These CRDs make it easy to tailor Ray clusters for different use cases.[1]
The KubeRay operator offers a Kubernetes-native approach to managing Ray clusters. A typical Ray cluster includes a head node pod and multiple worker node pods. With optional autoscaling, the operator can dynamically adjust the cluster size based on workload demands, adding or removing pods as needed.[1]
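To make the operator's job concrete, a RayCluster is just another Kubernetes object that the operator reconciles into head and worker pods. Below is a minimal sketch of such a manifest; the name, image tag, and resource sizes are illustrative placeholders, and the Helm chart used later in this guide renders an equivalent manifest for you:

# Minimal illustrative RayCluster; the name, tag, and sizes are placeholders.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: raycluster-mini
spec:
  rayVersion: "2.41.0"
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.41.0
            resources:
              limits:
                cpu: "1"
                memory: "4G"
              requests:
                cpu: "1"
                memory: "4G"
  workerGroupSpecs:
    - groupName: workergroup
      replicas: 1
      minReplicas: 1
      maxReplicas: 3
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.41.0
              resources:
                limits:
                  cpu: "1"
                  memory: "3G"
                requests:
                  cpu: "1"
                  memory: "3G"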
Setting up KubeRay is straightforward. This guide will walk you through installing the KubeRay operator and deploying your first Ray cluster using Helm. By the end, you'll have a fully functional Ray environment running on your Kubernetes cluster.[2][3]
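If the KubeRay chart repository isn't registered with Helm yet, add it first; the URL below is the project's standard Helm repository:

helm repo add kuberay https://ray-project.github.io/kuberay-helm/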
Update your local Helm chart list to ensure you're using the latest version:
helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "kuberay" chart repository
Update Complete. ⎈Happy Helming!⎈
Next, create a namespace to manage KubeRay resources:
kubectl create ns kuberay
namespace/kuberay created
Now, install the KubeRay operator in the namespace. This sets up the controller to manage Ray clusters:
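The install command itself isn't shown here; with the chart from the kuberay repository, a typical invocation looks like this (add --version if you want to pin a specific operator release):

helm install kuberay-operator kuberay/kuberay-operator --namespace kuberay

You can confirm the controller is up with kubectl get pods -n kuberay; the kuberay-operator pod should reach the Running state.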
Export the default values.yaml file to customize memory settings. If you've encountered OOM issues, it's a good idea to increase memory allocation upfront.[4]
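One way to dump the chart defaults into a local file for editing is the following; this assumes the kuberay repo added earlier:

helm show values kuberay/ray-cluster > values.yaml

The exported file should look like the listing below.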
# Default values for ray-cluster.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# The KubeRay community welcomes PRs to expose additional configuration
# in this Helm chart.

image:
  repository: rayproject/ray
  tag: 2.41.0
  pullPolicy: IfNotPresent

nameOverride: "kuberay"
fullnameOverride: ""

imagePullSecrets: []
# - name: an-existing-secret

# common defined values shared between the head and worker
common:
  # containerEnv specifies environment variables for the Ray head and worker containers.
  # Follows standard K8s container env schema.
  containerEnv: []
  # - name: BLAH
  #   value: VAL

head:
  # rayVersion determines the autoscaler's image version.
  # It should match the Ray version in the image of the containers.
  # rayVersion: 2.41.0
  # If enableInTreeAutoscaling is true, the autoscaler sidecar will be added to the Ray head pod.
  # Ray autoscaler integration is supported only for Ray versions >= 1.11.0
  # Ray autoscaler integration is Beta with KubeRay >= 0.3.0 and Ray >= 2.0.0.
  # enableInTreeAutoscaling: true
  # autoscalerOptions is an OPTIONAL field specifying configuration overrides for the Ray autoscaler.
  # The example configuration shown below represents the DEFAULT values.
  # autoscalerOptions:
  #   upscalingMode: Default
  #   # idleTimeoutSeconds is the number of seconds to wait before scaling down a worker pod which is not using Ray resources.
  #   idleTimeoutSeconds: 60
  #   # imagePullPolicy optionally overrides the autoscaler container's default image pull policy (IfNotPresent).
  #   imagePullPolicy: IfNotPresent
  #   # Optionally specify the autoscaler container's securityContext.
  #   securityContext: {}
  #   env: []
  #   envFrom: []
  #   # resources specifies optional resource request and limit overrides for the autoscaler container.
  #   # For large Ray clusters, we recommend monitoring container resource usage to determine if overriding the defaults is required.
  #   resources:
  #     limits:
  #       cpu: "500m"
  #       memory: "512Mi"
  #     requests:
  #       cpu: "500m"
  #       memory: "512Mi"
  labels: {}
  # Note: From KubeRay v0.6.0, users need to create the ServiceAccount by themselves if they specify the `serviceAccountName`
  # in the headGroupSpec. See https://github.com/ray-project/kuberay/pull/1128 for more details.
  serviceAccountName: ""
  restartPolicy: ""
  rayStartParams: {}
  # containerEnv specifies environment variables for the Ray container,
  # Follows standard K8s container env schema.
  containerEnv: []
  # - name: EXAMPLE_ENV
  #   value: "1"
  envFrom: []
  # - secretRef:
  #     name: my-env-secret
  # ports optionally allows specifying ports for the Ray container.
  # ports: []
  # resource requests and limits for the Ray head container.
  # Modify as needed for your application.
  # Note that the resources in this example are much too small for production;
  # we don't recommend allocating less than 8G memory for a Ray pod in production.
  # Ray pods should be sized to take up entire K8s nodes when possible.
  # Always set CPU and memory limits for Ray pods.
  # It is usually best to set requests equal to limits.
  # See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/config.html#resources
  # for further guidance.
  resources:
    limits:
      cpu: "1"
      # To avoid out-of-memory issues, never allocate less than 2G memory for the Ray head.
      memory: "4G"
    requests:
      cpu: "1"
      memory: "4G"
  annotations: {}
  nodeSelector: {}
  tolerations: []
  affinity: {}
  # Pod security context.
  podSecurityContext: {}
  # Ray container security context.
  securityContext: {}
  # Optional: The following volumes/volumeMounts configurations are optional but recommended because
  # Ray writes logs to /tmp/ray/session_latests/logs instead of stdout/stderr.
  volumes:
    - name: log-volume
      emptyDir: {}
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
  # sidecarContainers specifies additional containers to attach to the Ray pod.
  # Follows standard K8s container spec.
  sidecarContainers: []
  # See docs/guidance/pod-command.md for more details about how to specify
  # container command for head Pod.
  command: []
  args: []
  # Optional, for the user to provide any additional fields to the service.
  # See https://pkg.go.dev/k8s.io/Kubernetes/pkg/api/v1#Service
  headService: {}
  # metadata:
  #   annotations:
  #     prometheus.io/scrape: "true"
  # Custom pod DNS configuration
  # See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config
  # dnsConfig:
  #   nameservers:
  #     - 8.8.8.8
  #   searches:
  #     - example.local
  #   options:
  #     - name: ndots
  #       value: "2"
  #     - name: edns0
  topologySpreadConstraints: {}

worker:
  # If you want to disable the default workergroup
  # uncomment the line below
  # disabled: true
  groupName: workergroup
  replicas: 1
  minReplicas: 1
  maxReplicas: 3
  labels: {}
  serviceAccountName: ""
  restartPolicy: ""
  rayStartParams: {}
  # containerEnv specifies environment variables for the Ray container,
  # Follows standard K8s container env schema.
  containerEnv: []
  # - name: EXAMPLE_ENV
  #   value: "1"
  envFrom: []
  # - secretRef:
  #     name: my-env-secret
  # ports optionally allows specifying ports for the Ray container.
  # ports: []
  # resource requests and limits for the Ray head container.
  # Modify as needed for your application.
  # Note that the resources in this example are much too small for production;
  # we don't recommend allocating less than 8G memory for a Ray pod in production.
  # Ray pods should be sized to take up entire K8s nodes when possible.
  # Always set CPU and memory limits for Ray pods.
  # It is usually best to set requests equal to limits.
  # See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/config.html#resources
  # for further guidance.
  resources:
    limits:
      cpu: "1"
      memory: "3G"
    requests:
      cpu: "1"
      memory: "3G"
  annotations: {}
  nodeSelector: {}
  tolerations: []
  affinity: {}
  # Pod security context.
  podSecurityContext: {}
  # Ray container security context.
  securityContext: {}
  # Optional: The following volumes/volumeMounts configurations are optional but recommended because
  # Ray writes logs to /tmp/ray/session_latests/logs instead of stdout/stderr.
  volumes:
    - name: log-volume
      emptyDir: {}
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
  # sidecarContainers specifies additional containers to attach to the Ray pod.
  # Follows standard K8s container spec.
  sidecarContainers: []
  # See docs/guidance/pod-command.md for more details about how to specify
  # container command for worker Pod.
  command: []
  args: []
  topologySpreadConstraints: {}
  # Custom pod DNS configuration
  # See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config
  # dnsConfig:
  #   nameservers:
  #     - 8.8.8.8
  #   searches:
  #     - example.local
  #   options:
  #     - name: ndots
  #       value: "2"
  #     - name: edns0

# The map's key is used as the groupName.
# For example, key:small-group in the map below
# will be used as the groupName
additionalWorkerGroups:
  smallGroup:
    # Disabled by default
    disabled: true
    replicas: 0
    minReplicas: 0
    maxReplicas: 3
    labels: {}
    serviceAccountName: ""
    restartPolicy: ""
    rayStartParams: {}
    # containerEnv specifies environment variables for the Ray container,
    # Follows standard K8s container env schema.
    containerEnv: []
    # - name: EXAMPLE_ENV
    #   value: "1"
    envFrom: []
    # - secretRef:
    #     name: my-env-secret
    # ports optionally allows specifying ports for the Ray container.
    # ports: []
    # resource requests and limits for the Ray head container.
    # Modify as needed for your application.
    # Note that the resources in this example are much too small for production;
    # we don't recommend allocating less than 8G memory for a Ray pod in production.
    # Ray pods should be sized to take up entire K8s nodes when possible.
    # Always set CPU and memory limits for Ray pods.
    # It is usually best to set requests equal to limits.
    # See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/config.html#resources
    # for further guidance.
    resources:
      limits:
        cpu: 1
        memory: "3G"
      requests:
        cpu: 1
        memory: "3G"
    annotations: {}
    nodeSelector: {}
    tolerations: []
    affinity: {}
    # Pod security context.
    podSecurityContext: {}
    # Ray container security context.
    securityContext: {}
    # Optional: The following volumes/volumeMounts configurations are optional but recommended because
    # Ray writes logs to /tmp/ray/session_latests/logs instead of stdout/stderr.
    volumes:
      - name: log-volume
        emptyDir: {}
    volumeMounts:
      - mountPath: /tmp/ray
        name: log-volume
    sidecarContainers: []
    # See docs/guidance/pod-command.md for more details about how to specify
    # container command for worker Pod.
    command: []
    args: []
    # Topology Spread Constraints for worker pods
    # See: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
    topologySpreadConstraints: {}
    # Custom pod DNS configuration
    # See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config
    # dnsConfig:
    #   nameservers:
    #     - 8.8.8.8
    #   searches:
    #     - example.local
    #   options:
    #     - name: ndots
    #       value: "2"
    #     - name: edns0

# Configuration for Head's Kubernetes Service
service:
  # This is optional, and the default is ClusterIP.
  type: ClusterIP
Install the Ray cluster using the customized values.yaml. Here, we're using the image tag 2.46.0-py310-aarch64 for Python 3.10, Ray 2.46.0, and ARM64 (aarch64) hosts such as Apple Silicon Macs. You can find all supported Ray images on Docker Hub.[5]
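A typical install with the customized file looks like the following sketch; the release name raycluster is an assumption, and the image tag can either be edited in values.yaml or overridden on the command line as shown here:

helm install raycluster kuberay/ray-cluster --namespace kuberay --values values.yaml --set image.tag=2.46.0-py310-aarch64

Once the release is deployed, kubectl get pods -n kuberay should show the head and worker pods starting up.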