
Deploy Ray Cluster on Kubernetes Using KubeRay

KubeRay simplifies managing Ray clusters on Kubernetes by introducing three key Custom Resource Definitions (CRDs): RayCluster, RayJob, and RayService. These CRDs make it easy to tailor Ray clusters for different use cases. [1]

The KubeRay operator offers a Kubernetes-native approach to managing Ray clusters. A typical Ray cluster includes a head node pod and multiple worker node pods. With optional autoscaling, the operator can dynamically adjust the cluster size based on workload demands, adding or removing pods as needed. [1]
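
Under the hood, a RayCluster is declared like any other Kubernetes object. The manifest below is a minimal, illustrative sketch (the name and image tag are placeholders, not a recommendation); the Helm chart used later in this guide generates a more complete version of this for you:

# A minimal RayCluster custom resource (illustrative only)
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: demo-cluster
spec:
  rayVersion: "2.46.0"
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.46.0
  workerGroupSpecs:
    - groupName: workergroup
      replicas: 1
      minReplicas: 1
      maxReplicas: 3
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.46.0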

Figure: KubeRay architecture

Setting up KubeRay is straightforward. This guide will walk you through installing the KubeRay operator and deploying your first Ray cluster using Helm. By the end, you'll have a fully functional Ray environment running on your Kubernetes cluster. [2][3]

Install KubeRay Operator

Start by adding the KubeRay Helm repository to access the required charts:

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
"kuberay" has been added to your repositories

Update your local Helm chart list to ensure you're using the latest version:

helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "kuberay" chart repository
Update Complete. ⎈Happy Helming!⎈

Next, create a namespace to manage KubeRay resources:

kubectl create ns kuberay
namespace/kuberay created

Now, install the KubeRay operator in the namespace. This sets up the controller to manage Ray clusters:

helm install kuberay-operator kuberay/kuberay-operator \
  --version 1.3.0 \
  -n kuberay
NAME: kuberay-operator
LAST DEPLOYED: Wed May 14 20:29:44 2025
NAMESPACE: kuberay
STATUS: deployed
REVISION: 1
TEST SUITE: None

Verify that the KubeRay operator pod is running:

kubectl get pods -n kuberay
NAME                                READY   STATUS    RESTARTS   AGE
kuberay-operator-66d848f5cd-5npp6   1/1     Running   0          23s
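
You can also confirm that the operator chart registered the three CRDs mentioned earlier (output omitted; the exact list may vary by KubeRay version):

kubectl get crds | grep ray.io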

Deploy a Ray Cluster

Export the default values.yaml file to customize memory settings. If you've encountered OOM issues, it's a good idea to increase memory allocation upfront; the relevant fields are head.resources and worker.resources (see the override example after the file). [4]

helm show values kuberay/ray-cluster > values.yaml
nano values.yaml
# Default values for ray-cluster.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# The KubeRay community welcomes PRs to expose additional configuration
# in this Helm chart.

image:
  repository: rayproject/ray
  tag: 2.41.0
  pullPolicy: IfNotPresent

nameOverride: "kuberay"
fullnameOverride: ""

imagePullSecrets: []
  # - name: an-existing-secret

# common defined values shared between the head and worker
common:
  # containerEnv specifies environment variables for the Ray head and worker containers.
  # Follows standard K8s container env schema.
  containerEnv: []
  #  - name: BLAH
  #    value: VAL
head:
  # rayVersion determines the autoscaler's image version.
  # It should match the Ray version in the image of the containers.
  # rayVersion: 2.41.0
  # If enableInTreeAutoscaling is true, the autoscaler sidecar will be added to the Ray head pod.
  # Ray autoscaler integration is supported only for Ray versions >= 1.11.0
  # Ray autoscaler integration is Beta with KubeRay >= 0.3.0 and Ray >= 2.0.0.
  # enableInTreeAutoscaling: true
  # autoscalerOptions is an OPTIONAL field specifying configuration overrides for the Ray autoscaler.
  # The example configuration shown below represents the DEFAULT values.
  # autoscalerOptions:
    # upscalingMode: Default
    # idleTimeoutSeconds is the number of seconds to wait before scaling down a worker pod which is not using Ray resources.
    # idleTimeoutSeconds: 60
    # imagePullPolicy optionally overrides the autoscaler container's default image pull policy (IfNotPresent).
    # imagePullPolicy: IfNotPresent
    # Optionally specify the autoscaler container's securityContext.
    # securityContext: {}
    # env: []
    # envFrom: []
    # resources specifies optional resource request and limit overrides for the autoscaler container.
    # For large Ray clusters, we recommend monitoring container resource usage to determine if overriding the defaults is required.
    # resources:
    #   limits:
    #     cpu: "500m"
    #     memory: "512Mi"
    #   requests:
    #     cpu: "500m"
    #     memory: "512Mi"
  labels: {}
  # Note: From KubeRay v0.6.0, users need to create the ServiceAccount by themselves if they specify the `serviceAccountName`
  # in the headGroupSpec. See https://github.com/ray-project/kuberay/pull/1128 for more details.
  serviceAccountName: ""
  restartPolicy: ""
  rayStartParams: {}
  # containerEnv specifies environment variables for the Ray container,
  # Follows standard K8s container env schema.
  containerEnv: []
  # - name: EXAMPLE_ENV
  #   value: "1"
  envFrom: []
    # - secretRef:
    #     name: my-env-secret
  # ports optionally allows specifying ports for the Ray container.
  # ports: []
  # resource requests and limits for the Ray head container.
  # Modify as needed for your application.
  # Note that the resources in this example are much too small for production;
  # we don't recommend allocating less than 8G memory for a Ray pod in production.
  # Ray pods should be sized to take up entire K8s nodes when possible.
  # Always set CPU and memory limits for Ray pods.
  # It is usually best to set requests equal to limits.
  # See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/config.html#resources
  # for further guidance.
  resources:
    limits:
      cpu: "1"
      # To avoid out-of-memory issues, never allocate less than 2G memory for the Ray head.
      memory: "4G"
    requests:
      cpu: "1"
      memory: "4G"
  annotations: {}
  nodeSelector: {}
  tolerations: []
  affinity: {}
  # Pod security context.
  podSecurityContext: {}
  # Ray container security context.
  securityContext: {}
  # Optional: The following volumes/volumeMounts configurations are optional but recommended because
  # Ray writes logs to /tmp/ray/session_latest/logs instead of stdout/stderr.
  volumes:
    - name: log-volume
      emptyDir: {}
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
  # sidecarContainers specifies additional containers to attach to the Ray pod.
  # Follows standard K8s container spec.
  sidecarContainers: []
  # See docs/guidance/pod-command.md for more details about how to specify
  # container command for head Pod.
  command: []
  args: []
  # Optional, for the user to provide any additional fields to the service.
  # See https://pkg.go.dev/k8s.io/Kubernetes/pkg/api/v1#Service
  headService: {}
    # metadata:
    #   annotations:
    #     prometheus.io/scrape: "true"

  # Custom pod DNS configuration
  # See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config
  # dnsConfig:
  #   nameservers:
  #     - 8.8.8.8
  #   searches:
  #     - example.local
  #   options:
  #     - name: ndots
  #       value: "2"
  #     - name: edns0
  topologySpreadConstraints: {}


worker:
  # If you want to disable the default workergroup
  # uncomment the line below
  # disabled: true
  groupName: workergroup
  replicas: 1
  minReplicas: 1
  maxReplicas: 3
  labels: {}
  serviceAccountName: ""
  restartPolicy: ""
  rayStartParams: {}
  # containerEnv specifies environment variables for the Ray container,
  # Follows standard K8s container env schema.
  containerEnv: []
  # - name: EXAMPLE_ENV
  #   value: "1"
  envFrom: []
    # - secretRef:
    #     name: my-env-secret
  # ports optionally allows specifying ports for the Ray container.
  # ports: []
  # resource requests and limits for the Ray worker container.
  # Modify as needed for your application.
  # Note that the resources in this example are much too small for production;
  # we don't recommend allocating less than 8G memory for a Ray pod in production.
  # Ray pods should be sized to take up entire K8s nodes when possible.
  # Always set CPU and memory limits for Ray pods.
  # It is usually best to set requests equal to limits.
  # See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/config.html#resources
  # for further guidance.
  resources:
    limits:
      cpu: "1"
      memory: "3G"
    requests:
      cpu: "1"
      memory: "3G"
  annotations: {}
  nodeSelector: {}
  tolerations: []
  affinity: {}
  # Pod security context.
  podSecurityContext: {}
  # Ray container security context.
  securityContext: {}
  # Optional: The following volumes/volumeMounts configurations are optional but recommended because
  # Ray writes logs to /tmp/ray/session_latest/logs instead of stdout/stderr.
  volumes:
    - name: log-volume
      emptyDir: {}
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
  # sidecarContainers specifies additional containers to attach to the Ray pod.
  # Follows standard K8s container spec.
  sidecarContainers: []
  # See docs/guidance/pod-command.md for more details about how to specify
  # container command for worker Pod.
  command: []
  args: []
  topologySpreadConstraints: {}


  # Custom pod DNS configuration
  # See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config
  # dnsConfig:
  #   nameservers:
  #     - 8.8.8.8
  #   searches:
  #     - example.local
  #   options:
  #     - name: ndots
  #       value: "2"
  #     - name: edns0

# The map's key is used as the groupName.
# For example, key:small-group in the map below
# will be used as the groupName
additionalWorkerGroups:
  smallGroup:
    # Disabled by default
    disabled: true
    replicas: 0
    minReplicas: 0
    maxReplicas: 3
    labels: {}
    serviceAccountName: ""
    restartPolicy: ""
    rayStartParams: {}
    # containerEnv specifies environment variables for the Ray container,
    # Follows standard K8s container env schema.
    containerEnv: []
      # - name: EXAMPLE_ENV
      #   value: "1"
    envFrom: []
        # - secretRef:
        #     name: my-env-secret
    # ports optionally allows specifying ports for the Ray container.
    # ports: []
    # resource requests and limits for the Ray worker container.
    # Modify as needed for your application.
    # Note that the resources in this example are much too small for production;
    # we don't recommend allocating less than 8G memory for a Ray pod in production.
    # Ray pods should be sized to take up entire K8s nodes when possible.
    # Always set CPU and memory limits for Ray pods.
    # It is usually best to set requests equal to limits.
    # See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/config.html#resources
    # for further guidance.
    resources:
      limits:
        cpu: 1
        memory: "3G"
      requests:
        cpu: 1
        memory: "3G"
    annotations: {}
    nodeSelector: {}
    tolerations: []
    affinity: {}
    # Pod security context.
    podSecurityContext: {}
    # Ray container security context.
    securityContext: {}
    # Optional: The following volumes/volumeMounts configurations are optional but recommended because
    # Ray writes logs to /tmp/ray/session_latest/logs instead of stdout/stderr.
    volumes:
      - name: log-volume
        emptyDir: {}
    volumeMounts:
      - mountPath: /tmp/ray
        name: log-volume
    sidecarContainers: []
    # See docs/guidance/pod-command.md for more details about how to specify
    # container command for worker Pod.
    command: []
    args: []

    # Topology Spread Constraints for worker pods
    # See: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
    topologySpreadConstraints: {}

    # Custom pod DNS configuration
    # See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config
    # dnsConfig:
    #   nameservers:
    #     - 8.8.8.8
    #   searches:
    #     - example.local
    #   options:
    #     - name: ndots
    #       value: "2"
    #     - name: edns0

# Configuration for Head's Kubernetes Service
service:
  # This is optional, and the default is ClusterIP.
  type: ClusterIP
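
If you hit OOM issues, the fields to raise in this file are head.resources and worker.resources. Alternatively, the same keys can be overridden on the command line instead of editing the file, for example by appending flags like these (memory values chosen purely for illustration) to the install command shown next:

--set head.resources.requests.memory=8G \
--set head.resources.limits.memory=8G \
--set worker.resources.requests.memory=8G \
--set worker.resources.limits.memory=8G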

Install the Ray cluster using the customized values.yaml. Here, we're using the image tag 2.46.0-py310-aarch64 for Python 3.10, Ray 2.46.0, and the ARM64 (aarch64) architecture, e.g., Apple Silicon; the --set flag below overrides the 2.41.0 tag in the exported values.yaml. You can find all supported Ray images on Docker Hub. [5]

helm install raycluster kuberay/ray-cluster \
  --version 1.3.0 \
  --set 'image.tag=2.46.0-py310-aarch64' \
  -n kuberay \
  -f values.yaml
NAME: raycluster
LAST DEPLOYED: Wed May 14 20:31:53 2025
NAMESPACE: kuberay
STATUS: deployed
REVISION: 1
TEST SUITE: None
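
You can confirm that both the operator and the Ray cluster releases are deployed with helm list (output omitted):

helm list -n kuberay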

Once the RayCluster CR is created, you can check its status:

kubectl get rayclusters -n kuberay
NAME                 DESIRED WORKERS   AVAILABLE WORKERS   CPUS   MEMORY   GPUS   STATUS   AGE
raycluster-kuberay   1                                     2      4G       0               62s
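
For more detail, including conditions and recent events, describe the custom resource (output omitted):

kubectl describe raycluster raycluster-kuberay -n kuberay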

To view the running pods in your Ray cluster, use:

kubectl get pods --selector=ray.io/cluster=raycluster-kuberay -n kuberay
NAME                                          READY   STATUS    RESTARTS   AGE
raycluster-kuberay-head-k6ktp                 1/1     Running   0          5m49s
raycluster-kuberay-workergroup-worker-zrxbj   1/1     Running   0          5m49s
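
As a quick smoke test, you can port-forward the head service and submit a small job with the Ray Job CLI. The service name below follows KubeRay's default <cluster-name>-head-svc convention; confirm it with kubectl get svc -n kuberay if yours differs. Submitting the job assumes Ray is installed locally (pip install -U "ray[default]"):

kubectl port-forward svc/raycluster-kuberay-head-svc 8265:8265 -n kuberay

Then, from another terminal:

ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"

While the port-forward is running, the Ray dashboard is also available at http://localhost:8265.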