Deployment

Docker-Compose

docker-compose.yml
services:
  spark-iceberg:
    image: tabulario/spark-iceberg
    container_name: spark-iceberg
    build: spark/
    networks:
      iceberg_net:
    depends_on:
      - rest
      - minio
    volumes:
      - ./warehouse:/home/iceberg/warehouse
      - ./notebooks:/home/iceberg/notebooks/notebooks
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    ports:
      - 8888:8888
      - 8080:8080
      - 10000:10000
      - 10001:10001
  rest:
    image: apache/iceberg-rest-fixture
    container_name: iceberg-rest
    networks:
      iceberg_net:
    ports:
      - 8181:8181
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
      - CATALOG_WAREHOUSE=s3://warehouse/
      - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
      - CATALOG_S3_ENDPOINT=http://minio:9000
  minio:
    image: minio/minio
    container_name: minio
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=password
      - MINIO_DOMAIN=minio
    networks:
      iceberg_net:
        aliases:
          - warehouse.minio
    ports:
      - 9001:9001
      - 9000:9000
    command: ["server", "/data", "--console-address", ":9001"]
  mc:
    depends_on:
      - minio
    image: minio/mc
    container_name: mc
    networks:
      iceberg_net:
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    entrypoint: |
      /bin/sh -c "
      until (/usr/bin/mc alias set minio http://minio:9000 admin password) do echo '...waiting...' && sleep 1; done;
      /usr/bin/mc rm -r --force minio/warehouse;
      /usr/bin/mc mb minio/warehouse;
      /usr/bin/mc policy set public minio/warehouse;
      tail -f /dev/null
      "
networks:
  iceberg_net:

This Docker Compose configuration sets up a complete Apache Iceberg development environment with the following components:

Spark Service (spark-iceberg)

The main Spark service that provides the computational engine for working with Iceberg tables:

  • Image: tabulario/spark-iceberg - A pre-configured Spark image with Iceberg support
  • Ports:
    • 8888: Jupyter notebook interface
    • 8080: Spark UI
    • 10000-10001: Spark Thrift server ports
  • Volumes: Mounts local directories for warehouse data and notebooks
  • Dependencies: Starts after the REST catalog and MinIO services; depends_on controls start order only and does not wait for those services to be ready

REST Catalog Service (rest)

Apache Iceberg's REST catalog service for metadata management:

  • Image: apache/iceberg-rest-fixture - REST catalog fixture image from the Apache Iceberg project, suited to testing and development
  • Port: 8181 - REST API endpoint
  • Configuration:
    • Connects to MinIO S3-compatible storage
    • Uses s3://warehouse/ as the warehouse location
    • Configured with S3FileIO for object storage operations
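
As a quick check that the catalog is answering, the sketch below builds URLs for two endpoints from the Iceberg REST catalog specification (`/v1/config` and `/v1/namespaces`) and fetches them as JSON. It assumes the catalog is reached from the host through the published port 8181; the helper names `catalog_url` and `fetch_json` are illustrative and not part of any Iceberg library.

```python
import json
import urllib.request

# Assumption: the catalog is reached via the published port 8181 on localhost.
REST_BASE = "http://localhost:8181"


def catalog_url(path: str, base: str = REST_BASE) -> str:
    """Build a URL under the /v1 prefix used by the Iceberg REST catalog spec."""
    return f"{base.rstrip('/')}/v1/{path.lstrip('/')}"


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Usage once the stack is running (these lines perform network calls):
#   fetch_json(catalog_url("config"))      # catalog defaults and overrides
#   fetch_json(catalog_url("namespaces"))  # namespaces known to the catalog
```

From inside another container on iceberg_net, the base URL would be http://rest:8181 instead, for example `catalog_url("config", "http://rest:8181")`.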

MinIO Service (minio)

S3-compatible object storage for storing Iceberg table data:

  • Image: minio/minio - Open-source S3-compatible storage
  • Ports:
    • 9000: S3 API endpoint
    • 9001: MinIO web console
  • Credentials: admin/password (for development only)
  • Storage: Serves data from /data directory inside container

MinIO Client Service (mc)

Initialization service that sets up the MinIO storage:

  • Purpose: Creates and configures the warehouse bucket
  • Actions:
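    • Waits for MinIO to be ready, retrying every second until mc alias set succeeds
    • Removes any existing warehouse bucket and its contents
    • Creates the warehouse bucket
    • Sets a public access policy on the bucket (for development only)
    • Runs tail -f /dev/null so the container stays up after initialization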
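
The wait-then-initialize pattern in the entrypoint can be sketched in Python as a small polling helper. This is an illustration of the idea only, not what the mc container runs: `wait_until` is a hypothetical name, and unlike the shell `until` loop, which retries forever, this version gives up after `max_attempts`.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], delay: float = 1.0, max_attempts: int = 60) -> bool:
    """Call check() until it returns True, sleeping `delay` seconds between tries.

    Returns True as soon as check() succeeds, or False after max_attempts failures.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return True
        print(f"...waiting... (attempt {attempt})")
        time.sleep(delay)
    return False
```

Any readiness probe can be passed as `check`, for example a function that tries to connect to the MinIO S3 port; initialization steps such as bucket creation would run only after `wait_until` returns True.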

Networking

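All services communicate through the iceberg_net custom network, which provides:

  • Service discovery by service name or container name (for example, minio and rest)
  • Isolation from other Docker networks on the host
  • The warehouse.minio network alias, which together with MINIO_DOMAIN=minio lets virtual-hosted-style requests for the warehouse bucket resolve to the MinIO container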
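
To see why the warehouse.minio alias matters, it helps to compare the two S3 addressing styles. The sketch below is a plain URL builder written for illustration (the function name `s3_object_url` and the example object key are hypothetical): in virtual-hosted style the bucket becomes part of the hostname, so the hostname warehouse.minio must resolve on the network, whereas path style keeps the bucket in the path.

```python
from urllib.parse import urlsplit, urlunsplit


def s3_object_url(endpoint: str, bucket: str, key: str, virtual_hosted: bool) -> str:
    """Build an object URL in virtual-hosted style or path style."""
    parts = urlsplit(endpoint)
    key = key.lstrip("/")
    if virtual_hosted:
        # Bucket is prepended to the host: <bucket>.<host>/<key>
        netloc = f"{bucket}.{parts.netloc}"
        path = f"/{key}"
    else:
        # Bucket is the first path segment: <host>/<bucket>/<key>
        netloc = parts.netloc
        path = f"/{bucket}/{key}"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


# Hypothetical object key, used only to show the two shapes.
print(s3_object_url("http://minio:9000", "warehouse", "db/t/metadata.json", True))
print(s3_object_url("http://minio:9000", "warehouse", "db/t/metadata.json", False))
```

The first URL uses the host warehouse.minio:9000, which only resolves inside iceberg_net because of the alias declared on the minio service.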

Start all services with:

docker-compose up

Once the containers are running, the Spark UI is available at http://localhost:8080.

The Jupyter notebook server is available at http://localhost:8888.
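
If a page does not load, a quick way to see which published ports are accepting connections is a TCP probe like the sketch below. The port list is taken from the Compose file; the names `SERVICES`, `port_open`, and `report` are illustrative, and an open port only shows that something is listening, not that the service has finished starting.

```python
import socket

# Published ports from the Compose file.
SERVICES = {
    "Jupyter": 8888,
    "Spark UI": 8080,
    "REST catalog": 8181,
    "MinIO S3 API": 9000,
    "MinIO console": 9001,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host: str = "localhost") -> dict:
    """Map each service name to whether its published port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Calling `report()` on the host after docker-compose up returns a dictionary such as one entry per service name with True or False, which narrows down which container to inspect.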