Kubernetes
Get started with a multi-machine prover that can run with as many GPUs as you can provision.
The SP1 Cluster is the official multi-GPU prover service implementation for generating SP1 proofs on the Succinct Prover Network. It can coordinate proof generation across tens to hundreds of GPU nodes.
This page explains how to set up the SP1 Cluster using Kubernetes and generate some basic proofs using command-line tools. There are several ways to deploy the SP1 Cluster; Kubernetes is the best option for practical, production-ready deployments that need maximum performance. For a simpler deployment workflow, please refer to the Docker Compose installation guide.
Prerequisites
SP1 Cluster runs on Linux and has the following software requirements for each worker on the cluster:
- Kubernetes or RKE2 (Recommended)
- CUDA 12
- NVIDIA Container Toolkit
For the machine you're using to connect to the Kubernetes cluster, you'll need:
- kubectl
- Helm
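A quick way to confirm the deploy-machine tooling is present before starting (the loop and output format below are just a convenience, not part of the SP1 tooling):

```shell
# Check that the deploy machine has the CLI tools used in the steps below.
# Prints one line per tool; guarded so it works even where a tool is absent.
for tool in kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: installed"
  else
    echo "$tool: missing"
  fi
done
```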
Hardware Requirements
The hardware requirements for running SP1 Cluster depend on the node configuration and can change over time as the prover changes or new features are implemented.
| Component | Minimum Requirements | Recommended Requirements | Notes |
|---|---|---|---|
| 1x+ CPU Machines | • ≥40 GB RAM • >30 GB disk space • High single clock speed | • ≥64 GB DDR5 RAM • >30 GB disk space • High single clock speed • High core count | • High single clock speed is important for optimal VM emulation • High core count helps reduce Groth16/Plonk proving latency • DDR5 RAM is recommended for better proving performance |
| 1x+ GPU Machines | • ≥16 GB RAM • 1x NVIDIA GPU with ≥24 GB VRAM and CUDA Compute Capability ≥8.6 | • ≥32 GB DDR5 RAM per GPU • Supported GPUs: GeForce RTX 5090/4090 (best performance), NVIDIA L4, NVIDIA A10G, NVIDIA A5000/A6000 | • Multiple GPUs are supported on a single machine • Each GPU requires a separate instance of the GPU node binary • DDR5 RAM is recommended for proving performance |
See the FAQ for detailed hardware recommendations.
Note that if you are using RTX 5090s, a current driver bug requires you to set the environment variable MOONGATE_DISABLE_GRIND_DEVICE=true on the GPU node to enable a workaround. Without it, proving performance is greatly reduced.
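If you prefer to bake this workaround into the deployment rather than set it by hand, an environment variable entry in your values file is one option. The `gpuWorker.env` key path below is an assumption about the chart's schema; check values-example.yaml for the real structure.

```yaml
# Sketch only: the exact key path depends on the chart's values schema.
gpuWorker:
  env:
    - name: MOONGATE_DISABLE_GRIND_DEVICE
      value: "true"
```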
Helm Chart
We use a Helm chart to manage the Kubernetes applications. The chart template is defined in infra/charts/sp1-cluster/values-example.yaml.
There are several services that the Helm chart orchestrates:
- CPU Workers
- GPU Workers
- API
- Postgres
- Redis
- Coordinator
- Fulfiller (Optional)
- Bidder (Optional)
The chart can be used to configure the hardware requirements and the number of replicas you want per service.
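As a sketch, a values override might toggle the optional services and scale workers like this. Only the fulfiller/bidder enabled flags appear later in this guide; the worker key names are illustrative, so mirror the real ones from values-example.yaml.

```yaml
# Illustrative values fragment; confirm key names against values-example.yaml.
cpuWorker:
  replicas: 2
gpuWorker:
  replicas: 8
fulfiller:
  enabled: false   # optional service
bidder:
  enabled: false   # optional service
```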
Setup
In this section, we'll walk through the steps needed to set up a basic proving service using Helm.
1. Clone the repo
git clone https://github.com/succinctlabs/sp1-cluster.git
cd sp1-cluster
2. Ensure kubectl is connected to your cluster
kubectl cluster-info
kubectl get nodes
3. Create a k8s namespace
kubectl create namespace sp1-cluster-test
4. Create k8s secrets
Create a secret called cluster-secrets that configures the Postgres database, the Redis database, and the private key used for the prover.
kubectl create secret generic cluster-secrets \
--from-literal=DATABASE_URL=postgresql://postgres:postgrespassword@postgresql:5432/postgres \
--from-literal=REDIS_NODES=redis://:redispassword@redis-master:6379/0 \
--from-literal=PRIVATE_KEY=<PROVER_SIGNER_KEY> \
-n sp1-cluster-test
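The two connection strings follow standard Postgres and Redis URL formats, with hostnames (postgresql, redis-master) matching the in-cluster service names used throughout this guide. A small sketch of how they are composed, useful if you change the passwords:

```shell
# Compose the connection strings stored in cluster-secrets.
# Hostnames are the in-cluster service names; change the passwords to taste.
PG_PASSWORD=postgrespassword
REDIS_PASSWORD=redispassword
DATABASE_URL="postgresql://postgres:${PG_PASSWORD}@postgresql:5432/postgres"
REDIS_NODES="redis://:${REDIS_PASSWORD}@redis-master:6379/0"
echo "$DATABASE_URL"
echo "$REDIS_NODES"
```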
Create a secret called ghcr-secret that allows you to pull private images from ghcr.io.
kubectl create secret docker-registry ghcr-secret \
--docker-server=ghcr.io \
--docker-username=$GITHUB_USERNAME \
--docker-password=$GITHUB_TOKEN \
-n sp1-cluster-test
5. Configure the Helm chart
The template Helm chart for the cluster lives at infra/charts/sp1-cluster/values-example.yaml. Copy it and configure it to your liking.
cp infra/charts/sp1-cluster/values-example.yaml infra/charts/sp1-cluster/values-test.yaml
In particular, you may want to adjust the resources and node placement values based on your cluster hardware. We recommend the following configuration constraints:
- Redis: placed on a non-worker machine with ≥20 GB RAM.
- API, Postgres, Fulfiller: placed on a non-worker machine.
- CPU Workers: allocated ≥32 GB RAM, 10 GB disk, and a powerful CPU with as many cores as possible.
- GPU Workers: allocated 1 GPU, ≥24 GB RAM, and as many cores as possible.
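The constraints above translate into per-service resource requests and node placement in the values file. The fragment below is a sketch, not the chart's exact schema; the `nvidia.com/gpu` resource name is the standard one exposed by the NVIDIA device plugin, but the key paths and node labels are assumptions to adapt to values-example.yaml and your own node labels.

```yaml
# Illustrative resource/placement fragment; adapt key paths to the chart schema.
cpuWorker:
  resources:
    requests:
      memory: "32Gi"
  nodeSelector:
    node-role/worker-cpu: "true"   # hypothetical node label
gpuWorker:
  resources:
    limits:
      nvidia.com/gpu: 1            # one GPU per worker instance
    requests:
      memory: "24Gi"
```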
6. Setup Helm chart dependencies
helm dependency update infra/charts/redis-store
helm dependency update infra/charts/sp1-cluster
7. Create (or redeploy) the cluster
helm upgrade --install my-sp1-cluster infra/charts/sp1-cluster \
-f infra/charts/sp1-cluster/values-test.yaml \
-n sp1-cluster-test \
--debug
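After the helm upgrade completes, it's worth confirming that the release's pods come up before sending a proof:

```shell
# List the release's pods after `helm upgrade`; all should reach Running/Ready.
# Guarded so the snippet degrades gracefully off the deploy machine.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n sp1-cluster-test
else
  echo "kubectl not available; run this from your deploy machine"
fi
```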
8. Send a test 5M cycle Fibonacci proof
We'll first create a temporary pod within the k8s namespace so we can access the cluster network.
kubectl run cli --image=ghcr.io/succinctlabs/sp1-cluster:base-latest \
--rm -it -n sp1-cluster-test /bin/bash
Once you're in the pod, you can use the SP1 Cluster CLI to send a test 5M cycle Fibonacci proof:
/cli bench fibonacci 20 \
--cluster-rpc http://api-grpc:50051 \
--redis-nodes redis://:redispassword@redis-master:6379/0
This should output something like:
INFO crates/common/src/logger.rs:110: logging initialized
INFO bin/cli/src/commands/bench.rs:68: Running Fibonacci Compressed for 5 million cycles...
INFO crates/common/src/client.rs:22: connecting to http://api-grpc:50051
INFO bin/cli/src/commands/bench.rs:113: using redis artifact store
INFO crates/artifact/src/redis.rs:38: initializing redis pool
INFO serialize: crates/artifact/src/lib.rs:126: close time.busy=15.5µs time.idle=267µs
INFO upload: crates/artifact/src/redis.rs:196: close time.busy=1.56ms time.idle=3.03ms artifact_type=Program id="artifact_01jxzm994ke78shjk272egp5vt"
INFO upload: crates/artifact/src/redis.rs:196: close time.busy=355µs time.idle=492µs artifact_type=Stdin id="artifact_01jxzm994rf3hve7yrfgg43t0w"
INFO bin/cli/src/commands/bench.rs:146: proof_id: cli_1750186894489
INFO bin/cli/src/commands/bench.rs:185: Proof request completed after 3.016538313s
INFO bin/cli/src/commands/bench.rs:187: Aggregate MHz: 1.66
Prover Network Integration
This section walks through the steps necessary to integrate with the Succinct Prover Network, enabling you to fulfill proofs on behalf of requesters and earn fees.
0. Onchain Registration
You should have already registered your prover onchain. If you haven't, please refer to the Introduction for more information.
You also should have added a signer address to your prover so that you can use its private key to sign transactions on behalf of the prover. This private key should be set in the k8s secret keys FULFILLER_PRIVATE_KEY and BIDDER_PRIVATE_KEY.
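The cluster-secrets secret created earlier can be extended with these two keys. One way to update an existing secret in place is the `--dry-run=client` + `kubectl apply` pattern sketched below; "<PROVER_SIGNER_KEY>" remains a placeholder for your registered signer's key.

```shell
# Sketch: add the signer key to cluster-secrets under the names the
# fulfiller and bidder read. --dry-run=client piped to apply merges the
# keys into the existing secret. "<PROVER_SIGNER_KEY>" is a placeholder.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create secret generic cluster-secrets \
    --from-literal=FULFILLER_PRIVATE_KEY="<PROVER_SIGNER_KEY>" \
    --from-literal=BIDDER_PRIVATE_KEY="<PROVER_SIGNER_KEY>" \
    -n sp1-cluster-test --dry-run=client -o yaml | kubectl apply -f -
else
  echo "kubectl not available; run this from your deploy machine"
fi
```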
1. Run the fulfiller
Inside the Helm chart, update the fulfiller service's enabled attribute to true.
fulfiller:
enabled: true
...
Then upgrade the deployment:
helm upgrade --install my-sp1-cluster infra/charts/sp1-cluster \
-f infra/charts/sp1-cluster/values-test.yaml \
-n sp1-cluster-test \
--debug
2. Run the bidder
Inside the Helm chart, update the bidder service's enabled attribute to true.
bidder:
enabled: true
...
Then upgrade the deployment:
helm upgrade --install my-sp1-cluster infra/charts/sp1-cluster \
-f infra/charts/sp1-cluster/values-test.yaml \
-n sp1-cluster-test \
--debug
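Once both services are enabled and the deployment is upgraded, you can sanity-check that their pods are running. The fulfiller/bidder pod-name match below is an assumption about how the chart names its workloads:

```shell
# Check that fulfiller and bidder pods exist after the upgrade.
# Guarded so the snippet degrades gracefully off the deploy machine.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n sp1-cluster-test | grep -E 'fulfiller|bidder' || \
    echo "no fulfiller/bidder pods found yet"
else
  echo "kubectl not available; run this from your deploy machine"
fi
```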