# Kubernetes Deployment

Deploy Immich Memories to Kubernetes with NVIDIA GPU support. The manifests live in `deploy/kubernetes/`.
## Prerequisites

- NVIDIA GPU Operator installed in your cluster:

  ```bash
  helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
  helm repo update
  helm install gpu-operator nvidia/gpu-operator \
    --namespace gpu-operator \
    --create-namespace
  ```

- A RuntimeClass for NVIDIA (usually created automatically by the GPU Operator):

  ```yaml
  apiVersion: node.k8s.io/v1
  kind: RuntimeClass
  metadata:
    name: nvidia
  handler: nvidia
  ```

- A StorageClass available for the PVCs
## Quick Start

```bash
cd deploy/kubernetes

# Create the secret with your Immich credentials
cp secret.yaml.example secret.yaml

# Edit with your actual values
vim secret.yaml

# Deploy everything
kubectl apply -k .
```
Or deploy resources individually if you prefer:

```bash
kubectl apply -f namespace.yaml
kubectl apply -f secret.yaml
kubectl apply -f configmap.yaml
kubectl apply -f pvc.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
```
## Access the UI

```bash
kubectl port-forward -n immich-memories svc/immich-memories 8080:80

# Open http://localhost:8080
```
## Batch Jobs

Run one-off video generation via the CLI instead of the UI:

```bash
# Edit job.yaml with your parameters
kubectl apply -f job.yaml

# Watch the logs
kubectl logs -n immich-memories -f job/immich-memories-generate

# Check the output
kubectl exec -n immich-memories deployment/immich-memories -- ls -la /output/
```
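The job manifest has roughly this shape — a sketch only: the image name, CLI arguments, and most fields here are assumptions, so check `job.yaml` for the real values (the job name matches the `kubectl logs` command above):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: immich-memories-generate
  namespace: immich-memories
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      runtimeClassName: nvidia
      containers:
        - name: generate
          image: immich-memories:latest   # assumed image name
          args: ["generate"]              # assumed CLI subcommand
          resources:
            limits:
              nvidia.com/gpu: "1"
```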
## Configuration

### GPU Resources

The deployment requests one NVIDIA GPU by default. Adjust in `deployment.yaml`:

```yaml
resources:
  requests:
    nvidia.com/gpu: "1"
  limits:
    nvidia.com/gpu: "1"
```
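In context, the relevant part of the pod spec looks something like this — a sketch; the container name and surrounding structure are assumptions, but note the `runtimeClassName` tying the pod to the RuntimeClass from the prerequisites:

```yaml
spec:
  template:
    spec:
      runtimeClassName: nvidia   # the RuntimeClass created by the GPU Operator
      containers:
        - name: immich-memories  # assumed container name
          resources:
            requests:
              nvidia.com/gpu: "1"
            limits:
              nvidia.com/gpu: "1"
```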
### Node Selection

Pods schedule on nodes labeled `nvidia.com/gpu.present=true`. Change the `nodeSelector` if your cluster uses different labels:

```yaml
nodeSelector:
  nvidia.com/gpu.present: "true"
  # Or your custom label:
  # gpu-node: "true"
```
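If your GPU nodes are tainted (a common pattern in GPU Operator setups), the pod also needs a matching toleration. A sketch, assuming the conventional `nvidia.com/gpu` taint key — adjust to whatever taint your cluster actually applies:

```yaml
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
```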
### Storage

Default PVC sizes:

| PVC | Size | Purpose |
|---|---|---|
| Output | 50Gi | Generated videos |
| Cache | 20Gi | Downloaded assets and analysis cache |

Adjust the sizes in `pvc.yaml` based on how many videos you plan to generate.
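For example, growing the output volume might look like this — a sketch; the PVC name and storage class are assumptions, so check `pvc.yaml` for the actual names:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: immich-memories-output   # assumed name; check pvc.yaml
  namespace: immich-memories
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard     # replace with your StorageClass
  resources:
    requests:
      storage: 100Gi             # bumped from the 50Gi default
```

Note that PVC sizes can only be increased after creation, and only if the StorageClass supports volume expansion.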
The Cache PVC holds `cache.db` (analysis scores). This is the most valuable volume: it stores all LLM scoring results, and losing it means re-analyzing your entire library on the next run.

```bash
# Back up the cache from the running pod
kubectl exec -n immich-memories deployment/immich-memories -- \
  immich-memories cache backup /output/cache-backup.db

# Or export scores as portable JSON
kubectl exec -n immich-memories deployment/immich-memories -- \
  immich-memories cache export /output/scores.json
```
## Sealed Secrets

For production, don't commit plain secrets. Use sealed-secrets instead (this requires the Sealed Secrets controller to be running in your cluster):

```bash
# Install kubeseal
brew install kubeseal

# Create and seal the secret
cp secret.yaml.example secret.yaml
# Fill in your values, then seal
kubeseal --format=yaml < secret.yaml > sealed-secret.yaml

# Apply
kubectl apply -f sealed-secret.yaml
```
## Monitoring

The deployment includes liveness and readiness probes that hit `/health`. The `/health` endpoint returns JSON with `status`, `immich_reachable`, `last_successful_run`, and `version` fields.

For monitoring tools like Uptime Kuma or your existing health check infrastructure, point them at the `/health` endpoint on port 8080.
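The probes in `deployment.yaml` have roughly this shape — a sketch; the timing values are illustrative, not the actual settings:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
```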