Streamline AI Model Hosting with Docker Swarm, Chroma DB, and Ollama
Learn how to deploy Open WebUI seamlessly within a Docker Swarm deployment, integrating Chroma DB for efficient vector database management and Ollama for AI model hosting.

Deploy OpenWebUI, Ollama, and Chroma Containers to a Docker Swarm
Deploying AI tools like OpenWebUI, Ollama, and ChromaDB in a Docker Swarm can seem daunting. This guide uses a single Docker stack file to deploy the three containers as Swarm services, and it covers both GPU and CPU-only hosts.
Overview
This deployment method uses Docker Swarm to run OpenWebUI, Ollama, and ChromaDB as separate services. The environment variables in the stack file are pre-configured so that OpenWebUI reaches Ollama and ChromaDB by service name.
Prerequisites
- Basic understanding of Docker Swarm.
- Docker Swarm configured on your host system.
- Directories or volumes created for data storage:
  mkdir -p data/open-webui data/chromadb data/ollama
- GPU support (optional) with NVIDIA Container Toolkit installed and configured.
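The prerequisites above can be checked with a short script. This is a minimal sketch, assuming it is run from the directory that will hold docker-stack.yaml (the stack file mounts ./data/... relative paths); the function names make_data_dirs and swarm_state are hypothetical helpers, not part of Docker.

```shell
#!/bin/sh
# Sketch: prepare a node for the stack described in this guide.

# Create the three bind-mount directories under a base directory (default: .).
make_data_dirs() {
  base="${1:-.}"
  for name in open-webui chromadb ollama; do
    mkdir -p "$base/data/$name" || return 1
  done
}

# Print the local Swarm state ("active" on a node that is part of a swarm).
swarm_state() {
  docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null || echo "unknown"
}

make_data_dirs . || echo "could not create the data directories" >&2

if command -v docker >/dev/null 2>&1; then
  echo "Swarm state: $(swarm_state)"
else
  echo "docker CLI not found; skipping the Swarm check" >&2
fi
```

Because these are bind mounts, the directories need to exist on whichever node a task is scheduled on, so on a multi-node swarm the script would have to run on each candidate node.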
Deployment Steps
Step 1: Prepare Your Environment
- With GPU Support: Ensure your host system has GPU support configured:
  - Enable CUDA for your OS and GPU.
  - Install the NVIDIA Container Toolkit.
  - Edit /etc/docker/daemon.json so the node advertises the GPU as a generic resource, merging this key into any existing settings:
    { "node-generic-resources": ["NVIDIA-GPU=GPU-<YOUR_GPU_NUMBER>"] }
  - Enable GPU resource advertising in /etc/nvidia-container-runtime/config.toml by uncommenting:
    swarm-resource = "DOCKER_RESOURCE_GPU"
  - Restart the Docker daemon:
    sudo service docker restart
- With CPU Support: Modify the docker-stack.yaml file to remove the GPU-specific lines, that is, the resources block under deploy in the ollama service.
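For the GPU path above, the GPU-<YOUR_GPU_NUMBER> placeholder has to be filled in with an identifier from the host. The sketch below assumes the placeholder stands for the GPU's UUID as reported by nvidia-smi; filter_gpu_ids, list_gpu_ids, and advertised_resources are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: find GPU identifiers and check what the node advertises to Swarm.

# Keep only well-formed "GPU-..." identifiers, one per line, from stdin.
filter_gpu_ids() {
  grep -E '^GPU-[0-9a-fA-F-]+$' || true
}

# Ask the NVIDIA driver for the UUID of every GPU on this host.
list_gpu_ids() {
  nvidia-smi --query-gpu=uuid --format=csv,noheader 2>/dev/null | filter_gpu_ids
}

# Show the resources this node reports to the swarm (manager nodes only).
advertised_resources() {
  docker node inspect self --format '{{json .Description.Resources}}' 2>/dev/null \
    || echo "node inspect failed (not a swarm manager?)"
}

if command -v nvidia-smi >/dev/null 2>&1; then
  list_gpu_ids
else
  echo "nvidia-smi not found; the NVIDIA driver may not be installed" >&2
fi

if command -v docker >/dev/null 2>&1; then
  advertised_resources
fi
```

After the Docker daemon restarts, the node's reported resources should include an NVIDIA-GPU entry if the advertising step took effect.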
Step 2: Configure the Docker Stack
Below is a sample docker-stack.yaml file for deploying the three services:
version: '3.9'
services:
  openWebUI:
    image: ghcr.io/open-webui/open-webui:main
    depends_on:
      - chromadb
      - ollama
    volumes:
      - ./data/open-webui:/app/backend/data
    environment:
      DATA_DIR: /app/backend/data
      OLLAMA_BASE_URLS: http://ollama:11434
      CHROMA_HTTP_PORT: 8000
      CHROMA_HTTP_HOST: chromadb
      CHROMA_TENANT: default_tenant
      VECTOR_DB: chroma
      WEBUI_NAME: Awesome ChatBot
      CORS_ALLOW_ORIGIN: "*"
      RAG_EMBEDDING_ENGINE: ollama
      RAG_EMBEDDING_MODEL: nomic-embed-text-v1.5
      RAG_EMBEDDING_MODEL_TRUST_REMOTE_CODE: "True"
    ports:
      - target: 8080
        published: 8080
        # Valid publish modes are "ingress" (routing mesh) and "host".
        mode: ingress
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
  chromadb:
    hostname: chromadb
    image: chromadb/chroma:0.5.15
    volumes:
      - ./data/chromadb:/chroma/chroma
    environment:
      - IS_PERSISTENT=TRUE
      - ALLOW_RESET=TRUE
      - PERSIST_DIRECTORY=/chroma/chroma
    ports:
      - target: 8000
        published: 8000
        mode: ingress
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
    healthcheck:
      test: ["CMD-SHELL", "curl localhost:8000/api/v1/heartbeat || exit 1"]
      interval: 10s
      retries: 2
      start_period: 5s
      timeout: 10s
  ollama:
    image: ollama/ollama:latest
    hostname: ollama
    ports:
      - target: 11434
        published: 11434
        mode: ingress
    deploy:
      # GPU reservation: remove this resources block for a CPU-only deployment.
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: "NVIDIA-GPU"
                value: 0
      replicas: 1
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
    volumes:
      - ./data/ollama:/root/.ollama
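For the CPU-only setup from Step 1, the GPU reservation in the ollama service can be removed by hand or with a small filter. The following is a rough sketch rather than a YAML-aware tool: it assumes space-indented YAML and drops every resources key together with the lines nested under it, so it would also remove CPU or memory limits if the file had any. The function name strip_resources_block and the output filename docker-stack.cpu.yaml are hypothetical.

```shell
#!/bin/sh
# Sketch: write a CPU-only copy of the stack file without the resources block.

# Print the file given as $1, skipping each "resources:" line and every
# following line that is indented more deeply than it.
strip_resources_block() {
  awk '
    function indent_of(s) { match(s, /^ */); return RLENGTH }
    skipping && indent_of($0) > base { next }   # still inside the dropped block
    { skipping = 0 }
    /^ *resources:[ \t]*$/ { base = indent_of($0); skipping = 1; next }
    { print }
  ' "$1"
}

src="docker-stack.yaml"
cpu_stack_name="docker-stack.cpu.yaml"   # hypothetical output filename
if [ -f "$src" ]; then
  strip_resources_block "$src" > "$cpu_stack_name"
  echo "Wrote $cpu_stack_name"
fi
```

Review the generated file before deploying, then either replace the original stack file with it or pass the new filename to the -c option of docker stack deploy.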
Step 3: Deploy the Stack
- Deploy the stack with GPU support:
  docker stack deploy -c docker-stack.yaml -d super-awesome-ai
- Deploy with CPU support (after modifying the stack file):
  docker stack deploy -c docker-stack.yaml -d super-awesome-ai
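Because the deploy command returns before the services have started, it can help to check whether the stack has converged. The sketch below assumes the stack name used in this guide; all_converged and stack_replicas are hypothetical helper names, and the check simply compares the running and desired replica counts that docker stack services reports.

```shell
#!/bin/sh
# Sketch: report whether every service in the stack has all replicas running.
STACK="super-awesome-ai"

# Read "name current/desired" lines from stdin. Succeed only if there is at
# least one line and every service has current equal to desired.
all_converged() {
  awk '
    { split($2, r, "/"); seen++; if (r[1] != r[2]) pending++ }
    END { exit (seen > 0 && pending == 0) ? 0 : 1 }
  '
}

# Print one "name current/desired" line per service in the stack.
stack_replicas() {
  docker stack services --format '{{.Name}} {{.Replicas}}' "$STACK"
}

if command -v docker >/dev/null 2>&1; then
  if stack_replicas 2>/dev/null | all_converged; then
    echo "All services in $STACK are running."
  else
    echo "Not converged yet; inspect with: docker service ps ${STACK}_ollama" >&2
  fi
fi
```

Once the services are up, the published ports in the stack file suggest two quick checks: the ChromaDB heartbeat at http://localhost:8000/api/v1/heartbeat (the same endpoint the healthcheck uses) and the OpenWebUI interface on port 8080.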


