Streamline AI Model Hosting with Docker Swarm, Chroma DB, and Ollama

[date] 11.16.2024

#docker#swarm#ollama#chromadb#ai#self-hosted

Learn how to deploy Open WebUI seamlessly within a Docker Swarm deployment, integrating Chroma DB for efficient vector database management and Ollama for AI model hosting.

Deploy Open WebUI in Docker Swarm with Chroma DB and Ollama

Deploy OpenWebUI, Ollama, and Chroma Containers to a Docker Swarm

Deploying AI tools like OpenWebUI, Ollama, and ChromaDB in a Docker Swarm can seem daunting. This guide deploys all three as Swarm services from a single Docker stack file, with notes for both GPU and CPU hosts.

Overview

Docker Swarm runs OpenWebUI, Ollama, and ChromaDB as three separate services. The stack file below sets the environment variables that connect them, so most of the work is preparing the host and running one deploy command.

Prerequisites

Basic understanding of Docker Swarm.
Docker Swarm configured on your host system.
Proper directories or volumes created for data storage:
```
mkdir -p data/open-webui data/chromadb data/ollama
```
GPU support (optional) with NVIDIA Container Toolkit installed and configured.

Deployment Steps

Step 1: Prepare Your Environment

With GPU Support: Ensure your host system has GPU support configured:
- Enable CUDA for your OS and GPU.
- Install the NVIDIA Container Toolkit.
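- Edit /etc/docker/daemon.json so the node advertises the GPU as a generic resource and uses the NVIDIA runtime by default (merge these keys with any the toolkit already added, such as the `runtimes` entry):
```
{
  "default-runtime": "nvidia",
  "node-generic-resources": ["NVIDIA-GPU=GPU-<YOUR_GPU_NUMBER>"]
}
```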
- Enable GPU resource advertising in /etc/nvidia-container-runtime/config.toml by uncommenting:
```
swarm-resource = "DOCKER_RESOURCE_GPU"
```
- Restart the Docker daemon:
```
sudo service docker restart
```
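With CPU Support: Modify the docker-stack.yaml file to remove the GPU-specific lines, namely the `resources` block (with its `generic_resources` reservation) under `deploy` in the `ollama` service.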
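
For the GPU path, the identifier that goes into daemon.json comes from the GPU's UUID. The sketch below turns UUIDs (one per line) into the `NVIDIA-GPU=<uuid>` strings that Docker's `node-generic-resources` daemon option accepts. The function name `gpu_resource_entries` and the sample UUID are made up for illustration, and whether you use the full UUID or a shortened prefix is up to you; the helper passes through whatever it is given.

```shell
# Hypothetical helper (not part of Docker or NVIDIA tooling):
# reads one GPU UUID per line on stdin, prints a JSON array of
# "NVIDIA-GPU=<uuid>" strings suitable for node-generic-resources.
gpu_resource_entries() {
  local first=1 uuid
  printf '['
  while IFS= read -r uuid; do
    [ -z "$uuid" ] && continue          # skip blank lines
    [ "$first" -eq 1 ] || printf ', '   # comma between entries
    printf '"NVIDIA-GPU=%s"' "$uuid"
    first=0
  done
  printf ']\n'
}

# On a GPU host, feed it the real UUIDs:
#   nvidia-smi --query-gpu=uuid --format=csv,noheader | gpu_resource_entries
# A made-up UUID stands in here so the snippet runs anywhere:
printf 'GPU-00000000-1111-2222-3333-444444444444\n' | gpu_resource_entries
```

Paste the printed array as the value of `node-generic-resources`, then restart the Docker daemon as described above.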

Step 2: Configure the Docker Stack

Below is a sample docker-stack.yaml file for deploying the three services:

version: '3.9'
services:
  openWebUI:
    image: ghcr.io/open-webui/open-webui:main
    depends_on: # ignored by "docker stack deploy" in swarm mode; kept for plain Compose use
      - chromadb
      - ollama
    volumes:
      - ./data/open-webui:/app/backend/data
    environment:
      DATA_DIR: /app/backend/data
      OLLAMA_BASE_URLS: http://ollama:11434
      CHROMA_HTTP_PORT: 8000
      CHROMA_HTTP_HOST: chromadb
      CHROMA_TENANT: default_tenant
      VECTOR_DB: chroma
      WEBUI_NAME: Awesome ChatBot
      CORS_ALLOW_ORIGIN: "*"
      RAG_EMBEDDING_ENGINE: ollama
      RAG_EMBEDDING_MODEL: nomic-embed-text-v1.5
      RAG_EMBEDDING_MODEL_TRUST_REMOTE_CODE: "True"
    ports:
      - target: 8080
        published: 8080
        mode: ingress
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3

  chromadb:
    hostname: chromadb
    image: chromadb/chroma:0.5.15
    volumes:
      - ./data/chromadb:/chroma/chroma
    environment:
      - IS_PERSISTENT=TRUE
      - ALLOW_RESET=TRUE
      - PERSIST_DIRECTORY=/chroma/chroma
    ports:
      - target: 8000
        published: 8000
        mode: ingress
    deploy:
      replicas: 1
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
    healthcheck:
      test: ["CMD-SHELL", "curl localhost:8000/api/v1/heartbeat || exit 1"]
      interval: 10s
      retries: 2
      start_period: 5s
      timeout: 10s

  ollama:
    image: ollama/ollama:latest
    hostname: ollama
    ports:
      - target: 11434
        published: 11434
        mode: ingress
    deploy:
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: "NVIDIA-GPU"
                value: 0
      replicas: 1
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
    volumes:
      - ./data/ollama:/root/.ollama
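
For a CPU-only deployment, the GPU reservation in the `ollama` service has to be removed before deploying. Editing by hand is fine; as a rough alternative, the sketch below filters it out with awk. It assumes the block is laid out exactly as in the sample above (a `resources:` line, ending at the `value:` line), and the function name and output file name are hypothetical.

```shell
# Hypothetical helper: drop everything from a bare "resources:" line
# through the following "value:" line, pass all other lines through.
strip_gpu_reservation() {
  awk '
    /^[[:space:]]*resources:[[:space:]]*$/ { skipping = 1 }
    !skipping { print }
    skipping && /^[[:space:]]*value:/ { skipping = 0 }
  '
}

# Demo on a fragment shaped like the ollama service's deploy section:
strip_gpu_reservation <<'EOF'
    deploy:
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: "NVIDIA-GPU"
                value: 0
      replicas: 1
EOF

# Against the real file (the output name is arbitrary):
#   strip_gpu_reservation < docker-stack.yaml > docker-stack.cpu.yaml
```

The demo prints only the `deploy:` and `replicas: 1` lines. Because this is plain text filtering rather than YAML parsing, review the result before deploying it.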

Step 3: Deploy the Stack

Deploy the stack with GPU support:

docker stack deploy -c docker-stack.yaml -d super-awesome-ai

Deploy with CPU support (after modifying the stack file):

docker stack deploy -c docker-stack.yaml -d super-awesome-ai
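
After deploying, it can help to confirm that each service answers on its published port. The following is a minimal sketch, assuming the ports from the stack file above and that `curl` is available on the node; `wait_for_http` is a hypothetical helper, not a Docker command.

```shell
# Hypothetical helper: poll a URL until it answers or attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -sS keeps it quiet except for errors
    if curl -fsS -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not reachable after $attempts attempts: $url" >&2
  return 1
}

# On a swarm node, once the stack is deployed:
#   docker stack services super-awesome-ai
#   wait_for_http http://localhost:8000/api/v1/heartbeat   # ChromaDB
#   wait_for_http http://localhost:11434/                  # Ollama
#   wait_for_http http://localhost:8080/                   # Open WebUI
```

`docker stack services` shows the replica count for each service; the helper only adds a wait loop so a script does not race the containers while they start.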

permalink: /posts/open-webui-docker-swarm

GitLab 16.0+ Method for Creating GitLab Runners on Kubernetes

[date] 08.07.2023

[mood] deploying things

[music] Carpenter Brut - Turbo Killer

#gitlab#kubernetes#helm#devops#ci-cd#runners

A walkthrough of the new GitLab 16.0+ runner creation workflow using authentication tokens, Kubernetes secrets, and Helm charts to deploy runners on your cluster.

As the title suggests, there is a new way to configure GitLab Runners. GitLab 16.0 introduced a runner creation workflow that registers runners with authentication tokens. The legacy workflow based on registration tokens is deprecated and, at the time of writing, scheduled for removal in GitLab 17.0.

We will be starting the process in the GitLab Web GUI.

Go to Settings > CI/CD in a Project
Select Runners > New Runner
Select Linux, then add tags and a description, and adjust the configuration if desired.

NOTE: Runner tags are deprecated in the TOML/Helm configuration, so this is where they are set now.

Copy the runner token and save it in 1Pass or Vault. If you are deploying to Kubernetes, you will need it for the next step.

Encoding the Token

To use this runner token in a Kubernetes Secret, Base64-encode it first:

echo -n 'glrt-isC-rVy7MUy1cWoxrV4b' | base64
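
A stray newline or a copy-paste slip in the encoded value only shows up later as a failed runner registration, so a quick round-trip check is worthwhile. This sketch uses the sample token from this post; substitute your own. The `-d` decode flag is the GNU coreutils spelling (some older macOS versions use `-D` instead).

```shell
token='glrt-isC-rVy7MUy1cWoxrV4b'   # sample token from this post

# printf '%s' avoids the trailing newline that a plain echo would add
encoded="$(printf '%s' "$token" | base64)"
decoded="$(printf '%s' "$encoded" | base64 -d)"

echo "$encoded"
if [ "$decoded" = "$token" ]; then
  echo "round-trip OK"
else
  echo "round-trip FAILED" >&2
fi
```

For the sample token this prints `Z2xydC1pc0MtclZ5N01VeTFjV294clY0Yg==`, the same string used in the Secret below.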

Creating the Namespace

If the namespace does not already exist, create it. Use this same namespace throughout the deployment:

kubectl create namespace gitlab-runner

Creating the Secret

Create the Kubernetes manifest for the Secret in the runner's namespace. We will call it secret.yml:

apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret
  namespace: gitlab-runner
type: Opaque
data:
  runner-registration-token: "" # need to leave as an empty string for compatibility reasons
  runner-token: "Z2xydC1pc0MtclZ5N01VeTFjV294clY0Yg==" # This is our Base64 string

Apply the secret to the namespace:

kubectl apply -f secret.yml
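
If you would rather not paste the encoded token into a file by hand, the manifest can be generated from the raw token. A minimal sketch follows; `render_runner_secret` and the `RUNNER_TOKEN` variable are hypothetical names, while the Secret name and keys mirror the manifest above.

```shell
# Hypothetical helper: print the Secret manifest for a given runner token.
# Usage: render_runner_secret TOKEN [NAMESPACE]
render_runner_secret() {
  local token="$1" namespace="${2:-gitlab-runner}" encoded
  # tr strips the line wrapping some base64 implementations add to long output
  encoded="$(printf '%s' "$token" | base64 | tr -d '\n')"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret
  namespace: ${namespace}
type: Opaque
data:
  runner-registration-token: ""
  runner-token: "${encoded}"
EOF
}

# Preview the manifest with the sample token from this post:
render_runner_secret 'glrt-isC-rVy7MUy1cWoxrV4b'

# To apply directly instead of writing secret.yml:
#   render_runner_secret "$RUNNER_TOKEN" | kubectl apply -f -
```

Piping to `kubectl apply -f -` keeps the token out of files on disk, which may matter if the working directory is a Git repository.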

Helm Chart

This is the fun part. We will use a Helmfile for this deployment, which installs gitlab-runner from GitLab's official Helm chart.

Here is the helmfile.yml using the external values file (recommended since we have RBAC also attached):

repositories:
  - name: gitlab
    url: https://charts.gitlab.io
releases:
  - name: gitlab-runner
    chart: gitlab/gitlab-runner
    namespace: gitlab-runner
    createNamespace: true
    version: 0.55.0
    installed: true
    values:
      - gl-runner-values.yaml

Runner Values

While there are many values and settings we can set, only the crucial ones are listed here. For the full list of available values, see the chart's page on Artifact Hub.

## GitLab Runner Image
image:
  registry: registry.gitlab.com
  image: gitlab-org/gitlab-runner

imagePullPolicy: IfNotPresent

gitlabUrl: https://gitlab.com/

unregisterRunners: true

replicas: 1

concurrent: 10

## RBAC
rbac:
  create: true
  rules:
    - resources: ["deployments", "configmaps", "pods", "pods/attach", "pods/exec",
                  "secrets", "services", "namespaces", "serviceaccounts"]
      apiGroups: ["*"]
      verbs: ["get", "list", "watch", "create", "patch", "delete", "update"]
    - resources: ["clusterroles", "clusterrolebindings", "secrets", "events"]
      apiGroups: ["*"]
      verbs: ["get", "patch", "update", "create", "list", "watch"]

  clusterWideAccess: true
  serviceAccountName: gitlab-runner

## Runner Configuration
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        image = "ubuntu:16.04"
        service_account = "gitlab-runner"
        service_account_overwrite_allowed = ".*"
        pull_policy = ["always", "if-not-present"]

  executor: kubernetes
  name: "gitlab-new-runner"
  secret: gitlab-runner-secret

Apply the Helmfile:

helmfile -f helmfile.yml apply

If done correctly, you should see these runners in your GitLab!
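
Before checking the GitLab UI, you can confirm from the cluster side that the runner pod came up. This is a rough sketch, assuming the `gitlab-runner` namespace used throughout this post; `runner_pods_running` is a hypothetical helper built on `kubectl get pods`.

```shell
# Hypothetical helper: succeed only when every pod in the namespace is Running.
# Usage: runner_pods_running [NAMESPACE]
runner_pods_running() {
  local ns="${1:-gitlab-runner}" phases phase
  # jsonpath prints the phase of each pod, separated by spaces
  phases="$(kubectl get pods -n "$ns" -o jsonpath='{.items[*].status.phase}')" || return 1
  [ -n "$phases" ] || return 1          # no pods scheduled yet
  for phase in $phases; do
    [ "$phase" = "Running" ] || return 1
  done
  return 0
}

# Usage after applying the Helmfile:
#   runner_pods_running gitlab-runner && echo "runner pod is up"
#   kubectl get pods -n gitlab-runner
```

A pod stuck in a phase other than Running often points at a problem with the Secret, so `kubectl describe pod` in that namespace is a reasonable next step.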

permalink: /posts/gitlab-runners-kubernetes

Open Source your blackbox Cisco firewall

[date] 02.14.2022

[mood] hacker mode

[music] Perturbator - Dangerous Days

#networking#cisco#opnsense#linux#homelab#hardware

Free your Cisco ASA from closed-source firmware and put Linux/BSD on it instead. A step-by-step guide to bypassing ROMMON and installing OPNSense on a Cisco ASA-5555X.

Install OPNSense or Linux on Cisco ASA

Cisco is a great network solution for most, but I don't think it is the "one-size-fits-all" solution most believe it to be.

Paywalls, closed source, and black-box technologies will be the death of privacy and security. While Cisco has done a lot and is considered the "gold standard," it isn't ideal for those who care about privacy due to its relation with IBM.

While researching this supposedly "impossible" task, I found a surprising amount of hostility surrounding the topic. I don't understand why, but I hope this post spares others the hive mind's pushback and saves them from wasting any more precious time.

Some of you reading this may think I am an absurd lunatic, and to a degree, that is totally valid; however, that is beside the point! This article's "point" is to help the curious and those interested in tinkering, learning, or even furthering the community and the technologies available. In my opinion, this could be the beginning of something that revolutionizes the way admins think of "dated" devices. This is a matter of unlocking doorways and reintroducing the mindset of "what can one make this device do" rather than just using it as is.

This "impossible" task may seem complicated, but I assure you it is a cakewalk! Once opened up, it's as easy as 1-2-3!

Requirements

Patience; this is a base requirement in general.
A live image of OPNSense flashed to a USB drive (or whatever OS you want to use)
- Make sure you review the base requirements for your operating system of choice; I am using OPNSense.
A storage device to go inside your ASA. I used an SSD; an HDD or a USB drive also works.
IDC 16-pin to VGA adapter ($6 USD from PCCABLES.com)

Get started

Identify your Device (Cisco ASA Model):
- Open the ASA up, read whatever documentation you can find on your model, and look into the specifications
- On the motherboard, you should see a PINOUT for VGA (16 PIN IDC). This PINOUT will be your entryway into the machine and allow you to bypass ROMMON.
Make your Bootable device:
- I used Rufus and the VGA OPNSense image. If you are here, I assume you already understand this topic, so I will leave you to your own devices.
Add your parts to the Cisco ASA:
- I added an SSD to the front of it, 250GB should be more than enough
- Add RAM if you want; I have 32GB already in mine but it depends on the specs/limits of your device
- Attach your Adapter and connect the VGA to a screen
- Attach a keyboard and your bootable device

Show time

Power on your device and enter the BIOS (F2 for me)
- The boot screen took almost a minute to appear after power-on
In the BIOS, disable ROMMON.
Switch the boot order: make the USB primary and the HDD (boot device) secondary.
Save changes and reboot.
Enjoy the POWER! Powering the ASA back on should find the bootable device.
- Install your Operating system!

Conclusion

As of writing this, I have OPNSense running on my Cisco ASA-5555X! These modifications have turned this once scrapped device into a fantastic multi-purpose machine for my home network, free of the subscriptions and closed-source limitations. The best part is that I control the device the way I want to.

I hope this helps; feel free to reach out -- I love to collaborate and converse! Let me know your experiences. I have seen ESXi running on these mini powerhouses, which is brilliant!

All the best, Dominic

permalink: /posts/asa-modding-opnsense
