Bare-metal guide

The fastest path is the one-line installer. It downloads a sha256-verified release tarball from GitHub, drops binaries into /usr/local/bin (or $HOME/.cognitora/bin if root is unavailable), and prints PATH guidance.

curl -fsSL https://raw.githubusercontent.com/antonellof/cognitora-inference/main/deploy/installer/install.sh | sh
cgn-ctl pki bootstrap                 # generates dev PKI material
cgn-ctl install baremetal             # systemd units + config
systemctl enable --now cognitora.target

The installer respects these env vars:

| Variable | Purpose |
| --- | --- |
| CGN_VERSION | Pin a tag (default: latest GitHub release). |
| CGN_PREFIX | Install prefix (default: /usr/local or $HOME/.cognitora). |
| CGN_REPO | GitHub owner/name (default: antonellof/cognitora-inference). |
| CGN_BASE_URL | Override artefact host (useful for forks/mirrors). |
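
When the installer is piped into sh, the variables have to be set on the sh side of the pipe, since that is the process that runs install.sh. A minimal sketch, assuming the installer reads them from its environment; install_cognitora is a hypothetical wrapper name, not something the project ships:

```shell
# Hypothetical wrapper: pins a version and a prefix for one installer run.
# Assumption: install.sh reads CGN_VERSION and CGN_PREFIX from its own
# environment, so they are set on `sh`, not on `curl`.
install_cognitora() {
    # $1: release tag to pin, $2: install prefix
    curl -fsSL https://raw.githubusercontent.com/antonellof/cognitora-inference/main/deploy/installer/install.sh \
        | CGN_VERSION="$1" CGN_PREFIX="$2" sh
}
```

For example, install_cognitora v0.2.0 "$HOME/.cognitora" would request that tag under the per-user prefix.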

After ~5 seconds:

curl http://127.0.0.1:8080/v1/models

returns the live model list. The next sections describe each step in more detail.
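
In scripts, polling the endpoint is usually more robust than sleeping for a fixed time. The sketch below retries the same request until it succeeds or a retry budget runs out; wait_for_models and its default of 30 attempts are illustrative choices, not part of cgn-ctl:

```shell
# Hypothetical helper: poll the models endpoint once per second.
# $1: URL to poll (defaults to the endpoint shown above)
# $2: maximum number of attempts (defaults to 30, an arbitrary choice)
wait_for_models() {
    url="${1:-http://127.0.0.1:8080/v1/models}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl exit non-zero on HTTP errors (4xx/5xx).
        if curl -fsS "$url" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

Calling wait_for_models with no arguments returns 0 as soon as the endpoint answers and 1 if it never does.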

What gets installed

| Path | Owner | Purpose |
| --- | --- | --- |
| /usr/local/bin/cgn-* | root | the six binaries |
| /etc/cognitora/cognitora.toml | root | rendered config (idempotent) |
| /etc/cognitora/pki/{ca,leaf}.{crt,key} | root | dev PKI material |
| /etc/cognitora/keys.txt | root | API keys file (sha256 hashes) |
| /var/lib/cognitora/ | cognitora | state (kv, model cache) |
| /var/log/cognitora/ | cognitora | logs |
| /run/cognitora/ | cognitora | UDS sockets |
| /etc/systemd/system/cgn-*.service | root | systemd units |
| /etc/systemd/system/cognitora.target | root | aggregator |

A new cognitora system user owns the runtime data; the binaries themselves stay owned by root.
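
To confirm the ownership split after an install, something like the following can help. It is a sketch only: check_owner is a hypothetical helper, and stat -c is GNU coreutils syntax, so it assumes a typical Linux host.

```shell
# Hypothetical helper: report whether a path exists and has the expected owner.
# $1: path to inspect, $2: expected owning user
check_owner() {
    path="$1"
    want="$2"
    if [ ! -e "$path" ]; then
        echo "MISSING $path"
        return 1
    fi
    # %U prints the owning user name (GNU stat).
    got="$(stat -c %U "$path")"
    if [ "$got" = "$want" ]; then
        echo "ok      $path ($got)"
    else
        echo "WRONG   $path (owner $got, expected $want)"
        return 1
    fi
}
```

For example, check_owner /etc/cognitora/cognitora.toml root and check_owner /var/lib/cognitora cognitora cover one path from each owner listed above.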

Topology

For HA, run cgn-router on at least two non-GPU hosts (or behind your load balancer). cgn-agent and cgn-kvcached always run together on every GPU host because they share a Unix socket for KV transfers. cgn-metrics can run on any host that can reach the BMC and the Prometheus endpoints.

Example small cluster:

| Host | Role |
| --- | --- |
| lb1 | cgn-router × 2 (active/active behind HAProxy) |
| gpu1..N | cgn-agent + cgn-kvcached + vLLM (one of each) |
| obs1 | cgn-metrics + Prometheus + Grafana |
| etcd1..3 | etcd cluster |
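
With more than one router, it can be useful to probe each instance directly and bypass the load balancer. The sketch below assumes every router serves /v1/models on its own host:port; check_routers is a hypothetical helper, and the addresses passed to it depend on your configuration.

```shell
# Hypothetical helper: probe each router instance directly.
# Arguments: one or more host:port pairs.
check_routers() {
    failed=0
    for target in "$@"; do
        # --max-time bounds each probe so one dead host cannot stall the loop.
        if curl -fsS --max-time 3 "http://$target/v1/models" >/dev/null 2>&1; then
            echo "up   $target"
        else
            echo "down $target"
            failed=1
        fi
    done
    # 0 when every router answered, 1 when at least one did not.
    return "$failed"
}
```

The exit status makes it easy to use from a cron job or a deployment check.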

Key rotation

cgn-ctl key create alice                # prints the plaintext token
cgn-ctl key create build-bot --read-only
cgn-ctl key revoke <id>
cgn-ctl key lock                        # disables the file until unlock

The keys file is hot-reloaded by the router; no restart needed.

Upgrade

curl -fsSL https://raw.githubusercontent.com/antonellof/cognitora-inference/main/deploy/installer/install.sh \
  | CGN_VERSION=v0.2.0 sh
systemctl restart cognitora.target

The installer always verifies the sha256 sum before overwriting binaries. Cosign signature verification also runs when cosign is on the PATH.
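
The same kind of checksum comparison can be done by hand on a tarball you downloaded yourself. This is a sketch of the general technique, not the installer's own code; verify_sha256 is a hypothetical helper, and the expected digest has to come from wherever the release publishes it.

```shell
# Hypothetical helper: compare a file's sha256 digest with an expected value.
# $1: file to check, $2: expected lowercase hex digest
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | cut -d' ' -f1)"
    [ "$actual" = "$expected" ]
}
```

The function returns 0 on a match and non-zero otherwise, so it can gate a manual install with a plain `&&`.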

Uninstall

systemctl disable --now cognitora.target
rm -f /etc/systemd/system/cgn-*.service /etc/systemd/system/cognitora.target
systemctl daemon-reload
rm -rf /etc/cognitora /var/lib/cognitora /var/log/cognitora /run/cognitora
userdel cognitora
rm -f /usr/local/bin/cgn-{ctl,router,agent,kvcached,metrics,operator}
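
To double-check that nothing was left behind, a small helper can list any of the removed paths that still exist. list_leftovers is a hypothetical name used here for illustration:

```shell
# Hypothetical helper: print every given path that still exists.
list_leftovers() {
    found=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "$p"
            found=1
        fi
    done
    # 0 means clean, 1 means at least one path remains.
    return "$found"
}
```

For example, list_leftovers /etc/cognitora /var/lib/cognitora /var/log/cognitora /run/cognitora /etc/systemd/system/cognitora.target prints nothing and returns 0 after a complete uninstall.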