
How to Self-Host Hermes Agent on a $6/mo VPS for 24/7 Phone Access

Purpose

This post shows how to run Hermes Agent on a small VPS so it stays up after I close my laptop and answers me on Telegram. Total cost: about $6/mo for the VPS plus usage of a cheap LLM API (about $13/mo for me), so roughly $20/mo all-in. The key point is the four-step stack: provision → install → wire a chat bridge → tunnel the webhook.

The architecture I am building:

Figure: the five layers of an AI coding harness (context manager, tool/permission system, loop/scheduler, provider/model adapter, UI)

The image is the same five-layer model from the previous post. On a VPS, all five layers run as a single process I control, and the LLM call at the bottom is just an outbound HTTPS request to whatever model I point at. The chat bridge is a thin new layer on the side that lets the same agent answer from a phone.

Environment

  • Hetzner CX22 (4 GB RAM, ~€4.5/mo) on Ubuntu 24.04 LTS
  • DeepSeek V4 Pro Max for inference (~$13/mo for one user)
  • Telegram bot token from BotFather
  • Cloudflare tunnel in front of the webhook
  • MacBook as the dev workstation, iPhone as the phone

Step 1 — Provision a Hetzner CX22 and harden SSH

I create the smallest server that fits the Python process, then close every public port except SSH.

harden-ssh.sh
ssh root@<vps-ip>
apt update && apt install -y ufw fail2ban
ufw allow OpenSSH
ufw enable
systemctl enable --now fail2ban

After logging in as root, the apt line updates the package index and installs ufw (a small firewall) and fail2ban (an SSH brute-force blocker). I then allow only SSH through the firewall and turn it on. That gives me a default-deny posture for inbound traffic: anything arriving on a port other than 22 gets dropped, while outbound connections from the box still work.
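
To confirm the firewall from the outside, I can probe a few ports from my laptop. The sketch below is a minimal check using only the Python standard library; the function name and the port list are my own choices, and the address in the usage comment is a documentation placeholder, not a real server.

```python
import socket


def reachable_ports(host: str, ports: list[int], timeout: float = 2.0) -> list[int]:
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        try:
            # create_connection completes a TCP handshake or raises OSError
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            # refused, timed out (filtered), or unreachable: count as closed
            pass
    return open_ports


# Usage (placeholder address, substitute the VPS IP):
#   reachable_ports("203.0.113.10", [22, 80, 443, 8080])
```

With the ufw rules above in place, I would expect only port 22 to show up in the result. A dropped packet looks like a timeout rather than a refusal, so each closed port costs up to `timeout` seconds.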

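Next I disable root login and password login over SSH, leaving key-only access. This only works if a non-root user with sudo rights and my public key in its ~/.ssh/authorized_keys already exists; without one, these settings lock me out of the box.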

/etc/ssh/sshd_config.d/10-harden.conf
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes

reload-sshd.sh
sudo sshd -t && sudo systemctl reload ssh

The sshd -t step is the one I always forget and regret. It runs a config syntax check before reload, so a typo does not lock me out of a remote box.
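
The syntax check does not tell me which values sshd ends up using once all drop-in files are merged. `sshd -T` (capital T) prints the effective configuration as lowercase `keyword value` lines, so a small script can compare it with what I intended. This is a sketch: the helper names are mine, and the expected values simply mirror 10-harden.conf.

```python
import subprocess
from typing import Optional

# Settings from 10-harden.conf; `sshd -T` prints keywords in lowercase.
EXPECTED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
}


def parse_sshd_dump(text: str) -> dict[str, str]:
    """Parse `sshd -T` output ("keyword value" per line) into a dict."""
    settings: dict[str, str] = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            # keep the first occurrence of each keyword
            settings.setdefault(key.lower(), value.strip())
    return settings


def mismatches(text: str, expected: dict[str, str] = EXPECTED) -> dict[str, Optional[str]]:
    """Return {keyword: actual value} for every setting that differs from `expected`."""
    actual = parse_sshd_dump(text)
    return {k: actual.get(k) for k, v in expected.items() if actual.get(k) != v}


def effective_sshd_config() -> str:
    """Run `sshd -T` (needs root) and return its output."""
    result = subprocess.run(["sshd", "-T"], check=True, capture_output=True, text=True)
    return result.stdout


# Usage on the VPS, as root:
#   print(mismatches(effective_sshd_config()))   # an empty dict means all three match
```

If another drop-in file overrides one of the three settings, it shows up here with its actual value instead of silently weakening the setup.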

Step 2 — Install Hermes as a systemd unit

Hermes is a Python process. I install it under its own user, in a virtualenv, and pin it with systemd so a crash restarts it.

install-hermes.sh
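sudo apt install -y python3-venv    # needed for `python3 -m venv` on Ubuntu
sudo useradd -r -m -d /opt/hermes -s /bin/bash hermes
# create the virtualenv and install as the hermes user so it owns every file
sudo -u hermes python3 -m venv /opt/hermes/venv
sudo -u hermes /opt/hermes/venv/bin/pip install hermes-agent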

Then a systemd unit so it survives reboots:

/etc/systemd/system/hermes.service
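[Unit]
Description=Hermes Agent
# wait for the network, since every LLM call is an outbound request
Wants=network-online.target
After=network-online.target

[Service]
User=hermes
WorkingDirectory=/opt/hermes
ExecStart=/opt/hermes/venv/bin/hermes serve
Restart=always
# pause between restarts so a crash loop does not trip systemd's start limit
RestartSec=5
EnvironmentFile=/opt/hermes/.env

[Install]
WantedBy=multi-user.target

enable-hermes.sh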
sudo systemctl daemon-reload
sudo systemctl enable --now hermes
sudo systemctl status hermes

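The last command should print active (running). If it does, Hermes is up before I even wire the chat bridge. If it reports failed instead, the first thing to check is that /opt/hermes/.env exists, because systemd refuses to start a unit whose EnvironmentFile is missing.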
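
The unit loads its secrets from /opt/hermes/.env, a file of plain KEY=VALUE lines. Below is a hedged pre-flight check I could run before starting the service. TELEGRAM_BOT_TOKEN is the variable the channel config in this post refers to; DEEPSEEK_API_KEY is a hypothetical name for the LLM key, so adjust it to whatever Hermes actually reads.

```python
import os
import stat

# TELEGRAM_BOT_TOKEN comes from this post's config; DEEPSEEK_API_KEY is an assumed name.
REQUIRED_KEYS = ("TELEGRAM_BOT_TOKEN", "DEEPSEEK_API_KEY")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # or ; comments are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env_file(path: str, required=REQUIRED_KEYS) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    if not os.path.isfile(path):
        return [f"{path} does not exist"]
    problems = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        # secrets should be readable by the owner only (chmod 600)
        problems.append(f"{path} is accessible by group/others (mode {mode:o})")
    with open(path, encoding="utf-8") as fh:
        values = parse_env_file(fh.read())
    for key in required:
        if not values.get(key):
            problems.append(f"{key} is missing or empty")
    return problems


# Usage on the VPS (run with sudo so the file is readable):
#   print(check_env_file("/opt/hermes/.env"))
```

The parser is deliberately simpler than systemd's own rules (no line continuations or escape handling), which is enough for a file that only holds a few tokens.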

Step 3 — Wire a Telegram bot to Hermes

I create a bot through BotFather, drop the token into Hermes’s channel config, then point the Telegram webhook at my server.

hermes.yaml
channels:
  telegram:
    enabled: true
    bot_token: ${TELEGRAM_BOT_TOKEN}
    allowed_users: ["<your-telegram-id>"]
  slack:
    enabled: true
    app_token: ${SLACK_APP_TOKEN}
    channel: "#hermes"
automations:
  - name: morning_research_brief
    schedule: "0 7 * * *"
    prompt: "Summarize the last 24h of my Obsidian inbox and 3 RSS feeds"
  - name: inbox_triage
    schedule: "*/30 * * * *"
    prompt: "Triage unread emails, draft replies for the high-priority ones"
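
The schedule strings look like standard five-field cron expressions (minute, hour, day of month, month, day of week). Assuming Hermes reads them that way, "0 7 * * *" fires once a day at 07:00 and "*/30 * * * *" fires at minute 0 and 30 of every hour. The sketch below is my own minimal matcher to make that concrete; it is not Hermes's scheduler, it skips ranges like 1-5, and it requires all five fields to match.

```python
from datetime import datetime


def _field_matches(field: str, value: int, base: int = 0) -> bool:
    """Match one cron field: '*', '*/n', single numbers, or a comma list of those."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # steps count from the start of the field's range (0 or 1)
            if (value - base) % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """True if the five-field cron expression fires at the minute of `when`."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(dom, when.day, base=1)
        and _field_matches(month, when.month, base=1)
        and _field_matches(dow, cron_dow)
    )


# Usage:
#   cron_matches("0 7 * * *", datetime(2025, 1, 6, 7, 0))
#   cron_matches("*/30 * * * *", datetime(2025, 1, 6, 13, 15))
```

One thing worth checking on a VPS is the timezone: many cloud images default to UTC, so 07:00 may not be 07:00 where I live.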

The allowed_users list is the only thing stopping a stranger from driving my agent. Telegram user IDs are integers; get yours by messaging @userinfobot.
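
To show what such an allowlist has to do, here is a hedged sketch of a deny-by-default filter over a Telegram Bot API update. The field layout (`message.from.id` as an integer) follows the Bot API; the function names are mine, and I am not claiming this is how Hermes implements it.

```python
from typing import Iterable, Optional, Union


def sender_id(update: dict) -> Optional[int]:
    """Extract the sender's numeric ID from a Telegram update, if present."""
    for kind in ("message", "edited_message", "callback_query"):
        sender = (update.get(kind) or {}).get("from")
        if sender and "id" in sender:
            return int(sender["id"])
    return None


def is_allowed(update: dict, allowed_users: Iterable[Union[int, str]]) -> bool:
    """Deny by default: only updates from an allowlisted ID pass."""
    # Normalise to int so a quoted ID in YAML ("123456") still matches
    # the integer Telegram sends.
    allowed = {int(u) for u in allowed_users}
    uid = sender_id(update)
    return uid is not None and uid in allowed


# Usage:
#   update = {"update_id": 1, "message": {"from": {"id": 123456}, "text": "hi"}}
#   is_allowed(update, ["123456"])
```

An empty list rejects everyone, and an unreplaced placeholder such as "<your-telegram-id>" raises a ValueError instead of quietly matching nothing, which is the failure mode I would rather have.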

Step 4 — Expose the webhook through a Cloudflare tunnel

I do not open a public web port on the VPS. cloudflared opens an outbound tunnel to Cloudflare’s edge, and the webhook reaches Hermes through that.

cloudflared-tunnel.sh
cloudflared tunnel login    # one-time: authorise cloudflared for my Cloudflare account
cloudflared tunnel create hermes
cloudflared tunnel route dns hermes hermes.example.com
# write the config file below first, then install the service
cloudflared service install

~/.cloudflared/config.yml
tunnel: hermes
credentials-file: /root/.cloudflared/<tunnel-id>.json
ingress:
  - hostname: hermes.example.com
    service: http://localhost:8080
  - service: http_status:404

verify-tunnel.sh
curl https://hermes.example.com/healthz
# -> {"status":"ok"}

If the curl prints that JSON body, the tunnel is alive and Telegram can reach the webhook. Adding -i to the command also shows the 200 status line.
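
Once the hostname answers, Telegram still has to be told where to deliver updates. The Bot API method for that is setWebhook, which takes a `url` and an optional `secret_token`. The sketch below builds that call with the standard library. The webhook path in the usage comment is hypothetical, since the path Hermes listens on is not covered here, and Hermes may well register the webhook itself.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_set_webhook_request(
    bot_token: str, webhook_url: str, secret_token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a Bot API setWebhook call."""
    api_url = f"https://api.telegram.org/bot{bot_token}/setWebhook"
    params = {"url": webhook_url}
    if secret_token:
        # Telegram echoes this value back in a header on every webhook request
        params["secret_token"] = secret_token
    body = urllib.parse.urlencode(params).encode()
    return urllib.request.Request(api_url, data=body, method="POST")


def set_webhook(bot_token: str, webhook_url: str, secret_token: Optional[str] = None) -> dict:
    """Send the request and return Telegram's JSON reply (it contains an "ok" field)."""
    req = build_set_webhook_request(bot_token, webhook_url, secret_token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Usage (the path is an assumption, the token comes from BotFather):
#   set_webhook(token, "https://hermes.example.com/telegram/webhook", "some_random_string")
```

Splitting the builder from the sender keeps the token out of logs while testing, because the request can be inspected without any network call.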

How it works

A message lands in Telegram, Telegram hits the webhook at hermes.example.com, the Cloudflare tunnel forwards it to localhost:8080 on the VPS, and Hermes asks DeepSeek what to do. The reply walks the same path back. The whole loop is just HTTPS in and HTTPS out, with no public port open on the box other than SSH.
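
Because hermes.example.com is a public hostname, anyone who learns the URL can POST to it, not just Telegram. The Bot API has a guard for this: if a `secret_token` was registered with setWebhook, Telegram sends it in the `X-Telegram-Bot-Api-Secret-Token` header of every webhook request. Whether Hermes checks that header is an assumption on my part; the sketch below only shows what such a check could look like.

```python
import hmac

SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token"


def is_from_telegram(headers: dict, expected_secret: str) -> bool:
    """True only if the request carries the secret registered with setWebhook."""
    # HTTP header names are case-insensitive, so compare them in lowercase
    received = next(
        (v for k, v in headers.items() if k.lower() == SECRET_HEADER.lower()), ""
    )
    # an empty expected secret must never authenticate anything;
    # compare_digest avoids leaking the match position through timing
    return bool(expected_secret) and hmac.compare_digest(
        received.encode(), expected_secret.encode()
    )


# Usage inside a webhook handler:
#   if not is_from_telegram(request_headers, expected_secret):
#       reject the request with 403 before touching the agent
```

This complements allowed_users rather than replacing it: the header shows the request came from Telegram, and the allowlist decides which Telegram user may drive the agent.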

Common mistakes

  • Exposing the agent’s raw API port. Always tunnel.
  • Using the VPS for inference. The VPS orchestrates; the LLM is a cloud API.
  • Sharing one Hermes instance across multiple clients without a secrets manager. Each tenant needs isolated credentials.
  • Treating the VPS Hermes as a backup. The whole point is that it is the always-on primary, not a fallback for when the laptop is off.

Summary

In this post, I showed how to put Hermes on a $6/mo VPS and reach it from Telegram. The key point is the four-step stack: provision and harden the server, install Hermes as a systemd unit, wire a chat bridge, and tunnel the webhook through Cloudflare. Once it is up, the agent runs 24/7 for about $20/mo all-in (VPS plus LLM API).

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

