YOUR 24/7 AI AGENT IN ESTONIA.
Run Claude Code, OpenClaw, RAG pipelines and smaller local language models on your own VPS. The API models do the heavy lifting, while your sessions, files and data stay fully under your control on the server.
Why move AI onto a server?
A laptop is for experimenting. A server is for getting work done. A VPS keeps the agent session alive and runs webhooks, scheduled workflows, RAG databases and private tools even while your own machine is closed.
- Persistence: tmux/systemd keep agents running
- Access: SSH, Tailscale, Caddy, HTTPS
- Control: files, vector stores and logs on your server
Smaller local models
Open-weight ecosystem
Smaller open-weight models are a great fit for privacy, RAG, classification, summarisation and internal tools. Large frontier-class models (DeepSeek V4, Qwen3-235B, Llama 4 Maverick, Kimi K2.6) usually need a dedicated GPU setup or an external inference service.
- ✓ Fixed monthly cost — no surprises
- ✓ Data stays 100% on your server
- ✓ Customisable models (fine-tuning, LoRA)
Autonomous agents
API-based
API-based agents where the heavy computation runs on the provider’s servers (Anthropic, OpenAI, Google). Your VPS holds the agent’s memory, workflows, files and sessions. tmux keeps the session alive, Tailscale keeps the connection secure.
- ✓ The agent runs around the clock, even after you close the terminal
- ✓ Pay-per-token — pay only for what you use
- ✓ 2026 frontier models: GPT-5.5, Claude Opus 4.8 / Fable 5, Gemini 3.1 Pro — built for complex code analysis, agentic workflows and long-context processing.
Your server, your rules.
Clean Linux, root access and full freedom to choose what you run. Here are a few examples.
Local LLMs
Smaller open-weight models (Gemma 4, Phi-4, smaller Qwen3) are ideal for private RAG, classification and internal tools. Larger models need a dedicated GPU setup.
RAG pipelines
Connect your sensitive documents, databases and internal wiki to AI. The data stays under your control, not in a third party’s cloud.
Autonomous agents
Claude Code, Hermes and other agents in a tmux session. Hand off a task and let the agent work in the background — a dropped SSH connection doesn’t interrupt it.
LoRA, adapters and AI Lab
LoRA/adapter experiments on smaller models are possible on an appropriately sized server. Training larger models and heavier GPU inference needs separate hardware or a custom AI Lab arrangement.
The three most useful agents on your server.
Your virtual senior developer.
Claude Code is a CLI agent that reads your codebase, writes tests and makes Git commits. On the VPS it runs inside a tmux session. You give it a task from your phone, put the phone in your pocket, and the agent keeps refactoring files in the background.
1) Open Termius on your phone → SSH into your VPS
2) Give Claude Code a task in the tmux session
3) Put your phone away while the agent works in the background
4) Two hours later the work is done, and you review it through Caddy
This is vibe coding at its best.
Your 24/7 personal AI assistant.
OpenClaw turns your VPS into a personal assistant that lives in your WhatsApp, Telegram or Slack. It can browse the web, run scripts, read your email and send reminders. State and memory stay 100% on your server.
- The software is free: you only pay for the API usage of the model you choose
- Pick your own model: Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, DeepSeek V4, or a local model via Ollama
- Open source: adapt and extend it as you need
Unlike SaaS solutions, you control every layer yourself — from the model to the infrastructure.
An agent that learns from your workflows.
Hermes Agent (Nous Research) is OpenClaw’s most serious challenger: it doesn’t just execute tasks — it learns from them. The agent creates its own skills on the fly, refines them during use and remembers what it learns across sessions. State and memory stay 100% on your server.
- A closed learning loop: it creates and improves skills on its own, with no manual skill files
- Persistent cross-session memory (FTS5 search) plus a seven-layer security model
- 300+ model providers (Anthropic, OpenAI, OpenRouter, local) plus cron and sub-agents
One command — hermes claw migrate — brings your OpenClaw setup across. Same VPS, a sharper brain.
Which server do you need?
Three tiers to choose from — depending on whether you run API-based agents, local RAG, or a more serious AI lab.
Agent Starter
API-based workflows
- ✓ Claude Code, OpenClaw, Hermes
- ✓ Telegram/Discord bots, webhooks, cron
- ✓ 24/7 tmux sessions
- ✓ 2–4 GB RAM is enough
Recommended for
Solo developer · Hobby project · Telegram bot
Private AI Server
Local RAG and smaller LLMs
- ✓ ChromaDB, pgvector, Ollama
- ✓ Gemma 4, Phi-4, smaller Qwen models
- ✓ Private documents and vector stores
- ✓ 8–32 GB RAM, depending on the model
Recommended for
Internal company tool · Private RAG · Compliance-sensitive
AI Lab
Custom GPU setup
- ✓ Large open-weight models (DeepSeek V4, Qwen3-235B, Llama 4 Maverick, Kimi K2.6)
- ✓ LoRA fine-tuning and adapter experiments
- ✓ Heavier inference loads
- ✓ GPU or a separate arrangement — we don’t offer a standard GPU plan, but we build custom setups
Recommended for
Researcher · ML team · Frontier-model inference
What do you actually do with this server?
Three examples of how other developers and companies do real work with their AI server.
Telegram customer support
A customer messages the Telegram bot, OpenClaw receives it, retrieves an answer from your company docs via RAG, and replies in natural language. Logs and conversations stay on your server.
Cost
VPS + a small API cost, depending on query volume and the model.
Recommended tier
AGENT STARTER (VPS 2/3)
Private company search
Internal wiki, NDA documents, project files — all vectorised into ChromaDB or pgvector. A local model (Gemma 4, Phi-4 or a smaller Qwen) generates the answers. Data never leaves for an external AI API; the whole pipeline runs locally.
Cost
VPS cost only — no external API fees when the whole pipeline runs locally.
Recommended tier
PRIVATE AI SERVER (8–32 GB RAM)
Autonomous developer
Claude Code runs in a tmux session 24/7. It helps review pull requests, writes tests and refactors code on your command. You hand off a task from your phone with Termius, and the agent works in the background until it’s done.
Cost
VPS + a Claude Max subscription or API usage.
Recommended tier
AGENT STARTER (VPS 2/3)
Clean OS, full freedom.
A practical guide to the initial VPS setup. We provide an unmanaged, clean OS (or your own ISO) — you build your AI environment on top of it.
Unmanaged VPS
Virtuaal.com
Virtuaal.com gives you clean Linux (Ubuntu, AlmaLinux, Debian, or your own uploaded ISO) and root access. No bloatware, no restrictive middle layers. Thanks to the clean OS you don’t always need the largest plan: for API-based agents, even VPS 2 or a dedicated custom setup gets the job done.
Security
How-to
Whenever possible, don’t expose SSH to the whole internet. Use Tailscale, fixed allow-listed IPs and SSH-key-only login. If public SSH is temporarily required, restrict it by IP and disable password login.
1) Install Tailscale
$ curl -fsSL https://tailscale.com/install.sh | sh
$ sudo tailscale up
2) UFW — SSH only over Tailscale
$ sudo ufw default deny incoming
$ sudo ufw default allow outgoing
$ sudo ufw allow in on tailscale0 to any port 22 proto tcp
$ sudo ufw enable
3) Verify
$ sudo ufw status verbose
$ tailscale ip -4
Persistence
Tmux · Systemd · RC
tmux and systemd make sure your agents keep running even when your SSH connection drops.
$ tmux new -s ai-agent
$ cd ~/projects/my-project
$ claude --remote-control
# detach: Ctrl+B, then D — or steer from phone
$ tmux attach -t ai-agent
New: Remote Control (claude --remote-control) lets you steer the same local session from your phone or browser. tmux/systemd keeps the process alive; Remote Control gives you access from anywhere (Claude Code v2.1.51+).
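The tmux commands above cover interactive sessions. For a non-interactive process such as a bot or a webhook listener, a systemd unit is the usual alternative. The following is a minimal sketch, not a vetted production unit: the user `agent`, the working directory and the `run-agent.sh` launcher are hypothetical placeholders, and the function only prints the unit text so you can review it before installing.

```shell
#!/usr/bin/env bash
# Sketch: print a systemd unit for a long-running agent process.
# All three arguments are placeholders you supply; nothing is written to disk.
render_unit() {
  local run_user="$1" workdir="$2" exec_start="$3"
  cat <<EOF
[Unit]
Description=AI agent (example unit)
After=network-online.target
Wants=network-online.target

[Service]
User=${run_user}
WorkingDirectory=${workdir}
ExecStart=${exec_start}
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Review the output first, then install it (hypothetical paths and unit name):
#   render_unit agent /home/agent/projects/my-project /home/agent/bin/run-agent.sh \
#     | sudo tee /etc/systemd/system/ai-agent.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now ai-agent
```

With `Restart=on-failure`, systemd restarts the process after a crash and starts it again after a reboot, which tmux alone does not do.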
Access
Caddy
A Caddy reverse proxy gives you secure web-UI access. Automatic Let’s Encrypt SSL and a dead-simple config.
Caddyfile
ai.sinudomeen.ee {
    reverse_proxy localhost:18789
}
Validate & reload
$ sudo caddy validate --config /etc/caddy/Caddyfile
$ sudo systemctl reload caddy
Caddy’s automatic HTTPS assumes the domain’s DNS record points to your VPS and ports 80/443 are open. For a private UI, don’t expose it to the public internet — use Tailscale, an IP restriction or Basic Auth.
SSH hardening (baseline)
sshd_config
After testing SSH-key login, set these defaults:
$ sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
$ sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
$ sudo sshd -t
$ sudo systemctl reload ssh
Warning: Only do this after you have successfully tested SSH-key login in a SECOND SSH session; otherwise you risk locking yourself out of the server. sshd -t checks the syntax before the reload. On AlmaLinux the service is named sshd, so reload it with sudo systemctl reload sshd.
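The sed commands above edit only the main sshd_config. On distributions whose config includes drop-in files from /etc/ssh/sshd_config.d/, a drop-in may still override these values, so it is worth checking the effective settings. The helper below is a small sketch (the function name `check_sshd_hardening` is our own, not part of OpenSSH): it reads the output of `sshd -T`, which prints the effective configuration in lowercase, and confirms that both settings are in force.

```shell
#!/usr/bin/env bash
# Sketch: read `sshd -T` output on stdin and confirm both hardening settings
# are in effect. Returns 0 when both are set, 1 otherwise.
check_sshd_hardening() {
  local effective setting status=0
  effective="$(cat)"
  for setting in "passwordauthentication no" "permitrootlogin no"; do
    # -x matches the whole line, -i ignores case
    if printf '%s\n' "$effective" | grep -qix "$setting"; then
      echo "ok:      $setting"
    else
      echo "MISSING: $setting"
      status=1
    fi
  done
  return "$status"
}

# Usage on the server:
#   sudo sshd -T | check_sshd_hardening
```

If a setting is reported as missing, look for an overriding file under /etc/ssh/sshd_config.d/ before closing your second SSH session.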
It all starts with one command.
The commands are illustrative. Before installing, always check the project’s official documentation and never paste unknown install commands onto a server unverified.
# Examples verified: July 2026
Pay only for what you use.
Smart model routing keeps your bills under control. Send simple queries to a cheap model and heavy analysis to a more capable one.
The VPS bill comes from us — a fixed monthly amount regardless of usage. You’ll find exact prices and plans on our servers page.
See the plans →
The API bill comes straight from the provider (Anthropic, OpenAI, Google). You pay exactly as much as your agents consume.
"What’s the weather in Tallinn today?" — a simple question/chat
"Refactor this module and write tests" — complex code analysis
(illustrative — the exact cost depends on the model and query size)
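The routing idea can be sketched as a small shell function. This is an illustration under stated assumptions, not how any particular agent routes requests: the function name `pick_tier`, the keyword list and the 200-word threshold are all made up for the example, and the function returns a tier label that you would map to the models you actually use.

```shell
#!/usr/bin/env bash
# Sketch: choose a model tier for a prompt. The keywords and the 200-word
# threshold are illustrative assumptions, not tuned values.
pick_tier() {
  local prompt="$1" lower words
  lower="$(printf '%s' "$prompt" | tr '[:upper:]' '[:lower:]')"
  words=$(( $(printf '%s' "$prompt" | wc -w) ))

  # Coding-style requests go to the capable tier.
  case "$lower" in
    *refactor*|*"write tests"*|*debug*|*"code review"*)
      echo "capable"
      return
      ;;
  esac

  # Long prompts also go to the capable tier; everything else stays cheap.
  if [ "$words" -gt 200 ]; then
    echo "capable"
  else
    echo "cheap"
  fi
}

pick_tier "What's the weather in Tallinn today?"    # prints: cheap
pick_tier "Refactor this module and write tests"    # prints: capable
```

A keyword heuristic like this is crude but costs nothing to run; a misrouted simple question only costs a little more, while a misrouted hard task can be retried on the capable tier.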
Subscription vs API — which to pick?
For everyday use a subscription is usually far cheaper than per-token API: a heavy coding session on the API can cost many times a fixed monthly plan. Key nuance — subscriptions only apply to first-party CLIs.
Claude Code → a subscription (Pro/Max) is the cheapest path for daily coding.
OpenClaw and Hermes → API key, OpenRouter or a local model. Third-party agents cannot use a subscription (Anthropic restricted this in April 2026).
| Service | Subscription | Price/mo | Tool |
|---|---|---|---|
| Anthropic Claude | Pro / Max 5x / Max 20x | $20 / $100 / $200 | Claude Code |
| OpenAI ChatGPT | Plus / Pro / Pro | $20 / $100 / $200 | Codex CLI |
| Google Gemini | AI Pro / AI Ultra | ~$20 / ~$100 | Gemini CLI |
| Local (Ollama) | — | €0 model cost | OpenClaw / Hermes / any |
* Subscription prices as of July 2026 — check with the provider. A local model = VPS cost only, no per-token fee.
* Figures are illustrative. The actual cost depends on the model, query volume and context length. API providers bill in USD.
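The subscription-versus-API comparison comes down to simple arithmetic. The sketch below uses the $100 plan price from the table; the $8 cost per heavy session is a hypothetical figure chosen for the example, not a measured one, so substitute your own numbers from the provider’s usage dashboard.

```shell
#!/usr/bin/env bash
# Sketch: compare a flat monthly plan with pay-per-token usage.
# All amounts are in US cents to keep the arithmetic in integers.
cheaper_option() {
  local plan_cents="$1" session_cents="$2" sessions="$3"
  local api_cents=$(( session_cents * sessions ))
  if [ "$api_cents" -gt "$plan_cents" ]; then
    echo "subscription"
  else
    echo "api"
  fi
}

# $100 plan vs. 20 sessions at a hypothetical $8 each ($160 on the API):
cheaper_option 10000 800 20    # prints: subscription
# $100 plan vs. 5 such sessions ($40 on the API):
cheaper_option 10000 800 5     # prints: api
```

This comparison only applies where a subscription is allowed at all, which according to the section above means first-party CLIs such as Claude Code.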
Questions and answers
Ready to put your AI agents to work?
Move your development environment to the cloud, where it’s secure, fast and always available.