How I Kept OpenClaw Alive After Anthropic Killed Third-Party Billing
On April 4, 2026, Anthropic silently revoked subscription billing for third-party AI harnesses. Here's the full story of how I rebuilt the request pipeline — from CLI backend to a 7-layer bidirectional proxy — to keep 13 autonomous agents running on my homelab without paying Extra Usage.
Homelab Architecture
Deep-dives into the evolving architecture of a memory-driven AI homelab
How I Kept OpenClaw Alive After Anthropic Killed Third-Party Billing
On April 4th, 2026, without announcement, Anthropic began routing API requests from third-party harnesses — including OpenClaw — away from subscription billing and into Extra Usage. Suddenly, every message to Sparky, my main Telegram agent, was accruing per-token charges on top of my Max subscription.
Within 48 hours I’d exhausted my Extra Usage cap. The agents went silent.
This post is the full story of what happened, how the API pipeline was rebuilt in two phases, and what the final 7-layer bidirectional proxy architecture looks like running on my DGX Spark.
What Changed on April 4th
Anthropic had always allowed Max and Pro subscribers to use the Claude API through third-party tools like OpenClaw. The OAuth token you get from claude auth login was accepted by api.anthropic.com and billed against your subscription.
On April 4th, Anthropic started fingerprinting requests from known harnesses. Requests that matched certain patterns — tool names, system prompt structures, header signatures — were reclassified as Extra Usage rather than subscription calls. The same token, the same model, the same API endpoint. Just a different billing bucket, silently enforced.
OpenClaw’s default system prompt is ~48K characters of structured configuration: agent team tables, workspace file paths, runtime metadata, MCP server listings. It turns out this is a very distinctive fingerprint.
Before April 4th:
┌─────────────┐ OAuth Token ┌──────────────────────┐
│ OpenClaw │ ──────────────────► │ api.anthropic.com │
│ (48K sys) │ │ → billed to MAX SUB │
└─────────────┘ └──────────────────────┘
After April 4th:
┌─────────────┐ OAuth Token ┌──────────────────────┐
│ OpenClaw │ ──────────────────► │ api.anthropic.com │
│ (48K sys) │ ← fingerprinted │ → billed to EXTRA │
└─────────────┘ └──────────────────────┘
💸 per token
Phase 1: The CLI Backend (April 5th)
The community’s first response was a Medium article: switch OpenClaw from the API path to the Claude Code CLI backend. Instead of sending HTTP requests to api.anthropic.com, OpenClaw would invoke the claude binary as a subprocess. The CLI authenticates as a first-party client — no fingerprinting, subscription billing restored.
The basic fix is one command:
openclaw models auth login \
--provider anthropic \
--method cli \
--set-default
That worked. For simple setups. Mine wasn’t simple.
Why the Basic Fix Broke Production
OpenClaw’s system prompt — the 48K blob — gets passed to the CLI via --append-system-prompt. Anthropic also fingerprints this flag when it comes from a known harness. Same detection, different vector.
I needed to intercept the CLI invocation, strip the original prompt, and inject a condensed version that preserved functionality without the identifiable structure. That meant a wrapper script.
/opt/claude-wrapper.sh
┌────────────────────────────────────────────┐
OpenClaw calls │ 1. Strip --append-system-prompt arg │
claude binary ──► │ 2. Load 22.9K condensed prompt from /opt/ │
(subprocess) │ 3. Auto-hydrate installed skills list │
│ 4. Exec real claude binary with new args │
└────────────────────────────────────────────┘
│
▼
~/.npm-global/bin/claude
│
▼
api.anthropic.com (CLI path)
→ billed to MAX subscription ✓
The fingerprinting is content-based, not size-based. I identified the three specific markers Anthropic detects:
| Marker | Example | Action |
|---|---|---|
| Workspace file path headers | ## /home/ghost/.openclaw/workspace/AGENTS.md | Strip |
| Section header | ## Workspace Files (injected) | Strip |
| Runtime metadata | ## Runtime: agent=main | host=openclaw | Strip |
Removing those three strings — keeping everything else — was enough. The threshold is tight: 22,933 characters passes, 23,029 characters is blocked.
The Token Renewal Chain
The CLI uses OAuth tokens from ~/.claude/.credentials.json. Those expire. On bare metal, you’d have to re-run claude auth login in a browser every few days.
My homelab solves this differently. The cli-proxy-api container on Tower (10.0.3.90) auto-refreshes the OAuth token every 15 minutes. A cron job on the DGX Spark pulls the fresh token down every 30 minutes:
Tower (10.0.3.90) DGX Spark (10.0.128.196)
┌──────────────────┐ ┌──────────────────────────┐
│ cli-proxy-api │ │ cron: */30 * * * * │
│ │ │ /opt/sync-claude-auth.sh │
│ OAuth refresh │ ──────►│ │
│ every 15 min │ pull │ ~/.claude/.credentials │
│ │ │ .json (always fresh) │
└──────────────────┘ └──────────────────────────┘
No browser re-auth ever. Token stays live, agents stay running.
What Broke Downstream
The CLI backend approach introduced new problems:
Problem 1: Non-streaming subprocess
OpenClaw expects streaming SSE responses.
The CLI blocks the entire subprocess for the duration of each call.
Result: Telegram's 2-minute typing TTL expires. Users see "..." forever.
Problem 2: Stale session accumulation
The CLI maintains session IDs in ~/.openclaw/agents/*/sessions/sessions.json
After gateway restarts or prompt changes, those IDs are dead.
OpenClaw tries --resume with a dead ID → fails → retries without system prompt.
Result: "New session started" on every Telegram message.
Problem 3: Watchdog kills
OpenClaw's gateway watchdog (noOutputTimeoutMs: 120000) kills long tool chains.
The CLI subprocess doesn't stream output mid-execution.
Result: Complex agent tasks silently die at 2 minutes.
These were livable for a few days. Then Extra Usage ran out anyway.
Phase 2: The Billing Proxy (April 10th)
The CLI backend was always a workaround. The real fix is intercepting at the HTTP layer — sitting between OpenClaw and api.anthropic.com and transforming requests to look like they originate from Claude Code itself.
This is what openclaw-billing-proxy does. Seven layers of bidirectional transformation, all in a single zero-dependency Node.js file.
The 7-Layer Architecture
Here’s the full request transformation pipeline. Every outbound request passes through all seven layers before reaching Anthropic. Every inbound response is reverse-mapped so OpenClaw sees its original identifiers.
OpenClaw Agent (Telegram / Web UI)
│
│ POST /v1/messages
│ model: claude-sonnet-4-6
│ x-api-key: sk-openclaw-proxy-key-2026
│ [OpenClaw tool names, OC properties, OC system prompt]
▼
┌─────────────────────────────────────────────────────┐
│ openclaw-billing-proxy :18801 │
│ │
│ ── OUTBOUND LAYERS ───────────────────────────── │
│ │
│ Layer 1 ▸ BILLING HEADER INJECTION │
│ │ Generates an 84-char SHA256-based billing │
│ │ identifier matching Claude Code's fingerprint │
│ │ format (BILLING_HASH_SALT + device ID). │
│ │ Injected as first line of system prompt. │
│ ▼ │
│ Layer 2 ▸ TOKEN SWAP │
│ │ Strips OpenClaw's API key from Authorization │
│ │ header. Loads OAuth token from │
│ │ ~/.claude/.credentials.json. Injects as │
│ │ Bearer token → requests bill to MAX sub. │
│ ▼ │
│ Layer 3 ▸ STRING SANITIZATION │
│ │ 30 trigger phrase replacements: │
│ │ "OpenClaw" → "ClaudeCode" │
│ │ "sessions_*" → "thread_*" │
│ │ "HEARTBEAT" → "KEEPALIVE" │
│ │ "openclaw-*" → "claude-*" │
│ │ ... 26 more patterns │
│ ▼ │
│ Layer 4 ▸ TOOL NAME FINGERPRINT BYPASS │
│ │ Renames all 31 OpenClaw tool names to │
│ │ PascalCase Claude Code convention: │
│ │ exec → Bash │
│ │ lcm_grep → ContextGrep │
│ │ lcm_read → FileRead │
│ │ lcm_write → FileWrite │
│ │ sessions_spawn → TaskSpawn │
│ │ ... 26 more │
│ │ Applied as quoted replacements throughout │
│ │ the ENTIRE request body (not just tool array). │
│ ▼ │
│ Layer 5 ▸ SYSTEM TEMPLATE BYPASS │
│ │ Strips ~28K of structured config sections │
│ │ (AGENTS.md tables, runtime metadata, paths). │
│ │ Replaces with ~0.5K natural prose paraphrase │
│ │ that preserves functional meaning without │
│ │ identifiable structure. │
│ ▼ │
│ Layer 6 ▸ TOOL DESCRIPTION STRIPPING │
│ │ Removes all tool description strings from │
│ │ the tool schema array. Reduces fingerprint │
│ │ surface area significantly — OC descriptions │
│ │ are verbose and distinctive. │
│ ▼ │
│ Layer 7 ▸ PROPERTY RENAMING │
│ │ Renames 8 OC-specific schema properties: │
│ │ session_id → thread_id │
│ │ workspace → context │
│ │ agent_name → assistant_name │
│ │ ... 5 more │
│ │ │
└──┼──────────────────────────────────────────────────┘
│
│ POST api.anthropic.com/v1/messages
│ Authorization: Bearer <oauth-token>
│ x-stainless-sdk-name: claude-code (emulated)
│ x-claude-code-version: 2.1.97 (emulated)
│ [Looks exactly like a Claude Code session]
▼
┌─────────────────────────────────────────────────────┐
│ api.anthropic.com │
│ │
│ ✓ Billing identifier recognized │
│ ✓ OAuth token valid (Max subscription) │
│ ✓ Tool names match Claude Code convention │
│ ✓ System prompt not flagged │
│ → Billed to MAX SUBSCRIPTION │
│ │
│ Streams SSE response with Claude Code tool names │
└──────────────────────┬──────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ openclaw-billing-proxy :18801 │
│ │
│ ── INBOUND LAYER ─────────────────────────────── │
│ │
│ Layer 8 ▸ FULL REVERSE MAPPING │
│ │ Restores ALL original identifiers: │
│ │ Bash → exec │
│ │ ContextGrep → lcm_grep │
│ │ FileRead → lcm_read │
│ │ thread_id → session_id │
│ │ ... every rename reversed │
│ │ │
│ │ Applied to BOTH: │
│ │ • SSE streaming chunks (line by line) │
│ │ • Full JSON responses │
│ │ │
│ │ OpenClaw sees its original tool names, │
│ │ paths, and identifiers. Zero code changes. │
│ ▼ │
└──┼──────────────────────────────────────────────────┘
│
▼
OpenClaw receives normal response
Tool calls use expected OC names ✓
Streaming works natively ✓
Sessions persist correctly ✓
Why Tool Name Renaming Is The Critical Layer
The v1 proxy (April 5-8) only did string sanitization — Layers 1-3. It worked for four days until Anthropic deployed an update on April 8th that added tool-name fingerprinting.
OpenClaw tool names follow a snake_case convention with prefixes (lcm_, sessions_, flow_). Claude Code tools follow PascalCase with no prefixes (Bash, FileRead, WebFetch). These naming conventions are as distinctive as a fingerprint.
By renaming every tool in the outbound request body — not just the tool array, but every reference throughout the entire JSON — and reversing the mapping in every response chunk, the proxy makes the entire session structurally indistinguishable from a real Claude Code session.
OUTBOUND (what Anthropic sees):
"tools": [
{ "name": "Bash", ... }, ← was "exec"
{ "name": "FileRead", ... }, ← was "lcm_read"
{ "name": "FileWrite", ... }, ← was "lcm_write"
{ "name": "ContextGrep",... }, ← was "lcm_grep"
{ "name": "TaskSpawn", ... } ← was "sessions_spawn"
]
INBOUND (what OpenClaw sees):
"content": [
{ "type": "tool_use", "name": "exec", ... }, ← was "Bash"
{ "type": "tool_use", "name": "lcm_read", ... }, ← was "FileRead"
{ "type": "tool_use", "name": "sessions_spawn", ... } ← was "TaskSpawn"
]
Homelab Topology: Where the Proxy Sits
INTERNET
│
Cloudflare Tunnel
(cloudflared 10.0.3.66)
│
┌───────────────────┼───────────────────┐
│ │ │
▼ ▼ ▼
tower.vitalemazo vault.vitalemazo openclaw.vitalemazo
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────┐
│ TOWER (Unraid 10.0.128.2) │
│ │
│ ┌─────────────┐ ┌──────────────┐ │
│ │ cli-proxy- │ │ vault │ │
│ │ api :8317 │ │ 10.0.3.75 │ │
│ │ (OAuth mgr) │ │ (secrets) │ │
│ └──────┬──────┘ └──────────────┘ │
│ │ token sync │
└─────────┼───────────────────────────────────────┘
│ every 15min refresh
│ every 30min pull (cron)
▼
┌─────────────────────────────────────────────────┐
│ DGX SPARK spanky1 (10.0.128.196) │
│ │
│ ~/.claude/.credentials.json (always fresh) │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ openclaw-billing-proxy :18801 │ │
│ │ systemd service (ghost user) │ │
│ │ /opt/openclaw-billing-proxy/proxy.js │ │
│ │ │ │
│ │ Subscription: MAX Token: 7.7h │ │
│ │ Requests served: live │ │
│ └──────────────┬──────────────────────────┘ │
│ │ localhost:18801 │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ openclaw-gateway :18789 │ │
│ │ systemd --user service │ │
│ │ │ │
│ │ 13 agents → cli-proxy/claude-* │ │
│ │ main (sonnet-4-6) │ │
│ │ trader (opus-4-6) ← trading bot │ │
│ │ orchestrator (opus-4-6) │ │
│ │ sentinel (haiku-4-5) │ │
│ │ + 9 more... │ │
│ └──────────────────────────────────────────┘ │
│ │
│ k3s cluster: ArgoCD, Prometheus, TEI, vLLM... │
└───────────────────┬───────────────────────────────┘
│
▼
api.anthropic.com
(billed to MAX subscription)
Deployment
The proxy is a single Node.js file with zero dependencies. On the DGX Spark I run it as a systemd service under the ghost user:
# Clone
sudo git clone https://github.com/zacdcook/openclaw-billing-proxy /opt/openclaw-billing-proxy
sudo chown -R ghost:ghost /opt/openclaw-billing-proxy
# Systemd unit
cat > /etc/systemd/system/openclaw-billing-proxy.service << 'EOF'
[Unit]
Description=OpenClaw Billing Proxy
After=network.target
[Service]
Type=simple
User=ghost
ExecStart=/usr/bin/node /opt/openclaw-billing-proxy/proxy.js --port 18801
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now openclaw-billing-proxy
Then update ~/.openclaw/openclaw.json to route the cli-proxy provider to the new port:
"models": {
"providers": {
"cli-proxy": {
"baseUrl": "http://127.0.0.1:18801",
"apiKey": "any-value-proxy-replaces-it",
"api": "anthropic-messages"
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "cli-proxy/claude-sonnet-4-6"
}
}
}
That’s it. Clear stale sessions, restart the gateway, done:
rm -f ~/.openclaw/agents/*/sessions/sessions.json
systemctl --user restart openclaw-gateway
Verify the proxy is up and reading a valid token:
curl http://127.0.0.1:18801/health
{
"status": "ok",
"proxy": "openclaw-billing-proxy",
"version": "2.2.3",
"requestsServed": 0,
"uptime": "67s",
"tokenExpiresInHours": "7.7",
"subscriptionType": "max",
"layers": {
"stringReplacements": 30,
"toolNameRenames": 31,
"propertyRenames": 8,
"ccToolStubs": 5,
"systemStripEnabled": true,
"descriptionStripEnabled": true
}
}
Before and After
| Phase 0 (pre-April 4) | Phase 1 (CLI backend) | Phase 2 (billing proxy) | |
|---|---|---|---|
| Billing path | Extra Usage (broken) | Max subscription ✓ | Max subscription ✓ |
| Transport | HTTP streaming | CLI subprocess | HTTP streaming ✓ |
| Telegram latency | Normal | High (2min TTL issues) | Normal ✓ |
| Session stability | Stable | Stale session restarts | Stable ✓ |
| Watchdog kills | None | Frequent | None ✓ |
| Token renewal | Automatic | Cron sync from Tower | Cron sync from Tower |
| Fingerprint bypass | N/A | Prompt wrapper | 7-layer transforms ✓ |
| Code changes to OC | None | None | None |
What Isn’t Solved
The token in ~/.claude/.credentials.json expires roughly every 8 hours. My Tower cron keeps it fresh in normal operation. If Tower goes down or the cron misfires, the proxy’s token will eventually expire and requests will start failing with 401s.
The health endpoint exposes tokenExpiresInHours, so it’s straightforward to alert on. I’ll add a Prometheus scrape target to the kube-prometheus-stack and a Grafana alert when that drops below 1 hour. That’s a follow-up post.
The other open item: the trader agent’s cron jobs (crypto scan every 30 minutes, equity scan during market hours) were wiped when this was all being sorted out. With the proxy stable, those are next to restore.
Closing Thought
The pattern here — detecting and transforming request fingerprints bidirectionally, in a zero-dependency proxy that sits transparently in the pipeline — is more broadly applicable than just OpenClaw. Any tool that relies on a subscription credential to reach an API that fingerprints its clients faces the same architecture problem.
The solution isn’t to fight the detection at the client. It’s to intercept at the transport layer, transform outbound, reverse inbound, and present a face the API recognizes. The tool on one side sees its own world. The API on the other sees what it expects. The proxy is the translation layer.
Seven layers sounds like a lot. It’s 800 lines of Node.js with no imports.
Running 13 agents on a DGX Spark in my homelab. Tower is an Unraid box with 24 Docker containers. All infrastructure as code. All secrets in Vault. Prior posts in this series: Memory-Driven AI Homelab, K3s Migration, OpenClaw on the Rise.
Related Posts
Building a Memory-Driven AI Homelab: DGX Spark, Knowledge Graphs, and 20 Containers From Soup to Nuts
A surgical deep-dive into running an NVIDIA DGX Spark with K3s, multi-agent AI orchestration, three-layer persistent memory (QMD vector search, Graphiti knowledge graph, MuninnDB cognitive memory), and 20 Docker containers on Unraid — all wired together with MCP servers, HashiCorp Vault, Cloudflare Access, and a custom API layer.
From Systemd to Kubernetes: Running AI Workloads on K3s with ArgoCD GitOps
Migrating two vLLM models from bare systemd services to a production K3s cluster on the DGX Spark — with NVIDIA GPU Operator time-slicing, ArgoCD app-of-apps GitOps, kube-prometheus-stack monitoring, and Cloudflare Access + Auth0 SSO protecting five web dashboards.
AI Orchestration for Network Operations: Autonomous Infrastructure at Scale
How a single AI agent orchestrates AWS Global WAN infrastructure with autonomous decision-making, separation-of-powers governance, and 10-100x operational acceleration.
Comments & Discussion
Discussions are powered by GitHub. Sign in with your GitHub account to leave a comment.