
OpenClaw Multi-Agent + CLIProxyAPIPlus Complete Deployment Guide


Environment: Mac Mini M4 (16GB RAM) · macOS Sequoia
Goal: Build a multi-agent system with CLIProxyAPIPlus proxying a ChatGPT 5.4 OAuth + OpenRouter (Kimi K2.5) dual-channel API, with automatic failover and intelligent model routing
Date: March 2026


Architecture Overview

┌───────────────────────────┐
│  OpenClaw Gateway         │
│  (Multi-Agent System)     │
│                           │
│  ┌─────────────────────┐  │      ┌───────────────────────────────┐
│  │ Agent: Coordinator  │──┼─────▶│  CLIProxyAPIPlus              │
│  │ Agent: Marketing    │  │      │  localhost:8317               │
│  │ Agent: Developer    │  │      │                               │
│  │ Agent: Researcher   │  │      │  ┌─────────────────────────┐  │
│  │ Agent: Support      │  │      │  │ Primary: ChatGPT 5.4    │──┼──▶ OpenAI OAuth
│  └─────────────────────┘  │      │  │ (OAuth, no API key)     │  │
│                           │      │  ├─────────────────────────┤  │
│  Telegram Bots            │      │  │ Fallback: OpenRouter API│──┼──▶ Kimi K2.5
│  (1 Bot per Agent)        │      │  │ (Kimi K2.5 auto-rotate) │  │
└───────────────────────────┘      │  └─────────────────────────┘  │
                                   └───────────────────────────────┘

How it works: All Agent API requests hit localhost:8317 (CLIProxyAPIPlus). The proxy prioritizes your ChatGPT Plus/Pro subscription's OAuth token to call GPT-5.4. When the quota is exhausted or rate-limited, it automatically falls back to Kimi K2.5 via OpenRouter.


Part 1: Install CLIProxyAPIPlus

1.1 Install Go

brew install go
echo 'export PATH=$PATH:$(go env GOPATH)/bin' >> ~/.zshrc
source ~/.zshrc
go version  # Verify installation

1.2 Clone and Build CLIProxyAPIPlus

mkdir -p ~/ai-infra && cd ~/ai-infra
git clone https://github.com/router-for-me/CLIProxyAPIPlus.git
cd CLIProxyAPIPlus
go build -o cliproxyapi ./cmd/server
chmod +x cliproxyapi

# Optional: Add to PATH
sudo cp cliproxyapi /usr/local/bin/

1.3 Create Config Directory

mkdir -p ~/.cli-proxy-api

1.4 Create config.yaml

This is the most critical configuration file. We configure both ChatGPT OAuth (primary) and OpenRouter API Key (fallback):

# ~/.cli-proxy-api/config.yaml
# CLIProxyAPIPlus — ChatGPT 5.4 OAuth + OpenRouter Kimi K2.5

port: 8317

# Management UI
remote-management:
  allow-remote: false
  secret-key: "your-secret-key-change-me"

# OAuth token storage directory
auth-dir: "~/.cli-proxy-api"

# Debug mode (enable during initial setup, disable once stable)
debug: true
logging-to-file: true

# Usage statistics
usage-statistics-enabled: true

# Retry configuration
request-retry: 3

# Behavior when quota is exceeded
quota-exceeded:
  switch-project: true
  switch-preview-model: true

# API Key authentication
auth:
  # OpenClaw uses this key to authenticate with CLIProxyAPIPlus
  providers:
    - type: "api-key"
      key: "sk-openclaw-proxy-key-2026"
  
  # Gemini API keys (unused here; the OpenRouter fallback is configured
  # under openai-compatibility below)
  generative-language-api-key: []

# ═══════════════════════════════════════════
# Third-Party Provider — OpenRouter (Kimi K2.5)
# ═══════════════════════════════════════════
openai-compatibility:
  - name: "openrouter"
    base-url: "https://openrouter.ai/api/v1"
    api-keys:
      - key: "sk-or-v1-YOUR_OPENROUTER_KEY"
    models:
      - "moonshotai/kimi-k2.5"
      - "moonshotai/kimi-k2-0905"
      - "anthropic/claude-sonnet-4"
      - "anthropic/claude-haiku-4.5"
      - "google/gemini-2.5-flash"

# ═══════════════════════════════════════════
# Model Aliases — Unified model names for OpenClaw
# ═══════════════════════════════════════════
model-aliases:
  # Primary: GPT-5.4 (via ChatGPT OAuth)
  - alias: "gpt-5.4"
    target: "gpt-5.4-2026-03-05"
    channel: "codex"

  - alias: "gpt-5.4-thinking"
    target: "gpt-5.4-2026-03-05"
    channel: "codex"

  # Fallback: Kimi K2.5 (via OpenRouter)
  - alias: "kimi-k2.5"
    target: "moonshotai/kimi-k2.5"
    channel: "openai-compatibility"

  # Lightweight: For simple tasks
  - alias: "gemini-flash"
    target: "google/gemini-2.5-flash"
    channel: "openai-compatibility"

# ═══════════════════════════════════════════
# Fallback Routing — GPT-5.4 fails → Kimi K2.5
# ═══════════════════════════════════════════
model-fallbacks:
  - from: "gpt-5.4-2026-03-05"
    to: "moonshotai/kimi-k2.5"
  - from: "gpt-5.4"
    to: "moonshotai/kimi-k2.5"
  - from: "gpt-5"
    to: "moonshotai/kimi-k2.5"

1.5 Log in to ChatGPT OAuth

cd ~/ai-infra/CLIProxyAPIPlus

# Log in to ChatGPT (opens browser for OAuth flow)
./cliproxyapi --codex-login

# After successful login, tokens are stored in ~/.cli-proxy-api/
# Verify token files exist:
ls ~/.cli-proxy-api/*.json

Note: You need an active ChatGPT Plus or Pro subscription. Tokens auto-refresh, but you may occasionally need to re-authenticate.
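
CLIProxyAPIPlus stores tokens as JSON files, and the exact refresh behavior is internal to the tool, but a quick file-age check can hint when a re-login is overdue. A minimal sketch; the 14-day threshold is an arbitrary assumption, not a documented expiry:

```shell
token_age_days() {
  # Age of a file in whole days (GNU stat first, BSD/macOS stat as fallback).
  local now mtime
  now=$(date +%s)
  mtime=$(stat -c %Y "$1" 2>/dev/null || stat -f %m "$1")
  echo $(( (now - mtime) / 86400 ))
}

for f in ~/.cli-proxy-api/*.json; do
  [ -e "$f" ] || continue
  if [ "$(token_age_days "$f")" -gt 14 ]; then
    echo "⚠️  $f is older than 14 days; consider re-running: cliproxyapi --codex-login"
  fi
done
```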

1.6 Start CLIProxyAPIPlus

# Foreground mode (for testing)
cliproxyapi --config ~/.cli-proxy-api/config.yaml

# Background mode (for production)
nohup cliproxyapi --config ~/.cli-proxy-api/config.yaml > ~/.cli-proxy-api/proxy.log 2>&1 &

# Verify it's running
curl http://localhost:8317/v1/models \
  -H "Authorization: Bearer sk-openclaw-proxy-key-2026"
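
The /v1/models response follows the OpenAI schema ({"data": [{"id": …}, …]}), so you can pull out just the model ids to confirm both channels registered. A small helper, assuming python3 is available (it avoids a jq dependency):

```shell
# Extract model ids from an OpenAI-style /v1/models JSON payload on stdin.
list_model_ids() {
  python3 -c 'import json, sys; [print(m["id"]) for m in json.load(sys.stdin)["data"]]'
}

# Usage against the proxy:
# curl -s http://localhost:8317/v1/models \
#   -H "Authorization: Bearer sk-openclaw-proxy-key-2026" | list_model_ids
```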

1.7 Test Both API Channels

# Test GPT-5.4 (via ChatGPT OAuth)
curl http://localhost:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-openclaw-proxy-key-2026" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hello, which model are you?"}],
    "max_tokens": 100
  }'

# Test Kimi K2.5 (via OpenRouter)
curl http://localhost:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-openclaw-proxy-key-2026" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Hello, which model are you?"}],
    "max_tokens": 100
  }'

1.8 Set Up Auto-Start (launchd)

cat > ~/Library/LaunchAgents/com.cliproxyapi.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.cliproxyapi</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/cliproxyapi</string>
        <string>--config</string>
        <string>/Users/YOUR_USERNAME/.cli-proxy-api/config.yaml</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/Users/YOUR_USERNAME/.cli-proxy-api/proxy.log</string>
    <key>StandardErrorPath</key>
    <string>/Users/YOUR_USERNAME/.cli-proxy-api/proxy-error.log</string>
</dict>
</plist>
EOF

# Replace YOUR_USERNAME with your actual macOS username
launchctl load ~/Library/LaunchAgents/com.cliproxyapi.plist
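
Instead of hand-editing the plist, you can substitute your username programmatically. A sketch using a portable sed rewrite (it writes through a temp file, so it works with both BSD and GNU sed):

```shell
# Replace the YOUR_USERNAME placeholder in a file with the current user.
fill_username() {
  sed "s|YOUR_USERNAME|$(whoami)|g" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# fill_username ~/Library/LaunchAgents/com.cliproxyapi.plist
# launchctl load ~/Library/LaunchAgents/com.cliproxyapi.plist
```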

Part 2: Install OpenClaw Multi-Agent

2.1 Install OpenClaw

npm install -g openclaw@latest
openclaw --version  # Verify version
openclaw onboard    # First-time setup wizard

2.2 Configure Environment Variables

Add to ~/.zshrc:

# OpenClaw accesses all models through CLIProxyAPIPlus
export OPENAI_API_BASE="http://localhost:8317/v1"
export OPENAI_API_KEY="sk-openclaw-proxy-key-2026"

# OpenRouter direct access (backup for edge cases)
export OPENROUTER_API_KEY="sk-or-v1-YOUR_OPENROUTER_KEY"

# Reload your shell config
source ~/.zshrc

2.3 Create Multiple Agents

# Create 5 agents
openclaw agents add main
openclaw agents add marketing
openclaw agents add developer
openclaw agents add support
openclaw agents add researcher

# Verify creation
openclaw agents list --bindings

2.4 Edit openclaw.json

{
  "env": {
    "OPENAI_API_BASE": "http://localhost:8317/v1",
    "OPENAI_API_KEY": "sk-openclaw-proxy-key-2026",
    "OPENROUTER_API_KEY": "${OPENROUTER_API_KEY}"
  },

  "gateway": {
    "bind": "127.0.0.1",
    "port": 18789
  },

  "agents": {
    "defaults": {
      "model": {
        "primary": "gpt-5.4",
        "fallbacks": ["kimi-k2.5", "gemini-flash"]
      },
      "maxConcurrent": 10
    },

    "list": [
      {
        "id": "main",
        "default": true,
        "name": "Coordinator",
        "workspace": "~/.openclaw/workspace-main",
        "model": {
          "primary": "gpt-5.4"
        }
      },
      {
        "id": "marketing",
        "name": "Marketing Agent",
        "workspace": "~/.openclaw/workspace-marketing",
        "model": {
          "primary": "kimi-k2.5",
          "fallbacks": ["gpt-5.4"]
        }
      },
      {
        "id": "developer",
        "name": "Developer Agent",
        "workspace": "~/.openclaw/workspace-dev",
        "model": {
          "primary": "gpt-5.4",
          "fallbacks": ["kimi-k2.5"]
        },
        "tools": {
          "allow": ["read", "write", "exec", "web_search", "browser"],
          "deny": []
        }
      },
      {
        "id": "support",
        "name": "Support Agent",
        "workspace": "~/.openclaw/workspace-support",
        "model": {
          "primary": "gemini-flash"
        },
        "tools": {
          "allow": ["read", "web_search"],
          "deny": ["exec", "write", "browser"]
        }
      },
      {
        "id": "researcher",
        "name": "Research Agent",
        "workspace": "~/.openclaw/workspace-research",
        "model": {
          "primary": "kimi-k2.5"
        },
        "tools": {
          "allow": ["read", "web_search", "browser"],
          "deny": ["exec"]
        }
      }
    ]
  },

  "channels": {
    "telegram": {
      "accounts": [
        { "id": "tg-main",      "token": "BOT_TOKEN_MAIN" },
        { "id": "tg-marketing", "token": "BOT_TOKEN_MARKETING" },
        { "id": "tg-dev",       "token": "BOT_TOKEN_DEV" },
        { "id": "tg-support",   "token": "BOT_TOKEN_SUPPORT" },
        { "id": "tg-research",  "token": "BOT_TOKEN_RESEARCH" }
      ]
    }
  },

  "bindings": [
    { "channelId": "tg-main",      "agentId": "main" },
    { "channelId": "tg-marketing", "agentId": "marketing" },
    { "channelId": "tg-dev",       "agentId": "developer" },
    { "channelId": "tg-support",   "agentId": "support" },
    { "channelId": "tg-research",  "agentId": "researcher" }
  ]
}
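
Before starting the gateway, it's worth confirming the JSON actually parses; a stray trailing comma is the most common silent failure. A quick check using python3's built-in json.tool (no jq required):

```shell
# Validate a JSON config file; prints OK or reports the parse failure.
validate_json() {
  if python3 -m json.tool "$1" > /dev/null 2>&1; then
    echo "✅ $1 is valid JSON"
  else
    echo "❌ $1 failed to parse"
    python3 -m json.tool "$1"   # re-run to surface the error message on stderr
  fi
}

# validate_json openclaw.json   # run from wherever your config lives
```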

2.5 Agent SOUL.md Files

Each agent workspace needs a SOUL.md to define its personality and instructions.

Coordinator (~/.openclaw/workspace-main/SOUL.md):

# Coordinator — Tenten AI Command Center

You are the central coordinator for Tenten Digital Product Design Studio.

## Core Responsibilities
- Task routing: Determine which specialist agent should handle each incoming message
- General queries: Answer everyday questions and internal company matters
- Progress tracking: Monitor task status across all agents

## Behavioral Guidelines
- Default language: Traditional Chinese (switch to English if the user writes in English)
- Be concise and direct — no filler
- For specialized queries (marketing/dev/support/research), explicitly say "I'm routing this to [Agent Name]"
- When unsure which agent fits, ask one clarifying question before routing

Marketing Agent (~/.openclaw/workspace-marketing/SOUL.md):

# Marketing Agent — Tenten Content & Growth Specialist

You are the AI marketing expert for Tenten Digital.

## Core Capabilities
- Social media content creation (Facebook / Instagram / LinkedIn)
- SEO/GEO/AEO content strategy
- Ad copy optimization (Meta Ads / Google Ads)
- Content calendar planning for client brands

## Active Clients
- ALUXE Diamond (luxury jewelry)
- JOY COLORi (lab-grown diamonds)
- KGI Life Insurance
- SD Health (health supplements)

## Writing Style
- AVOID: "In this era of..." "empower" "leverage" "unlock" and other AI-sounding clichés
- Append SEO metadata (H1/H2 suggestions, meta description) at the end of each piece
- Default output: Traditional Chinese; produce English version separately when requested

## Platform-Specific Formatting
- Facebook Formal: Professional tone, 150-300 words, CTA at end
- Facebook Conversational: Warm tone, emoji permitted, question-based hooks
- Instagram: Short punchy copy, hashtag block, visual-first storytelling
- Instagram GenZ: Casual slang permitted, trend-aware, meme-adjacent

Developer Agent (~/.openclaw/workspace-dev/SOUL.md):

# Developer Agent — Tenten Full-Stack Engineer

You are the AI developer for Tenten Digital.

## Tech Stack
- Frontend: Astro, React, Next.js, Webflow + DevLink
- E-commerce: Shopify Plus, Shopify Storefront API, Shopify Functions
- Infrastructure: Cloudflare Workers, Docker, Ploi.io
- CMS: Ghost, WordPress, Shopify Content
- Tracking: GTM + GA4 + Meta Pixel + CAPI

## Development Principles
- Code comments in English
- Explanations and discussions in Traditional Chinese (unless user prefers English)
- Respect Shopify's 100-variant limit; use Metafields + Line Item Properties for overflow
- Headless architecture must preserve Shopify checkout paths (/checkouts/, /cart, /account, /wallets, /payments)
- Prefer browser_snapshot accessibility tree over CSS selectors for browser automation on dynamic SPAs
- Test all code suggestions before presenting; include error handling

Support Agent (~/.openclaw/workspace-support/SOUL.md):

# Support Agent — Tenten Customer Service Assistant

You are the AI customer support specialist for Tenten Digital.

## Core Approach
- Friendly, professional, solution-oriented
- Resolve issues on first contact when possible
- Escalate complex or sensitive issues to a human team member
- Never make promises about timelines or outcomes you can't guarantee

## Limitations
- You can ONLY read files and search the web
- You CANNOT execute code, write files, or use browser automation
- For technical issues requiring code changes, route to the Developer Agent

## Response Style
- Acknowledge the customer's concern first
- Provide clear, step-by-step solutions
- End with a confirmation question: "Does this resolve your issue?"

Research Agent (~/.openclaw/workspace-research/SOUL.md):

# Research Agent — Tenten Deep Research Analyst

You are the AI research analyst for Tenten Digital.

## Core Capabilities
- Industry trend analysis (AI, e-commerce, MarTech)
- Competitive research and benchmarking
- Technical feasibility assessment
- Data analysis and report writing

## Research Standards
- ALL factual claims must include sources
- Clearly distinguish between "confirmed" and "speculated"
- Report format: Executive Summary → Methodology → Findings → Recommendations → Sources
- Default output: Traditional Chinese (preserve original language for source citations)
- When analyzing data, show your work and methodology
- For conflicting sources, present both sides and note the discrepancy

2.6 Start OpenClaw Gateway

# Start the gateway
openclaw gateway start

# Verify all agents are running
openclaw agents list --bindings
openclaw channels status --probe

# Follow logs
openclaw gateway logs --follow

Part 3: Claude Code Deployment Prompt

Save the CLAUDE.md file (provided separately) to your project root. Open the directory with Claude Code, and it will automatically read the instructions and guide you through the entire deployment.

The CLAUDE.md contains:

  • Step-by-step bash commands for each phase

  • Complete config.yaml content with TODO placeholders

  • Complete openclaw.json with TODO placeholders

  • SOUL.md templates for all 5 agents

  • Verification checklist


Part 4: Multi-Agent Intelligent Model Routing — Cost Optimization Best Practices

This is the most valuable section. Correct model routing can save 70-80% on token costs.

4.1 Model Pricing Comparison (March 2026)

| Model | Input $/M tokens | Output $/M tokens | Best For |
|---|---|---|---|
| GPT-5.4 (API) | $2.50 | $15.00 | Complex reasoning, code, long documents |
| GPT-5.4 (OAuth) | Included in subscription | Included | Same capabilities, no per-token cost |
| GPT-5.4 Pro | $30.00 | $180.00 | Most complex tasks (avoid unless necessary) |
| Kimi K2.5 | $0.45-0.60 | $2.20-3.00 | General purpose, visual coding, agentic |
| Kimi K2.5 (cached) | $0.15 | $2.20 | Repeated context scenarios |
| Gemini 2.5 Flash | ~free-$0.15 | ~free-$0.60 | Classification, summaries, translation |

4.2 Tiered Task Routing Strategy

Core principle: Not every task needs the most powerful model.

┌─────────────────────────────────────────────────────────────────┐
│                  Tiered Model Routing Strategy                  │
├────────┬──────────────────┬─────────────────────────────────────┤
│ Tier   │ Model            │ Task Types                          │
├────────┼──────────────────┼─────────────────────────────────────┤
│ Tier 0 │ Gemini 2.5 Flash │ Intent classification, language     │
│ FREE   │ (via OAuth)      │ detection, simple translation,      │
│        │                  │ format conversion, data cleaning,   │
│        │                  │ yes/no judgments                    │
├────────┼──────────────────┼─────────────────────────────────────┤
│ Tier 1 │ Kimi K2.5        │ General conversation, content       │
│ BUDGET │ ($0.45/$2.20)    │ summarization, SEO copywriting,     │
│        │                  │ social posts, support replies,      │
│        │                  │ data analysis, frontend UI gen,     │
│        │                  │ visual coding                       │
├────────┼──────────────────┼─────────────────────────────────────┤
│ Tier 2 │ GPT-5.4          │ Complex reasoning, long-form        │
│ STD    │ ($2.50/$15.00)   │ writing, architecture design,       │
│        │ (FREE via OAuth!)│ multi-step agentic tasks, deep      │
│        │                  │ research, business analysis,        │
│        │                  │ legal/financial documents           │
├────────┼──────────────────┼─────────────────────────────────────┤
│ Tier 3 │ GPT-5.4 Pro      │ Almost never — only when other      │
│ MAX    │ ($30/$180)       │ models repeatedly fail, e.g.        │
│        │                  │ extremely complex math proofs       │
└────────┴──────────────────┴─────────────────────────────────────┘

Key Insight: Since you have a ChatGPT Plus/Pro subscription, GPT-5.4 via OAuth is effectively "free" (included in your monthly fee). Your strategy should be:

  1. Prioritize GPT-5.4 OAuth (included in subscription, no extra cost)

  2. When OAuth quota is exhausted, fall back to Kimi K2.5 (cheap and capable)

  3. Simple tasks use Gemini Flash OAuth (also free)
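
The strategy above can be sketched as a simple routing table. The task labels and the label-to-model mapping here are illustrative assumptions; adjust them to your own workload (the aliases match the config.yaml from Part 1):

```shell
# Map a coarse task label to the cheapest adequate model tier.
pick_model() {
  case "$1" in
    classify|translate|yesno)      echo "gemini-flash" ;;  # Tier 0: free via OAuth
    content|summary|support|ui)    echo "kimi-k2.5"    ;;  # Tier 1: budget
    reasoning|architecture|legal)  echo "gpt-5.4"      ;;  # Tier 2: included in subscription
    *)                             echo "gpt-5.4"      ;;  # default to the OAuth channel
  esac
}

pick_model classify   # → gemini-flash
pick_model content    # → kimi-k2.5
```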

4.3 Per-Agent Model Assignment Recommendations

# Coordinator: Handles misc tasks and routing, mostly simple
main:
  primary: "gpt-5.4"        # OAuth = free
  fallback: "kimi-k2.5"     # $0.45/$2.20

# Marketing: High-volume content production, Kimi K2.5 best value
marketing:
  primary: "kimi-k2.5"      # Best for bulk content, cheap
  fallback: "gpt-5.4"       # Complex strategy docs only

# Developer: Code quality matters, GPT-5.4 coding is strongest
developer:
  primary: "gpt-5.4"        # OAuth = free, best at code
  fallback: "kimi-k2.5"     # Also strong; visual coding even better

# Support: Template responses, doesn't need top model
support:
  primary: "gemini-flash"   # OAuth = free, simple replies sufficient
  fallback: "kimi-k2.5"     # Complex complaints only

# Researcher: Long context crucial, Kimi K2.5 has 262K context
researcher:
  primary: "kimi-k2.5"      # 262K context, ideal for research
  fallback: "gpt-5.4"       # When 1M context is needed

4.4 Dynamic Router Prompt

Add this task routing logic to your Coordinator Agent's SOUL.md:

## Task Routing Rules

When a new task arrives, follow this decision flow:

1. **Intent Classification** (use Tier 0 model)
   - Simple greeting / yes-no question → Reply directly with Gemini Flash
   - Marketing-related → Route to marketing Agent
   - Code / technical → Route to developer Agent
   - Customer issue / complaint → Route to support Agent
   - Needs deep research → Route to researcher Agent

2. **Complexity Assessment**
   - Requires multi-step reasoning? → Use Tier 2 (GPT-5.4)
   - Involves images / UI? → Use Kimi K2.5 (strongest visual)
   - Plain text, simple task? → Use Tier 0/1
   - Needs very long context? → Use Kimi K2.5 (262K) or GPT-5.4 (1M)

3. **Cost Awareness**
   - Always try the cheapest model that can complete the task
   - Only upgrade when output quality is clearly insufficient
   - Bulk repetitive tasks (e.g., batch-generate 50 social posts) → always Tier 1

4.5 Token-Saving Tactics

Tactic 1: Prompt Caching (Kimi K2.5 auto-enabled)

Kimi K2.5 on OpenRouter has automatic context caching. Repeated system-prompt tokens are billed at the cached rate ($0.15/M instead of $0.60/M). Put each agent's fixed instructions in the system prompt to automatically benefit from cache pricing.

Tactic 2: Chunk Long Tasks

❌ Wrong: Send an entire 10,000-word article to GPT-5.4 for proofreading
   → Massive output token consumption ($15/M)

✅ Right:
  Step 1: Use Gemini Flash for initial segmentation and classification (free)
  Step 2: Send only problematic paragraphs to GPT-5.4 for fixing
  → Saves ~80% tokens
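
The "send only problematic paragraphs" step needs a cheap local pre-filter before anything touches an API. A sketch using awk's paragraph mode; the 120-word threshold is an arbitrary stand-in for whatever heuristic (or Tier 0 classifier) you actually use:

```shell
# Print only paragraphs longer than 120 words — candidates for the expensive model.
# RS="" puts awk in paragraph mode; NF is then the paragraph's word count.
flag_long_paragraphs() {
  awk 'BEGIN { RS=""; ORS="\n\n" } NF > 120' "$1"
}

# flag_long_paragraphs draft.txt
```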

Tactic 3: Response Caching (Agent-Level)

Maintain a cache/ directory in each agent's workspace to store common query responses. For frequently repeated queries (product FAQs, company descriptions), read directly from cache — zero API calls.
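
A minimal sketch of such a cache in shell: responses are keyed by a hash of the query, and the expensive call (a command you supply, here a hypothetical ask-model.sh wrapper) only runs on a miss. The paths and the sha256-via-python3 choice are assumptions:

```shell
# cached_query CACHE_DIR "query text" COMMAND [ARGS...]
# Serves a stored answer when one exists; otherwise pipes the query into
# COMMAND (your real model call) and caches its output.
cached_query() {
  local dir="$1" query="$2"; shift 2
  mkdir -p "$dir"
  local key file
  key=$(printf '%s' "$query" | python3 -c 'import hashlib,sys; print(hashlib.sha256(sys.stdin.buffer.read()).hexdigest())')
  file="$dir/$key.txt"
  if [ ! -f "$file" ]; then
    printf '%s' "$query" | "$@" > "$file"   # cache miss: run the expensive call once
  fi
  cat "$file"
}

# cached_query ~/.openclaw/workspace-support/cache "What are your opening hours?" ./ask-model.sh
```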

Tactic 4: Streaming + Early Truncation

When you detect the model is rambling or repeating itself, truncate the stream early. CLIProxyAPIPlus supports streaming mode. Your agent can stop receiving once sufficient information has been collected.

Tactic 5: Output Token Control

Output tokens are ALWAYS more expensive than input tokens (GPT-5.4 has a 6x gap). Add explicit constraints in each agent's system prompt:

## Response Format Requirements
- Keep responses under 200 words unless the user explicitly requests a detailed version
- For code, provide only the relevant snippet — not the entire file
- Use bullet points instead of long paragraphs
- If a question can be answered in one sentence, answer in one sentence

Tactic 6: Model-Specific Optimization

GPT-5.4:  Use "reasoning effort" parameter when available.
          Set lower effort for simple tasks to reduce thinking tokens.

Kimi K2.5: Leverage the 262K context window for "stuff everything in"
           approaches — cache pricing makes long context cheap.
           Use vision capabilities for UI/screenshot tasks instead of
           describing them in text.

Gemini Flash: Batch multiple simple tasks into a single prompt.
              "Classify these 10 customer messages: [list]"
              instead of 10 separate API calls.

4.6 Monthly Cost Estimate

For Tenten's estimated usage (~200-500 agent calls per day):

| Item | Monthly Estimate |
|---|---|
| ChatGPT Plus subscription | $20/mo (GPT-5.4 OAuth free quota) |
| OpenRouter Kimi K2.5 (overflow) | $15-40/mo |
| OpenRouter Gemini Flash (overflow) | $0-5/mo |
| **Total** | **$35-65/mo** |

Compared to pure GPT-5.4 API (no OAuth): estimated $200-500/mo. Savings: 75%+
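
You can sanity-check numbers like these yourself. A sketch in awk, with assumed per-call averages (1K input / 0.5K output tokens) that real agentic workloads will exceed severalfold — which is why the article's pure-API estimate lands higher than this toy calculation:

```shell
# Monthly token cost: calls/day × 30 days × per-token pricing.
monthly_cost() {
  # args: calls_per_day  input_$/M  output_$/M
  awk -v calls="$1" -v in_price="$2" -v out_price="$3" 'BEGIN {
    in_tok  = calls * 30 * 1000      # assume ~1K input tokens per call
    out_tok = calls * 30 * 500       # assume ~0.5K output tokens per call
    printf "$%.2f/mo\n", in_tok / 1e6 * in_price + out_tok / 1e6 * out_price
  }'
}

monthly_cost 300 2.50 15.00   # pure GPT-5.4 API at 300 calls/day → $90.00/mo
monthly_cost 300 0.45 2.20    # same load on Kimi K2.5 → $13.95/mo
```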


Part 5: Failover Testing & Monitoring

5.1 Simulate Failover

# Temporarily disable ChatGPT OAuth token to test failover
cd ~/.cli-proxy-api
mv codex-auth.json codex-auth.json.bak

# Send a GPT-5.4 request → should auto-fallback to Kimi K2.5
curl http://localhost:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-openclaw-proxy-key-2026" \
  -d '{"model":"gpt-5.4","messages":[{"role":"user","content":"Which model is responding? State your model name."}]}'

# Restore the token
mv codex-auth.json.bak codex-auth.json

5.2 Log Monitoring

# CLIProxyAPIPlus logs
tail -f ~/.cli-proxy-api/proxy.log

# OpenClaw Gateway logs
openclaw gateway logs --follow

# Check which model actually served a request
grep "routing to" ~/.cli-proxy-api/proxy.log | tail -20

5.3 Health Check Script

#!/bin/bash
# ~/scripts/health-check.sh

echo "=== AI Infrastructure Health Check ==="

# 1. CLIProxyAPIPlus
if curl -sf http://localhost:8317/v1/models \
  -H "Authorization: Bearer sk-openclaw-proxy-key-2026" > /dev/null; then
  echo "✅ CLIProxyAPIPlus is running"
else
  echo "❌ CLIProxyAPIPlus is not responding"
fi

# 2. OpenClaw Gateway
if openclaw status 2>/dev/null | grep -q "running"; then
  echo "✅ OpenClaw Gateway is running"
else
  echo "❌ OpenClaw Gateway is not started"
fi

# 3. Agent count
AGENT_COUNT=$(openclaw agents list 2>/dev/null | grep -c "id:")
echo "📊 Active agents: $AGENT_COUNT"

# 4. OAuth token files
if ls ~/.cli-proxy-api/*auth*.json >/dev/null 2>&1; then
  echo "✅ OAuth token files exist"
else
  echo "⚠️  OAuth tokens missing — re-login required"
fi

echo "=== Check Complete ==="

# Make the script executable:
chmod +x ~/scripts/health-check.sh

5.4 Cron-Based Auto-Recovery

# Add to crontab: check every 5 minutes, restart if down
crontab -e

# Add this line:
*/5 * * * * curl -sf http://localhost:8317/v1/models -H "Authorization: Bearer sk-openclaw-proxy-key-2026" > /dev/null || (nohup cliproxyapi --config ~/.cli-proxy-api/config.yaml > ~/.cli-proxy-api/proxy.log 2>&1 &)

Part 6: Troubleshooting FAQ

Q: CLIProxyAPIPlus starts but GPT-5.4 requests fail?

A: Verify your OAuth token exists and hasn't expired. Run cliproxyapi --codex-login to re-authenticate. ChatGPT Plus/Pro OAuth tokens expire periodically and need refreshing.

Q: Response quality drops when falling back to Kimi K2.5?

A: Kimi K2.5 performs well on most tasks, but coding tasks may be slightly weaker than GPT-5.4. Add fallback-specific instructions in SOUL.md, e.g., "When generating code, verify syntax correctness before outputting."

Q: Multiple agents hitting the API simultaneously — rate limit issues?

A: CLIProxyAPIPlus's request-retry: 3 and maxConcurrent: 10 handle this. ChatGPT OAuth rate limits are more generous than API keys. If truly rate-limited, the system auto-falls back to OpenRouter.

Q: Is Mac Mini M4 16GB RAM sufficient?

A: More than enough. OpenClaw + CLIProxyAPIPlus are lightweight processes; all AI computation happens in the cloud. Five agents use under 500MB RAM total. The real bottleneck is API rate limits and cost.

Q: How do I know which model actually served a request?

A: Enable debug: true in CLIProxyAPIPlus config — logs will show the actual routing for each request. Alternatively, add a debug instruction asking the model to self-identify at the start of each response.

Q: Can I add more OAuth providers (Gemini, Claude) alongside ChatGPT?

A: Yes. CLIProxyAPIPlus supports multiple OAuth providers simultaneously:

# Add Gemini CLI OAuth
cliproxyapi --login

# Add Claude Code OAuth
cliproxyapi --claude-login

Each provider gets its own token file and failover chain.


Quick Reference Card

# ═══════════════════════════════════
# START FULL STACK
# ═══════════════════════════════════
cliproxyapi --config ~/.cli-proxy-api/config.yaml &
openclaw gateway start

# ═══════════════════════════════════
# STOP FULL STACK
# ═══════════════════════════════════
openclaw gateway stop
pkill cliproxyapi

# ═══════════════════════════════════
# RE-AUTHENTICATE
# ═══════════════════════════════════
cliproxyapi --codex-login          # ChatGPT OAuth
cliproxyapi --login                # Gemini OAuth
cliproxyapi --claude-login         # Claude OAuth

# ═══════════════════════════════════
# STATUS & MONITORING
# ═══════════════════════════════════
openclaw agents list --bindings
openclaw channels status --probe
tail -f ~/.cli-proxy-api/proxy.log
~/scripts/health-check.sh

# ═══════════════════════════════════
# TEST API
# ═══════════════════════════════════
curl localhost:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-openclaw-proxy-key-2026" \
  -d '{"model":"gpt-5.4","messages":[{"role":"user","content":"ping"}]}'