Using a VPS API from AI agents (Claude, Cursor, your own)

AI agents like Claude Code and Cursor can provision real infrastructure if your VPS provider has a clean API. This guide covers token scopes, request shapes, and patterns for safe agent-driven workflows on RareCloud.

By RareCloud Team · 8 min read · 5/20/2026

AI coding agents are no longer just generating files in your editor. They run tools, search the web, edit code, execute shells, and increasingly provision infrastructure. Claude Code can run terraform apply, Cursor can call out to APIs, and your own agent built on the Anthropic SDK can do whatever you give it tools to do.

If your VPS provider has a clean API, this means the agent can:

  • Spin up a new VPS for a temporary task and tear it down
  • Snapshot before a deploy and roll back if a test fails
  • Read invoice + credit data to summarize cost trends
  • Reset a stuck server, attach a volume, rotate an IP

RareCloud's API was designed with agent use cases in mind. This guide explains the shape and the safety story.

The API surface in one paragraph

Every endpoint is under https://api.rarecloud.io/v1/*. Auth is Authorization: Bearer rc_pat_.... Responses are { "ok": true, "data": ... } on success, { "ok": false, "error": { "code": "...", "message": "..." } } on failure. Errors carry stable machine-readable code fields so agents can branch on them.
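As a minimal sketch of how an agent tool might consume that envelope, the helper below unwraps a response body and raises a typed error that carries the machine-readable code. The names unwrap and ApiError are illustrative, and the code value "insufficient_credit" is a hypothetical example rather than a documented error code.

```python
import json


class ApiError(Exception):
    """Raised when the response envelope reports ok=false."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(body: str):
    """Return the `data` payload of a success envelope, or raise ApiError."""
    envelope = json.loads(body)
    if envelope.get("ok") is True:
        return envelope.get("data")
    error = envelope.get("error") or {}
    raise ApiError(error.get("code", "unknown"), error.get("message", ""))


# Branch on the stable code rather than on the human-readable message.
try:
    unwrap('{"ok": false, "error": {"code": "insufficient_credit", "message": "Top up first"}}')
except ApiError as err:
    if err.code == "insufficient_credit":  # hypothetical code value
        print("stop and ask the user to top up")
    else:
        raise
```

Keeping the branch on err.code means the agent's behavior does not change if the wording of a message is edited later.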

The OpenAPI spec lives at https://console.rarecloud.io/openapi.json. Feed it to your agent as a tool definition and you get auto-generated tools for every endpoint.

Token scopes that matter for agents

Don't issue a full-access token to an agent. Issue scoped tokens:

  • The catalog is public. No token or scope is needed to read products, plans, regions, and images, so a "what could we deploy" agent can read pricing with no credentials at all.
  • services:read lets the agent list and inspect existing services. Safe for monitoring and status-summary agents.
  • services:write lets the agent create, modify, and destroy services. Powerful. Issue it with a low rate limit and a short expiry, and lean on your credit balance as your hard spend ceiling.
  • billing:read covers invoices, credit balance, and payment history. Safe for cost-analysis agents.
  • billing:write allows topping up credit and changing payment methods. Almost never give this to an agent.
  • account:read covers profile and contact info. Mostly safe.
  • account:write allows changing the profile and passwords. Almost never give this to an agent.

For a "deploy a test VPS, run my test suite, tear it down" agent, the right token scope is just services:write (the catalog is public, so no scope is needed to read plans). Nothing else.
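One way to enforce that policy in your own tooling is a small guard that runs before any token is minted for an agent. This is a sketch, not part of the RareCloud API: the function name agent_scopes and the deny-list are assumptions that simply encode the "almost never" advice from the list above.

```python
# Scope names as listed in this guide.
KNOWN_SCOPES = frozenset({
    "services:read", "services:write",
    "billing:read", "billing:write",
    "account:read", "account:write",
})

# Scopes the guide says should almost never be given to an agent.
AGENT_DENYLIST = frozenset({"billing:write", "account:write"})


def agent_scopes(requested):
    """Validate a requested scope list and return it sorted and de-duplicated."""
    scopes = set(requested)
    unknown = scopes - KNOWN_SCOPES
    if unknown:
        raise ValueError(f"unknown scopes: {sorted(unknown)}")
    denied = scopes & AGENT_DENYLIST
    if denied:
        raise ValueError(f"refusing to grant to an agent: {sorted(denied)}")
    return sorted(scopes)
```

A call such as agent_scopes(["services:write"]) passes through unchanged, while a request that includes billing:write fails before any token exists.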

Common patterns

Pattern 1: query before action

Agents reason better when they see ground truth. Before any services:write call, the agent should make a read call to confirm that the current state matches its mental model.

1. GET /api/v1/services             → see what exists
2. GET /api/v1/catalog/plans         → see what's available
3. POST /api/v1/services             → create with chosen plan
4. GET /api/v1/services/{id}         → confirm it provisioned

If the OpenAPI spec is loaded as the agent's tool catalog, this flow tends to happen on its own: agents naturally read before they write when the read tool is available.
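If you are writing the tool yourself rather than relying on the agent's instincts, the four steps above can be sketched as one function. The transport is injected as a call(method, path, body) callable that returns the already-unwrapped data, so the sketch makes no assumption about your HTTP client. The request and response field names ("name", "plan", "id") are assumptions for illustration; check the OpenAPI spec for the real shapes.

```python
from typing import Any, Callable, Optional

# call(method, path, json_body) -> unwrapped `data` from the response envelope
Call = Callable[[str, str, Optional[dict]], Any]


def create_if_absent(call: Call, name: str, plan_id: str) -> dict:
    """Read current state, validate the plan, create, then confirm."""
    # 1. See what exists; reuse a matching service instead of duplicating it.
    for service in call("GET", "/api/v1/services", None):
        if service.get("name") == name:  # "name" is an assumed field
            return service

    # 2. See what's available; fail early on a plan that is not in the catalog.
    plans = call("GET", "/api/v1/catalog/plans", None)
    if not any(plan.get("id") == plan_id for plan in plans):
        raise ValueError(f"plan {plan_id!r} not found in catalog")

    # 3. Create with the chosen plan (body fields are assumptions).
    created = call("POST", "/api/v1/services", {"name": name, "plan": plan_id})

    # 4. Confirm it provisioned by reading it back.
    return call("GET", f"/api/v1/services/{created['id']}", None)
```

Because the function returns an existing service when one matches, re-running the same agent step does not create a second server.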

Pattern 2: short-lived, least-privilege tokens

Mint a token scoped to exactly what the run needs, give it an expiry, and cap its request rate. Every token carries a scopes list, an optional expiresAt, and a rateLimitRpm (the default is 60). When the run finishes, revoke it.

POST /api/v1/tokens
Authorization: Bearer rc_pat_...
Content-Type: application/json

{ "name": "agent-run-7a3b2c", "scopes": ["services:write"], "rateLimitRpm": 10, "expiresAt": "2026-06-12T00:00:00Z" }

A token that can only touch services, only for an hour, only ten calls a minute, is a small blast radius even if the agent gets hijacked. The same thing from the CLI: rarecloud token create --scopes services:write --rate-limit-rpm 10 --expires 2026-06-12T00:00:00Z --label agent-run-7a3b2c.
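A minimal sketch of building that request body in code, with the expiry computed as one hour from now instead of hard-coded. The field names (name, scopes, rateLimitRpm, expiresAt) come from the request shown above; the helper name token_request and its parameters are illustrative.

```python
from datetime import datetime, timedelta, timezone


def token_request(run_id, scopes, ttl=timedelta(hours=1), rate_limit_rpm=10, now=None):
    """Build the JSON body for POST /api/v1/tokens with a short expiry."""
    now = now or datetime.now(timezone.utc)
    expires = (now + ttl).astimezone(timezone.utc)
    return {
        "name": f"agent-run-{run_id}",
        "scopes": list(scopes),
        "rateLimitRpm": rate_limit_rpm,
        # Same UTC timestamp format as the example request above.
        "expiresAt": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


body = token_request("7a3b2c", ["services:write"])
```

Computing expiresAt at mint time keeps the "only for an hour" guarantee true for every run, rather than depending on someone updating a fixed date.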

Pattern 3: confirm the cost before committing

There's no server-side "preview" call, so put the confirmation in the agent loop. Have it look up the chosen plan's price from the (public) catalog, show the user exactly what it's about to create and what it will cost, and only POST after approval.

GET  /api/v1/catalog/plans         → read the price of the chosen plan
[agent shows "will create a g-2vcpu-8gb server, €14.99/month" to the user]
[user approves]
POST /api/v1/services              → actually does it
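The same gate can be sketched as a function that refuses to POST until a callback approves the summary. The call transport and the approve callback are injected, and the plan fields "priceMonthly" and "currency" are assumptions for illustration, not documented field names.

```python
from typing import Any, Callable, Optional

Call = Callable[[str, str, Optional[dict]], Any]


def create_with_approval(call: Call, plan_id: str,
                         approve: Callable[[str], bool]) -> Optional[dict]:
    """Look up the price, show it to the user, and create only if approved."""
    plans = call("GET", "/api/v1/catalog/plans", None)
    plan = next((p for p in plans if p.get("id") == plan_id), None)
    if plan is None:
        raise ValueError(f"plan {plan_id!r} not found in catalog")

    # "priceMonthly" and "currency" are assumed field names.
    summary = (f"will create a {plan_id} server, "
               f"{plan['priceMonthly']} {plan['currency']}/month")
    if not approve(summary):
        return None  # nothing was created

    return call("POST", "/api/v1/services", {"plan": plan_id})
```

In an interactive agent, approve would typically wrap whatever confirmation prompt the host application provides; in CI it might check the price against a configured maximum.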

Hooking up with an MCP server

The Model Context Protocol (MCP) is the standard way to give agents access to tools. There are two paths:

First-party RareCloud MCP: we ship @rarecloudio/mcp-server with first-class tool definitions for the API surface. This is more ergonomic than the generic fetch path because the tool descriptions are tuned for LLM consumption (with examples, error shapes, and best-practice notes baked in). Drop it into Claude Code, Claude Desktop or Cursor, give it a scoped token, and the agent can read and manage your infrastructure.

Generic HTTP MCP: alternatively, any MCP server that supports HTTP fetch (the official mcp-server-fetch, for example) can hit our API. Configure it with the Bearer token in the server's environment and the agent can call anything in the OpenAPI spec.

Safety: the budget runaway problem

The headline fear: "what if the agent provisions 100 servers and I owe €10,000?"

Three layers of defense:

  1. Your credit balance. This is the hard ceiling. An agent can only ever spend the credit you've loaded onto the account: there's no card on file to auto-charge and no overdraft. When the balance is exhausted, services pause instead of accruing a debt. Top up explicitly from the dashboard.

  2. Per-token rate limits. Each token has a rateLimitRpm (default 60 requests/min); you can set it as low as you like for a purely passive monitoring agent. Mutations count against the same limit, so a runaway loop is bounded in throughput, not just in spend.

  3. Scoped, expiring, revocable tokens. A services:write token can't touch billing or your account. Give it an expiry, and revoke it the instant a run looks wrong. Revocation takes effect immediately.

Combined, the worst case for an agent holding a services:write token that goes off the rails is "spent your credit balance on test servers": recoverable, bounded, and visible in the audit log.
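To put rough numbers on that bound, the sketch below multiplies the rate limit by the token lifetime and then caps the result at the credit balance. It deliberately assumes the most pessimistic case, where every allowed request creates a server and each one is charged a full month's price up front; actual billing may well differ, so treat the output as an upper bound only.

```python
from decimal import Decimal


def worst_case_requests(rate_limit_rpm: int, token_lifetime_minutes: int) -> int:
    """Most requests a token can make before it expires."""
    return rate_limit_rpm * token_lifetime_minutes


def worst_case_spend(credit_balance: Decimal, requests: int,
                     price_per_create: Decimal) -> Decimal:
    """Upper bound on spend: the credit balance is the hard ceiling."""
    return min(credit_balance, requests * price_per_create)


# A 10 rpm token that lives for one hour, against 50.00 of loaded credit.
requests = worst_case_requests(10, 60)
spend = worst_case_spend(Decimal("50.00"), requests, Decimal("14.99"))
```

With these inputs the token allows at most 600 requests, and the spend bound is the 50.00 balance, because the credit ceiling is reached long before the rate limit stops mattering.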

What's coming

  • Webhook events for server provisioned, invoice issued, and credit low. Agents can subscribe instead of polling.
  • Per-project API tokens. When projects and collaboration ship, tokens will be scoped to a single project for additional isolation.

If you're building an agent against our API today, open a support ticket from your dashboard. Feedback shapes the roadmap.

Frequently Asked Questions

Is it safe to let an AI agent provision real infrastructure?
Yes, with scoped tokens. Issue a token with the narrowest scope the agent needs (e.g. services:read for monitoring agents, services:write only when you genuinely want it to create resources), set a low rateLimitRpm, and set an expiry. You can revoke any token instantly from the dashboard, with no propagation delay.
Can I use this with an MCP server?
Yes. We ship a first-party MCP server (@rarecloudio/mcp-server) that wraps the API with first-class tools, so an agent can read and manage your infrastructure through Claude Code, Claude Desktop or Cursor. Any generic HTTP-fetch MCP server can also call our /api/v1/* endpoints with a Bearer token. Pin the agent to a single, narrowly scoped token so a prompt-injection attack can't escalate.
What about budget runaway? What if the agent provisions 100 servers?
Two safeguards apply. First, your credit balance is a hard ceiling: an agent can only ever spend the credit you've actually loaded, and then everything pauses. Second, per-token rate limits cap the number of calls per minute, mutating calls included. Set token-level guardrails, and trust agents like you'd trust an intern.
