Agent rate limits & observability
Agent traffic gets its own accounting so it doesn't compete with human traffic for the same rate-limit budget.
Separate buckets
api-v1: REST API. Keyed by API key ID / Clerk user ID.
mcp-tools: MCP tool calls. Keyed by agentClientId when the token came from a DCR-registered client; falls back to the API-key ID / user ID.
mcp-output-tokens: per-agent, 1-hour window, counts response tokens from the MCP wrapper. Lets us cap runaway output even when request counts look fine.
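The keying rules above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the Caller shape, the bucketKey and principalId helpers, the "bucket:id" key format, and the OutputTokenWindow class are all hypothetical. It also assumes that mcp-output-tokens uses the same fallback as mcp-tools and that the 1-hour window is a fixed window; neither is stated here.

```typescript
// Hypothetical caller record; the real field names may differ.
interface Caller {
  apiKeyId?: string;      // set when the request used an API key
  userId?: string;        // Clerk user ID
  agentClientId?: string; // set when the token came from a DCR-registered client
}

type Bucket = "api-v1" | "mcp-tools" | "mcp-output-tokens";

// API key ID if present, otherwise the user ID.
function principalId(caller: Caller): string {
  const id = caller.apiKeyId ?? caller.userId;
  if (id === undefined) throw new Error("caller has no API key ID or user ID");
  return id;
}

// api-v1 is keyed by the principal; MCP buckets prefer agentClientId and
// fall back to the principal (assumed for mcp-output-tokens as well).
function bucketKey(bucket: Bucket, caller: Caller): string {
  const id =
    bucket === "api-v1"
      ? principalId(caller)
      : (caller.agentClientId ?? principalId(caller));
  return `${bucket}:${id}`;
}

const HOUR_MS = 60 * 60 * 1000;

// Fixed 1-hour window over response tokens (the window style is an assumption).
class OutputTokenWindow {
  private windowStart = Number.NEGATIVE_INFINITY;
  private used = 0;
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap; // the cap value is hypothetical; none is given in this document
  }

  // Adds the tokens of one response; returns false once the cap is exceeded.
  record(tokens: number, nowMs: number): boolean {
    if (nowMs - this.windowStart >= HOUR_MS) {
      this.windowStart = nowMs;
      this.used = 0;
    }
    this.used += tokens;
    return this.used <= this.cap;
  }
}
```

Counting tokens separately from requests means an agent that makes few calls but produces very large responses still hits a limit.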
Agent-native identity
When an MCP client is registered via DCR (Dynamic Client Registration), its tokens carry a client_id claim, which is stamped onto the caller as agentClientId. Audit log entries record it so reports can distinguish "Claude Desktop acting for alice" from "alice directly." The apiKeyAudit table has a by_agent_time index; query it via Convex for a per-agent activity view.
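As a rough sketch of the idea, the snippet below stamps the claim onto an audit entry and builds a per-agent view. The TokenClaims and AuditEntry shapes and all three helper functions are hypothetical, and the in-memory filter only stands in for an indexed query against apiKeyAudit; it is not Convex code.

```typescript
// Hypothetical shapes; the real token and apiKeyAudit schemas are not shown in this document.
interface TokenClaims {
  sub: string;        // the user the token acts for
  client_id?: string; // present when the client was registered via DCR
}

interface AuditEntry {
  userId: string;
  agentClientId?: string;
  action: string;
  time: number; // epoch milliseconds
}

// Copies client_id onto the entry as agentClientId when the claim is present.
function toAuditEntry(claims: TokenClaims, action: string, time: number): AuditEntry {
  const entry: AuditEntry = { userId: claims.sub, action, time };
  if (claims.client_id !== undefined) entry.agentClientId = claims.client_id;
  return entry;
}

// Distinguishes an agent acting for a user from the user acting directly.
function describeActor(entry: AuditEntry): string {
  return entry.agentClientId !== undefined
    ? `${entry.agentClientId} acting for ${entry.userId}`
    : `${entry.userId} directly`;
}

// In-memory stand-in for a per-agent, time-ordered query.
function agentActivity(entries: AuditEntry[], agentClientId: string, sinceMs: number): AuditEntry[] {
  return entries
    .filter((e) => e.agentClientId === agentClientId && e.time >= sinceMs)
    .sort((a, b) => a.time - b.time);
}
```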
Gateway integration
Point a gateway (Kong, Traefik Hub, Databricks Unity AI Gateway) at https://api.exayard.com/mcp and key the gateway's per-client policies off the same client_id. Our rate limit is the baseline; the gateway layer can be stricter for enterprise customers without our app needing to know about them.
Responses already carry RateLimit and RateLimit-Policy headers computed from the most-constraining counter, so a gateway can enforce its own budget without double-counting.
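One way to read "most-constraining counter" is sketched below. The CounterState shape and both functions are hypothetical, and comparing counters by the fraction of their limit still remaining is an assumption: it lets request counts and token counts be compared on one scale, but the actual selection rule is not described here.

```typescript
// Hypothetical snapshot of one bucket's counter for the current caller.
interface CounterState {
  policy: string;       // bucket name, e.g. "mcp-tools"
  limit: number;        // quota for the window
  remaining: number;    // units left in the current window
  resetSeconds: number; // seconds until the window resets
}

// Fraction of the quota still available; a zero limit counts as exhausted.
function fractionRemaining(c: CounterState): number {
  return c.limit > 0 ? c.remaining / c.limit : 0;
}

// Picks the counter closest to exhaustion (assumed selection rule).
function mostConstraining(counters: CounterState[]): CounterState {
  if (counters.length === 0) throw new Error("at least one counter is required");
  return counters.reduce((worst, c) =>
    fractionRemaining(c) < fractionRemaining(worst) ? c : worst
  );
}

// How long a gateway might hold back a client based on the reported counter.
function secondsToWait(c: CounterState): number {
  return c.remaining > 0 ? 0 : c.resetSeconds;
}
```

Because only one counter is reported, a gateway can act on it directly instead of reconstructing each bucket's state itself.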