Agent rate limits & observability

Agent traffic gets its own accounting so it doesn't compete with human traffic for the same rate-limit budget.

Separate buckets

Agent-native identity

When an MCP client registers via Dynamic Client Registration (DCR), its tokens carry a client_id claim, which is copied into the caller's agentClientId field. Audit log entries record this value, so reports can distinguish "Claude Desktop acting for alice" from "alice directly". The apiKeyAudit table has a by_agent_time index; query it via Convex for a per-agent activity view.
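The flow above can be sketched as follows. This is a minimal illustration, not the actual implementation: only client_id, agentClientId, and the two report labels come from the text. The TokenClaims, Caller, and AuditEntry shapes, the sub claim, the `at` timestamp field, and the helper names are hypothetical, and activityFor is an in-memory stand-in for what a query against the by_agent_time index would return, not the Convex query itself.

```typescript
// Hypothetical shapes: only client_id and agentClientId come from the text.
interface TokenClaims { sub: string; client_id?: string }
interface Caller { userId: string; agentClientId?: string }
interface AuditEntry { userId: string; agentClientId?: string; action: string; at: number }

// Copy the (already verified) client_id claim onto the caller.
function callerFromClaims(claims: TokenClaims): Caller {
  return { userId: claims.sub, agentClientId: claims.client_id };
}

// Label an audit entry for reports: agent on behalf of a user, or the user directly.
function describeActor(entry: AuditEntry, clientNames: Record<string, string>): string {
  if (entry.agentClientId === undefined) return `${entry.userId} directly`;
  const label = clientNames[entry.agentClientId] ?? entry.agentClientId;
  return `${label} acting for ${entry.userId}`;
}

// In-memory stand-in for a per-agent, time-ordered activity view.
function activityFor(entries: AuditEntry[], agentClientId: string, since: number): AuditEntry[] {
  return entries
    .filter((e) => e.agentClientId === agentClientId && e.at >= since)
    .sort((a, b) => a.at - b.at);
}
```

Entries without an agentClientId fall through to the "directly" label, so human and agent activity stay separable in the same table.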

Gateway integration

Point a gateway (Kong, Traefik Hub, Databricks Unity AI Gateway) at https://api.exayard.com/mcp and key the gateway's per-client policies off the same client_id. Our rate limit is the baseline that always applies; the gateway layer can impose stricter limits for enterprise customers without our app needing to know about them.
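To make the per-client idea concrete, here is a small sketch of a limiter keyed by client_id. It is a generic fixed-window counter written for illustration, not the configuration format of any of the gateways named above. The ClientPolicy shape, the PerClientLimiter class, and the example limits are all assumptions.

```typescript
// Hypothetical policy shape: at most `limit` requests per `windowMs`.
interface ClientPolicy { limit: number; windowMs: number }

class PerClientLimiter {
  private policies: Record<string, ClientPolicy>;
  private fallback: ClientPolicy;
  private windows = new Map<string, { start: number; count: number }>();

  constructor(policies: Record<string, ClientPolicy>, fallback: ClientPolicy) {
    this.policies = policies;
    this.fallback = fallback;
  }

  // Returns true if the request from `clientId` at time `now` (ms) is allowed.
  allow(clientId: string, now: number): boolean {
    const policy = this.policies[clientId] ?? this.fallback;
    let w = this.windows.get(clientId);
    // Start a fresh window on first sight or once the old one has expired.
    if (w === undefined || now - w.start >= policy.windowMs) {
      w = { start: now, count: 0 };
      this.windows.set(clientId, w);
    }
    if (w.count >= policy.limit) return false;
    w.count += 1;
    return true;
  }
}
```

A gateway configured this way can give one enterprise client a tighter budget than the default while every other client_id falls back to the shared policy; the application's own limit still applies behind it.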

Responses already carry RateLimit and RateLimit-Policy headers, derived from the most-constraining counter, so a gateway can enforce its own budget without double-counting.
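A gateway plugin could read the remaining quota from these headers rather than keeping a parallel count. The sketch below is lenient by design: the exact field syntax depends on which revision of the IETF RateLimit header draft the server follows, and the text does not say. The parameter names `r` and `remaining` are assumptions drawn from different draft revisions, and both function names are hypothetical. The parser also does not handle separators inside quoted policy names.

```typescript
// Collect numeric key=value parameters from a header value.
// Splits on both ';' and ',' so it tolerates more than one draft syntax.
function parseParams(value: string): Record<string, number> {
  const out: Record<string, number> = {};
  for (const part of value.split(/[;,]/)) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // e.g. a bare policy name such as "default"
    const key = part.slice(0, eq).trim();
    const raw = part.slice(eq + 1).trim();
    const num = Number(raw);
    if (key !== "" && raw !== "" && Number.isFinite(num)) out[key] = num;
  }
  return out;
}

// Remaining quota reported upstream, or undefined if the header lacks it.
// `r` and `remaining` are assumed parameter names; verify against real responses.
function remainingFromHeader(rateLimitHeader: string): number | undefined {
  const params = parseParams(rateLimitHeader);
  return params["r"] ?? params["remaining"];
}
```

With the upstream remaining value in hand, the gateway only needs to track its own stricter budget and can defer to the header for ours.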