A practical view of MCP server performance for software teams
As MCP moves from experiment to production consideration, software teams need more than protocol awareness. They need to know how server design affects cost, reliability and the quality of AI-assisted outcomes in practice.
This benchmark explores a simple but important question: what happens when identical business questions are asked through different connection methods? Across 30 controlled configurations, Cyclr tested three SaaS systems, two models and three connection approaches to measure the one metric AI product teams feel fastest: token consumption. The result is clear. Server design has a major impact on cost, and the wrong design can also reduce reliability.
This report compares Thick and Thin MCP servers against Direct API calls (CLI), looking at both token counts and result accuracy.
This report is built for product, platform and engineering teams evaluating how to expose actions and data through MCP without creating unnecessary token overhead, bloated tool surfaces or fragile AI behavior. It shows why a well-scoped, typed MCP server can outperform both thicker MCP designs and raw direct API access.
In the report you will find:
- Comparison of Thick MCP, Thin MCP and Direct API connection methods
- Evidence that connection method can swing token cost by up to 4x
- Why output tokens are not the main cost driver in MCP workflows
- Benchmark findings on tool count, schema overhead and response bloat
- Why raw Direct API access was not the cheapest option in practice
- Evidence that trimming tool sets reduced cost without reducing accuracy
- Independent research supporting the link between tool surface area, token usage and model performance
- Six practical rules for designing a more efficient MCP server
Turn MCP server design into a product advantage
The benchmark found that Thin MCP designs used 75% fewer tokens per task than Thick MCP designs, while maintaining the same clean first-answer accuracy. It also found that Direct API was 58% more token-intensive than Thin MCP in this test setup and delivered the weakest first-answer performance. In other words, efficiency is not just about removing structure. It is about exposing the right structure.
For software teams, that turns MCP design into a product decision as much as a technical one. The right server shape helps control AI costs, reduces retry loops, improves the chances of the model choosing the right action first time, and makes it easier to expose useful capabilities without overloading context.
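To make the headline percentages concrete, the short sketch below converts them into relative token budgets. It is illustrative arithmetic only: the 1,000-token Thin MCP baseline is a hypothetical figure, not a benchmark result, and it assumes "58% more token-intensive" means 1.58 times the Thin MCP tokens per task.

```python
# Illustrative arithmetic only. The baseline is a hypothetical number chosen
# for readability; it is not a measurement from the benchmark.

THIN_SAVING_VS_THICK_PCT = 75   # Thin MCP used 75% fewer tokens than Thick MCP
DIRECT_EXTRA_VS_THIN_PCT = 58   # Direct API assumed to use 58% more than Thin MCP


def relative_token_costs(thin_tokens: float) -> dict[str, float]:
    """Derive Thick MCP and Direct API token counts from a Thin MCP baseline."""
    # "75% fewer" means Thin = 25% of Thick, so Thick = Thin * 100 / 25.
    thick_tokens = thin_tokens * 100 / (100 - THIN_SAVING_VS_THICK_PCT)
    # "58% more" is read here as Direct = Thin * 158 / 100 (an assumption).
    direct_tokens = thin_tokens * (100 + DIRECT_EXTRA_VS_THIN_PCT) / 100
    return {
        "thin_mcp": thin_tokens,
        "thick_mcp": thick_tokens,
        "direct_api": direct_tokens,
    }


if __name__ == "__main__":
    costs = relative_token_costs(1000)  # hypothetical baseline
    for method, tokens in costs.items():
        ratio = tokens / costs["thin_mcp"]
        print(f"{method}: {tokens:.0f} tokens ({ratio:.2f}x Thin MCP)")
```

Read this way, a 75% reduction is the same claim as the "up to 4x" swing listed in the report contents: Thick MCP sits at roughly four times the Thin MCP budget, with Direct API in between.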