Sakana Fugu vs SmophyAI: Two Approaches to Automatic AI Model Selection in 2026

SmophyAI Team · June 23, 2026 · 9 min read

When Sakana AI launched Fugu in June 2026, it validated something that a small number of AI products had already been building quietly: automatic model routing - the idea that a system can read a prompt, determine what kind of task it is, and send it to the model best equipped to handle it - is genuinely valuable, and the manual alternative does not scale.

The interesting thing about Fugu's launch is not that it is a new idea. It is that a well-funded research lab with two ICLR 2026 papers behind it has now put a commercial API around the concept. That gives the category legitimacy it previously lacked in enterprise conversations.

For anyone evaluating AI routing solutions in 2026, two products sit at different ends of the same spectrum. Understanding the difference is more useful than a head-to-head benchmark.

What Sakana Fugu Actually Is

Fugu is a multi-agent system delivered as a single OpenAI-compatible API endpoint. Under the hood, it dynamically orchestrates a pool of specialized frontier models - coordinating them through learned collaboration patterns rather than hand-designed workflows. The architecture is grounded in two peer-reviewed papers: TRINITY, a coordinator that assigns Thinker, Worker, and Verifier roles to different models across turns, and Conductor, a reinforcement-learning-trained orchestrator that discovers natural-language coordination strategies.

The result is a system that, on complex multi-step tasks, can outperform any single frontier model. Fugu Ultra coordinates a deeper pool of expert agents to maximize answer quality on hard, high-stakes problems, and the benchmark data published by Sakana AI shows meaningful leads on SWE-bench Pro (73.7% vs Claude Opus 4.8's 69.2% and GPT-5.5's 58.6%), LiveCodeBench (93.2% - which actually leads even Anthropic's restricted Claude Fable 5 at 89.8%), and GPQA Diamond (95.5% vs Gemini's 94.3% and Mythos Preview's 94.6%).

Two important caveats: these are Sakana AI's own reported benchmarks, and Fugu Ultra's pricing starts at $5/1M input tokens and $30/1M output tokens on pay-as-you-go - significantly higher than individual frontier model pricing. The subscription plan mirrors ChatGPT's structure: Standard $20/month, Pro $100/month, Max $200/month.

One structural limitation worth knowing: Fugu is not yet available in the EU/EEA while Sakana works toward GDPR compliance. For European users or teams with EU data requirements, that is a current blocker.

What SmophyAI's Smophy Mode Is

SmophyAI built automatic model routing as a consumer feature - Smophy Mode - before Fugu existed as a commercial product. The mechanics are different: where Fugu uses multi-agent orchestration with a learned coordinator, Smophy Mode uses a smart routing system that analyzes each prompt and assigns it to the single best model from a pool of six frontier models - the latest from OpenAI, Anthropic, Google, xAI, Perplexity, and DeepSeek, updated in real time as each provider releases new versions.

The key distinction: Smophy Mode is one option within a broader product. Users can toggle between Smophy Mode, which provides automatic single-model routing, and the side-by-side multi-model comparison view, which supports manual selection with all six models in parallel, within the same interface. The same subscription also includes Image Studio, Video Studio, Writing Studio, and Business Tools - it is a complete AI workspace, not a routing API.

SmophyAI workspace showing Smophy Mode and multi-model AI tools

The Core Architectural Difference

This is the most important thing to understand about the two products - they solve the routing problem differently, and each approach has genuine advantages.

Fugu's multi-agent orchestration breaks complex tasks into components and assigns different agents to different parts - a Thinker reasons about the problem, a Worker executes, a Verifier checks the output. On genuinely hard, multi-step problems - reproducing a research paper, running a security assessment end-to-end, solving complex coding challenges - this architecture produces better outputs than any single model can. The benchmark leads are real, though they are Sakana's own numbers.

SmophyAI's single-model routing makes a different bet: for most everyday knowledge work, the right move is to identify which single frontier model is best suited to the task and send it there - not to orchestrate multiple agents. This is faster, cheaper per query, and easier to reason about. When you want to see all six models' answers simultaneously, the side-by-side view is one toggle away.

Neither architecture is universally better. The right choice depends on the task type and user profile.

Who Each Product Is Actually For

Sakana Fugu is built for:

Developers integrating AI into products via API
Teams running complex, multi-step agentic workflows
Enterprise users who want OpenAI-compatible infrastructure with multi-agent performance
Power users comfortable with token-based pricing and API interfaces
Non-EU teams, since EU availability is still pending

SmophyAI is built for:

Knowledge workers who want automatic routing without technical setup
Founders, marketers, consultants, and creators who use AI across multiple modalities daily
Users who want routing plus image generation, video, writing, and business tools in one subscription
Anyone who wants to toggle between auto-routing and manual multi-model comparison
EU and global users, with full availability

Cost-conscious users should also look carefully at usage economics. Fugu Ultra's pay-as-you-go pricing starts at $30/1M output tokens and heavy sessions can reach $10+ per query. SmophyAI's Starter plan at $19.98/month covers unlimited model access, all studios, and 4 million tokens. For heavier workloads, Pro at $54.98 includes 11M tokens and full video generation, while Pro Plus at $79.98 scales to 16M tokens and 4,000 standard images per month - all still significantly cheaper than running equivalent Fugu Ultra sessions at scale.

Benchmark Reality Check

The Fugu benchmark data deserves scrutiny before citing it. The numbers are from Sakana AI's own technical report - independently reproduced validation on all benchmarks does not yet exist at the time of writing. That is not unusual for a newly launched product, but it means the headline numbers, particularly SWE-bench Pro 73.7%, should be treated as directionally credible, not definitively confirmed.

What is independently confirmed: the underlying frontier models Fugu orchestrates, including Claude Opus 4.8, GPT-5.5, and Gemini 3.1 Pro, have well-documented benchmark performance that we have covered throughout this cluster. Multi-agent systems like Fugu's TRINITY architecture have peer-reviewed support at ICLR 2026 for outperforming single-model baselines on complex tasks. The mechanism is sound - the specific numbers are Sakana's claim.

SmophyAI's routing performance is not benchmark-published in the same way - it is a consumer product, not a research system. The argument for Smophy Mode is practical rather than benchmark-driven: for everyday knowledge work, consistent routing to the right model produces better outputs than inconsistent manual selection.

The Honest Comparison

Category	Sakana Fugu	SmophyAI Smophy Mode
Best for	Complex agentic tasks, API integration	Everyday knowledge work, full workspace
Routing type	Multi-agent orchestration	Single-model smart routing
Interface	API only	Consumer chat + studios
Model pool	Frontier models, pool partially disclosed	6 frontier models - latest from OpenAI, Anthropic, Google, xAI, Perplexity, DeepSeek - updated as new versions release
EU availability	Not yet	Yes
Pricing	$20-$200/mo subscription, or pay-as-you-go ($5 input / $30 output per 1M tokens for Fugu Ultra - heavy sessions can reach $10+ per query)	From $19.98/mo Starter - Pro $54.98, Pro Plus $79.98, Business $249.98 - full workspace including chat, images, video, writing, business tools at every tier
Images / Video / Writing	No	Yes
Side-by-side comparison	No	Yes, with one toggle

What Fugu's Launch Means for the Category

The most significant thing about Fugu is not Fugu itself - it is what its launch signals. A well-funded lab with peer-reviewed research behind it has decided that automatic model routing is important enough to build a commercial product around. That validates the category in enterprise conversations where "is this a real thing?" is still a blocking question.

For SmophyAI users, the practical implication is straightforward: the concept of automatic model selection that powers Smophy Mode now has academic and commercial legitimacy from an independent source. The two products serve different users and will coexist. What they share is the same foundational insight - that the manual model selection habit is a tax on your time and your output quality, and that a system with more data than your habit can make that decision better.