
How TachiBot thinks.

From intent to a verified answer: how routing, planning, deliberation and verification fit together — and what happens on every single tool call.

Intelligent dispatch

Smart Routing & tachi

The tachi auto-router reads intent from your query, picks a mode, and assembles the right tool chain — so you never have to memorize 51 tool names.

AUTO-ROUTE · tachi → ToolExecutionService
QUERY → INTENT (keyword match) → mode (solve · research · judge · create) → TOOL CHAIN (selected + params) → ANSWER
Modes ranked by keyword priority: judge › architect › solve › research › verify › creative
  • solve: grok_code → gemini_judge. Generate a fix, then have a different model verify it.
  • research: perplexity → openai_reason. Pull live, cited facts, then reason over them.
  • create: gemini_brainstorm → openai. Diverge into many ideas, then refine the best.
  • judge: jury panel. Parallel jurors + a judge for any “which is best?”.
  • architect: grok_architect. A 4–16 agent swarm for system-level design.
  • verify: fresh-model audit. Check claims against evidence with adversarial eyes.
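
The keyword-priority routing described above can be sketched roughly as follows. This is a minimal illustration, not TachiBot's router: the trigger keywords, the `route` function and the `research` fallback are all assumptions, and the chain entries for judge and verify are the labels used in the text rather than confirmed tool ids. The text uses both "create" and "creative" for the same mode; the sketch uses "creative", as in the priority line.

```typescript
type Mode = "judge" | "architect" | "solve" | "research" | "verify" | "creative";

// Priority order as listed in the text: the first matching mode wins.
const PRIORITY: Mode[] = ["judge", "architect", "solve", "research", "verify", "creative"];

// Hypothetical trigger keywords; the real ones are not documented here.
const KEYWORDS: Record<Mode, string[]> = {
  judge: ["which is best", "pick the best"],
  architect: ["architecture", "system design"],
  solve: ["debug", "fix", "error"],
  research: ["latest", "sources"],
  verify: ["verify", "fact-check"],
  creative: ["brainstorm", "ideas"],
};

// Chains as described in the mode list above.
const CHAINS: Record<Mode, string[]> = {
  solve: ["grok_code", "gemini_judge"],
  research: ["perplexity", "openai_reason"],
  creative: ["gemini_brainstorm", "openai"],
  judge: ["jury panel"],
  architect: ["grok_architect"],
  verify: ["fresh-model audit"],
};

// Assumption: queries with no keyword match fall back to a default mode.
function route(query: string, fallback: Mode = "research"): { mode: Mode; chain: string[] } {
  const q = query.toLowerCase();
  const mode = PRIORITY.find((m) => KEYWORDS[m].some((k) => q.includes(k))) ?? fallback;
  return { mode, chain: CHAINS[mode] };
}
```

Because the priority list is scanned in order, a query that matches both judge and solve keywords resolves to judge, which is the "higher-stakes mode wins" behaviour the page describes.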
Council-based planning

The Planner Pipeline

planner_maker runs a 5-stage council — each step handled by a different model — then planner_runner executes the plan against goal-oriented verification gates that catch drift early.

PIPELINE · planner_maker
TASK → PLAN in five stages, each on a different model:
  1. Research: grok_search
  2. Analysis: qwen_coder
  3. Critique: openai_reason
  4. Synthesis: kimi_thinking
  5. Validate: gemini_judge
Context distilled to ~2.5k chars between steps (lost-in-middle mitigation)

Why a relay of different models?

planner_maker is a coordinator: instead of one model writing the whole plan, it returns one tool at a time and routes each stage to the model that's best at it. A single model would carry its own blind spots through every step; a relay lets a fresh perspective correct the previous one.
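
A coordinator that hands back one tool at a time might look like the sketch below. The stage-to-tool mapping and the `nextTool` field name come from this page; the `PlannerReply` shape, the `nextStep` function and the choice to treat the last stage's output as the plan are assumptions made for illustration.

```typescript
type Stage = { name: string; tool: string };

// Stage order and tools as listed for planner_maker.
const STAGES: Stage[] = [
  { name: "Research", tool: "grok_search" },
  { name: "Analysis", tool: "qwen_coder" },
  { name: "Critique", tool: "openai_reason" },
  { name: "Synthesis", tool: "kimi_thinking" },
  { name: "Validate", tool: "gemini_judge" },
];

// Hypothetical reply shape: either "call this tool next" or "here is the plan".
type PlannerReply =
  | { done: false; nextTool: string; stage: string; progress: string }
  | { done: true; plan: string };

// `completed` holds the (distilled) output of every stage finished so far.
function nextStep(completed: string[]): PlannerReply {
  if (completed.length >= STAGES.length) {
    // Assumption: the validated output of the final stage is the plan.
    return { done: true, plan: completed[completed.length - 1] };
  }
  const stage = STAGES[completed.length];
  return {
    done: false,
    nextTool: stage.tool,
    stage: stage.name,
    progress: `${completed.length + 1}/${STAGES.length}`,
  };
}
```

Because the coordinator only ever names the next tool, the caller stays in control between stages: it can show progress, stop early, or inspect an intermediate result before continuing.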

EXECUTION · planner_runner · verification gates
Evidence fed to gates: git diff · test results · changed files

  • 0% · Gemini “Sherlock”: Six deduction questions. Are we even building the right thing?
  • 10% · Grok: Early drift detection. Kill a wrong approach before it spreads.
  • 25% · GPT: Strategy validation. This gate can also amend the plan if needed.
  • 50% · Qwen: Goal alignment with a genuinely different reasoning style.
  • 80% · Kimi: Decomposes the remaining work so nothing is left dangling.
  • 100% · GPT + Gemini (dual): Two-judge final verdict + Reflexion Lite against the evidence.
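
One way to picture the gate schedule is as a table of thresholds that is checked whenever progress advances. The thresholds and judges below are taken from the list above; how planner_runner actually measures progress is not stated, so the `gatesDue` function and its percentage inputs are assumptions.

```typescript
type Gate = { at: number; judges: string[]; check: string };

// Thresholds and judges as listed for planner_runner.
const GATES: Gate[] = [
  { at: 0, judges: ["Gemini"], check: "right thing?" },
  { at: 10, judges: ["Grok"], check: "drift" },
  { at: 25, judges: ["GPT"], check: "strategy" },
  { at: 50, judges: ["Qwen"], check: "alignment" },
  { at: 80, judges: ["Kimi"], check: "remaining work" },
  { at: 100, judges: ["GPT", "Gemini"], check: "verdict" },
];

// Gates crossed when progress moves from `previous` (exclusive) to `current`
// (inclusive). Passing previous = -1 fires the 0% gate before work starts.
function gatesDue(previous: number, current: number): Gate[] {
  return GATES.filter((g) => g.at > previous && g.at <= current);
}
```

A jump from 20% to 55% would trigger both the 25% and 50% gates in order, so a fast-moving run cannot skip a check.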
Decision-Making Framework

How Seven Minds Think Together

A structured reasoning pipeline — ground in data, decompose, explore alternatives, stress test for holes, then judge.

01 · Ground Truth

Search real-time data from 4 providers. No thinking starts without facts.

Perplexity · Grok Search · Gemini Search · OpenAI Search

02 · Break Down

Decompose into atomic parts. Map dependencies, constraints, and execution order.

Kimi K2.6 · Qwen 235B · Qwen Algo

03 · Explore Paths

Generate alternative approaches from different training data and perspectives.

GPT-5.5 · Gemini · Grok · MiniMax

04 · Stress Test

Attack assumptions. Find holes, blind spots, and failure modes in every path.

GPT-5.5 Critic · Qwen Reason

05 · Judge

Synthesize the best elements from every model. Resolve conflicts. Score everything. Not 10? Here's why — and how to fix it.

Gemini 3.1 Pro · Kimi K2.6

Confidence: 91%
  • Code Quality: 8.4/10
    -1.6: Cyclomatic complexity >10 in handler
    Fix: Extract validation to utility class
  • Security: 7/10
    -3: No rate limiting on auth endpoints
    Fix: Add express-rate-limit, 5 req/min on /login
  • Performance: 9/10
    -1: O(n²) nested loop on unindexed array
    Fix: Use HashMap for O(1) lookup
Every deduction comes with a reason and a fix. No mystery scores.
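
The sample scores are consistent with a simple rule: start from 10 and subtract each deduction. The sketch below encodes that reading using the three sample criteria; the `Deduction` and `Criterion` types and the rounding to one decimal are assumptions, not the judge's actual scoring code.

```typescript
type Deduction = { points: number; reason: string; fix: string };
type Criterion = { name: string; deductions: Deduction[] };

// Sample data copied from the score card above.
const card: Criterion[] = [
  { name: "Code Quality", deductions: [{ points: 1.6, reason: "Cyclomatic complexity >10 in handler", fix: "Extract validation to utility class" }] },
  { name: "Security", deductions: [{ points: 3, reason: "No rate limiting on auth endpoints", fix: "Add express-rate-limit, 5 req/min on /login" }] },
  { name: "Performance", deductions: [{ points: 1, reason: "O(n²) nested loop on unindexed array", fix: "Use HashMap for O(1) lookup" }] },
];

// Score = max minus the sum of deductions, rounded to one decimal place.
function score(criterion: Criterion, max = 10): number {
  const lost = criterion.deductions.reduce((sum, d) => sum + d.points, 0);
  return Math.round((max - lost) * 10) / 10;
}

// One line per deduction, so no score is reported without its reason and fix.
function explain(criterion: Criterion): string[] {
  return criterion.deductions.map(
    (d) => `${criterion.name} ${score(criterion)}/10 | -${d.points}: ${d.reason} | Fix: ${d.fix}`,
  );
}
```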
Why these models

Each model was chosen for a specific strength. Different training data, different benchmarks, different blind spots.

  • Qwen 235B · 98% on HMMT (Harvard-MIT Mathematics Tournament): proof-based, multi-step olympiad problems that PhD students struggle with.
  • Gemini 3.1 Pro · 95% on AIME (American Invitational Mathematics Examination): a top US math competition on the qualifying path to the International Math Olympiad.
  • GPT-5.5 · 93.6% on GPQA Diamond: PhD-level science questions written by domain experts. Tests deep reasoning, not memorization.
  • Kimi K2.6 · #1 on SWE-Bench Pro: long-horizon, multi-file coding on real GitHub issues. 1T-param MoE, open-weights.
  • MiniMax M2.7 · 56.2% on SWE-Bench Pro: with embedded SCoT, reflexion, and ReAct techniques. Per-task temperatures for optimal code generation.
  • Qwen3-235B-Thinking · 91.4 on LiveCodeBench: O(1)-first algorithm analysis. Codeforces 2056 Elo. Thinking mode of the 235B MoE.
Real Workflow Output

5 Models. One Answer.

Real GPT-5 → GPT-5.5 migration analysis, ~3 minutes

Query

"Should I migrate from GPT-5 to GPT-5.5? Differences, breaking changes, migration steps."

  1. Version Discovery · Perplexity: GPT-5.5 released Nov 12; automatic migration, backward compatible
  2. Feature Comparison · Grok: 2 modes, 8 personalities, 25% better coding, 15% better factuality
  3. Migration Impact · GPT-5.5: No breaking changes; backward compatible until Q1 2026
  4. Performance Analysis · Gemini: 30% latency reduction, 10% fewer hallucinations
  5. Final Recommendation · Kimi K2.6: Low-risk upgrade; enable in staging, test 1 week, then production

Verdict: Safe to upgrade (92%)
No breaking changes, backward compatible, easy rollback
Tailored surfaces

Six profiles

Load only the tools you need. Set TACHIBOT_PROFILE to scale the surface from a lean 12 to the full 51.

  • minimal · 12 / 51 tools: Core reasoning, one strong model per job.
  • code_focus · 29 / 51 tools: The full coding & analysis suite.
  • research_power · 31 / 51 tools: Search, research & multi-source synthesis.
  • balanced · 39 / 51 tools: All major tools across every capability.
  • heavy_coding · 45 / 51 tools: Code, testing, architecture & decomposition.
  • full · 51 / 51 tools: Everything enabled, maximum capability.
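
Reading the profile from TACHIBOT_PROFILE could be handled along these lines. The profile names and tool counts are the ones listed above; the `resolveProfile` helper and the `balanced` fallback for unset or unknown values are assumptions, since the page does not say what the default is.

```typescript
// Profile names and tool counts as listed above.
const PROFILE_TOOL_COUNTS = {
  minimal: 12,
  code_focus: 29,
  research_power: 31,
  balanced: 39,
  heavy_coding: 45,
  full: 51,
} as const;

type Profile = keyof typeof PROFILE_TOOL_COUNTS;

function isProfile(value: string): value is Profile {
  return Object.prototype.hasOwnProperty.call(PROFILE_TOOL_COUNTS, value);
}

// Assumption: unknown or missing values fall back to a default profile.
function resolveProfile(raw: string | undefined, fallback: Profile = "balanced"): Profile {
  const value = (raw ?? "").trim().toLowerCase();
  return isProfile(value) ? value : fallback;
}
```

In a Node process this would be called as `resolveProfile(process.env.TACHIBOT_PROFILE)`, with the result deciding which tools get registered.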

Inside the engine

How a tool call flows

From your prompt to a frontier model and back, every request passes through the same six layers. The request flows down; the answer streams back up.

  1. MCP Client (Claude Code · stdio, Model Context Protocol)
     You ask in natural language; Claude emits a structured tool call and streams the result back to you.
  2. FastMCP Server (server.ts · safeAddTool())
     Registers tools by active profile + available API keys, then wraps every call with input validation, usage/cost tracking and a heartbeat.
  3. Tool (src/tools/*.ts · execute())
     The tool's own logic runs, e.g. jury, planner_maker, grok_reason. Simple tools call a provider directly; orchestrating tools drop into the next layer.
  4. Orchestration (ToolExecutionService · ModelProviderRegistry · ToolRouter · WorkflowEngine)
     Resolves model aliases → real tools, builds parameters, routes by intent, and drives multi-step / multi-round flows.
  5. Provider call (callGrok · callOpenAI · callGemini · callOpenRouter · callPerplexity)
     Normalizes the request, tries the OpenRouter gateway first, falls back to the direct API, applies per-model timeouts & reasoning effort.
  6. Model API (xAI · OpenAI · Google · OpenRouter · Perplexity)
     The actual frontier model runs and returns its answer, which travels back up the stack to you.
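
The "gateway first, then direct API" behaviour of the provider layer can be sketched as an ordered list of attempts, each bounded by a timeout. Everything here is hypothetical: `Attempt`, `withTimeout` and `callWithFallback` are illustrative names, not functions from TachiBot, and the real provider calls also normalize requests and set reasoning effort, which this sketch leaves out.

```typescript
// Hypothetical shape: one way of reaching a model (gateway or direct API).
type Attempt<T> = { via: string; run: () => Promise<T> };

// Rejects if `work` does not settle within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // always release the timer so the process can exit
  }
}

// Tries each attempt in order and returns the first success, noting its route.
async function callWithFallback<T>(
  attempts: Attempt<T>[],
  timeoutMs: number,
): Promise<{ via: string; value: T }> {
  const errors: string[] = [];
  for (const attempt of attempts) {
    try {
      const value = await withTimeout(attempt.run(), timeoutMs);
      return { via: attempt.via, value };
    } catch (err) {
      errors.push(`${attempt.via}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all attempts failed (${errors.join("; ")})`);
}
```

Passing the gateway attempt first and the direct attempt second reproduces the order described in layer 5; a per-model timeout would simply be a different `timeoutMs` per call.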
Plain language in

Who parses “ask grok & perplexity, then let gemini judge”?

How Claude unpacks the request
“ask grok & perplexity for the latest on the MCP spec, then let gemini judge”
  1. “ask grok … latest on the MCP spec” → grok_search
  2. “& perplexity” → perplexity_ask
  3. “then let gemini judge” → gemini_judge (gets 1 + 2)
Under the hood

How it holds together

Registry-driven, SOLID, and obsessive about token economy.

01 · ENTRY

Auto-routing dispatcher

tachi + focus read intent and orchestrate the right tools, modes and models for you.

02 · REGISTRIES

Provider & mode registries

ModelProviderRegistry maps 40+ aliases to tools; FocusModeRegistry adds reasoning modes without touching the core (OCP).

03 · COORDINATOR

One tool per step

The planner returns a single nextTool at a time — fully user-interruptible, with visible progress.

04 · CONTEXT

Distillation discipline

2.5k chars between steps, 6k for synthesis, truncated on clean boundaries — beating lost-in-the-middle recall loss.
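
Truncating "on clean boundaries" might be implemented roughly as below. The 2.5k and 6k budgets come from the text; the boundary preference (paragraph, then sentence, then word) and the rule that a boundary must keep at least half the budget are assumptions chosen for the sketch.

```typescript
const STEP_LIMIT = 2500;      // between pipeline steps
const SYNTHESIS_LIMIT = 6000; // for the synthesis stage

// Cuts `text` to at most `limit` characters, preferring a clean boundary.
function distill(text: string, limit: number = STEP_LIMIT): string {
  if (text.length <= limit) return text;
  const head = text.slice(0, limit);
  const floor = Math.floor(limit / 2); // never throw away more than half the budget
  const paragraph = head.lastIndexOf("\n\n");
  if (paragraph >= floor) return head.slice(0, paragraph);
  const sentence = head.lastIndexOf(". ");
  if (sentence >= floor) return head.slice(0, sentence + 1); // keep the full stop
  const word = head.lastIndexOf(" ");
  if (word >= floor) return head.slice(0, word);
  return head; // no usable boundary: hard cut
}
```

For the synthesis stage the same function would be called as `distill(text, SYNTHESIS_LIMIT)`.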

05 · WORKFLOWS

YAML state machine

Variable interpolation, dependency resolution, parallel steps, retries and live manifest artifacts.
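
Two of the engine features named here, variable interpolation and dependency resolution into parallel steps, can be illustrated on already-parsed workflow steps. The `Step` shape, the `${name}` placeholder syntax and the `dependsOn` field are assumptions for the sketch; the actual YAML schema is not shown on this page.

```typescript
// Hypothetical parsed form of one workflow step.
type Step = { id: string; tool: string; input: string; dependsOn?: string[] };

// Replaces ${name} placeholders; unknown names are left untouched.
function interpolate(template: string, vars: Record<string, string>): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match,
  );
}

// Groups steps into waves: every step in a wave can run in parallel because
// all of its dependencies finished in an earlier wave.
function executionWaves(steps: Step[]): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let pending = steps.slice();
  while (pending.length > 0) {
    const ready = pending.filter((s) => (s.dependsOn ?? []).every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("cyclic or unknown dependency");
    waves.push(ready.map((s) => s.id));
    ready.forEach((s) => done.add(s.id));
    pending = pending.filter((s) => !done.has(s.id));
  }
  return waves;
}

// Example: two searches in parallel, then a judge that receives both.
const exampleSteps: Step[] = [
  { id: "grok", tool: "grok_search", input: "latest on ${topic}" },
  { id: "pplx", tool: "perplexity_ask", input: "latest on ${topic}" },
  { id: "judge", tool: "gemini_judge", input: "compare the findings", dependsOn: ["grok", "pplx"] },
];
```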

06 · TRACKING

Cost & usage aware

Every call is wrapped by safeAddTool(): token tracking, validation, and a heartbeat to keep MCP alive.
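
A wrapper in the spirit of safeAddTool() could look like the sketch below. It is not the real function: `wrapTool`, the `Tool` shape and the `usage` map are hypothetical, and it tracks call counts and duration only, leaving out the token and cost accounting the page mentions.

```typescript
// Hypothetical tool shape: validate raw input, then execute.
type Tool<I> = {
  name: string;
  validate: (raw: unknown) => I; // throws on invalid input
  execute: (input: I) => Promise<string>;
};

type Usage = { calls: number; failures: number; totalMs: number };
const usage = new Map<string, Usage>();

// Wraps a tool with validation, usage bookkeeping and a periodic heartbeat.
function wrapTool<I>(tool: Tool<I>, onHeartbeat: () => void, heartbeatMs = 5000) {
  return async (raw: unknown): Promise<string> => {
    const input = tool.validate(raw); // invalid input is rejected before any work starts
    const stats = usage.get(tool.name) ?? { calls: 0, failures: 0, totalMs: 0 };
    usage.set(tool.name, stats);
    stats.calls += 1;
    const started = Date.now();
    const timer = setInterval(onHeartbeat, heartbeatMs); // keeps the client informed on long calls
    try {
      return await tool.execute(input);
    } catch (err) {
      stats.failures += 1;
      throw err;
    } finally {
      clearInterval(timer);
      stats.totalMs += Date.now() - started;
    }
  };
}
```

Putting the bookkeeping in one wrapper means individual tools stay free of tracking code, which matches the "every call wrapped" claim above.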

Two front doors

tachi vs focus

Both orchestrate many models — they differ in how much you decide. One picks the tools for you; the other hands you the controls.

tachi

The concierge

decides WHAT — for you

Describe the goal in plain language. tachi reads your intent, picks the right task mode, and runs a pre-baked tool chain — zero config.

  • Auto-routes by keyword priority (ties resolve to the higher-stakes mode)
  • Each mode is an outcome recipe — generate, then verify
  • One-shot & stateless — returns the answer plus what it ran
Try it
› tachi "debug this null-pointer error"
› tachi "microservices vs monolith for 10M users"
› tachi "which is best: React, Vue or Svelte?"

solve · research · verify · creative · architect · judge

focus

The cockpit

you decide HOW

You pick the reasoning strategy and the panel. focus runs a controlled, multi-round deliberation exactly the way you specify.

  • You choose a process mode — how the models think together
  • Knobs: domain · rounds · models · temperature · ping-pong style
  • Multi-round & stateful — sessions resume via continue_focus
Try it
› /focus architecture-debate Redis vs Memcached
› /focus research React 19 new features
› /focus deep-reasoning scale to 10k connections

deep-reasoning · architecture-debate · debug · code-brainstorm · research · analyze

| | tachi | focus |
|---|---|---|
| You provide | a query | a query + a chosen mode |
| Picks the mode | automatically (intent routing) | you do, explicitly |
| Mode type | task outcomes (solve, judge…) | reasoning processes (debate, deep…) |
| Rounds | one-shot recipe | multi-round (default 5), ping-pong |
| Config knobs | almost none | domain · rounds · models · temp · style |
| State | stateless | session-based, resumable |
| Reach for it when | “just handle this, pick the tools” | “reason this way, with these models” |

Built in public. Backed by stars.

TachiBot is open source and actively maintained. If it helps your workflow, a star helps us keep going.
