How TachiBot thinks.

From intent to a verified answer: how routing, planning, deliberation and verification fit together — and what happens on every single tool call.

Intelligent dispatch

Smart Routing & tachi

The tachi auto-router reads intent from your query, picks a mode, and assembles the right tool chain — so you never have to memorize 51 tool names.

AUTO-ROUTE · tachi → ToolExecutionService

QUERY → INTENT (keyword match) → mode (solve · research · judge · create) → TOOL CHAIN (selected + params) → ANSWER

Modes ranked by keyword priority: judge › architect › solve › research › verify › creative

solve

grok_code → gemini_judge

Generate a fix, then have a different model verify it.

research

perplexity → openai_reason

Pull live, cited facts, then reason over them.

create

gemini_brainstorm → openai

Diverge into many ideas, then refine the best.

judge

jury panel

Parallel jurors + a judge for any “which is best?”.

architect

grok_architect

A 4–16 agent swarm for system-level design.

verify

fresh-model audit

Check claims against evidence with adversarial eyes.
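The priority rule and the mode cards above can be sketched as a small lookup. This is a minimal illustration, not TachiBot's actual router: the keyword lists, the `route` function, the fallback to `solve`, and the `fresh_model_audit` placeholder are all assumptions. Only the priority order and the tool chains come from the text, and the mode the cards call create is spelled `creative` here to match the priority list.

```typescript
// Hypothetical sketch of keyword-priority intent routing.
type Mode = "judge" | "architect" | "solve" | "research" | "verify" | "creative";

// Priority order as listed in the text: earlier entries win when several match.
const PRIORITY: Mode[] = ["judge", "architect", "solve", "research", "verify", "creative"];

// Illustrative trigger phrases (assumed; the real keyword lists are not documented here).
const KEYWORDS: Record<Mode, string[]> = {
  judge: ["which is best", "rank"],
  architect: ["architecture", "microservices", "system design"],
  solve: ["debug", "fix", "error"],
  research: ["latest", "cited", "look up"],
  verify: ["verify", "fact-check"],
  creative: ["brainstorm", "ideas"],
};

// Tool chains per mode, copied from the mode cards.
const CHAINS: Record<Mode, string[]> = {
  solve: ["grok_code", "gemini_judge"],
  research: ["perplexity", "openai_reason"],
  creative: ["gemini_brainstorm", "openai"],
  judge: ["jury"],
  architect: ["grok_architect"],
  verify: ["fresh_model_audit"], // placeholder name, not a documented tool
};

function route(query: string): { mode: Mode; chain: string[] } {
  const q = query.toLowerCase();
  // Walk modes in priority order; the first mode with any keyword hit wins.
  const hit = PRIORITY.find((m) => KEYWORDS[m].some((k) => q.includes(k)));
  const mode: Mode = hit ?? "solve"; // assumed fallback when nothing matches
  return { mode, chain: CHAINS[mode] };
}

console.log(route("debug this null-pointer error").chain.join(" -> "));
```

Under these assumed keywords, a query that mentions both "which is best" and "fix" lands in `judge`, because `judge` sits ahead of `solve` in the priority list.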

Council-based planning

The Planner Pipeline

planner_maker runs a 5-stage council — each step handled by a different model — then planner_runner executes the plan against goal-oriented verification gates that catch drift early.

PIPELINE · planner_maker

TASK → Research (grok_search) → Analysis (qwen_coder) → Critique (openai_reason) → Synthesis (kimi_thinking) → Validate (gemini_judge) → PLAN

Context distilled to ~2.5k chars between steps (lost-in-middle mitigation)
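Distillation between steps could look like the helper below. It is a sketch under stated assumptions: the `distill` name and the boundary rules (last line break or sentence end inside the limit) are hypothetical, and the real pipeline may summarize rather than truncate. Only the roughly 2.5k-character budget comes from the text.

```typescript
// Hypothetical helper: cap text at `limit` characters, cutting at the last
// line break or sentence end inside the limit so the next model never
// receives a half-finished sentence.
function distill(text: string, limit = 2500): string {
  if (text.length <= limit) return text;
  const head = text.slice(0, limit);
  // lastIndexOf(". ") points at the period; +1 keeps the period in the result.
  const cut = Math.max(head.lastIndexOf("\n"), head.lastIndexOf(". ") + 1);
  // Fall back to a hard cut when the window contains no clean boundary.
  return (cut > 0 ? head.slice(0, cut) : head).trimEnd();
}

const long = "First finding. " + "x".repeat(5000);
console.log(distill(long, 40)); // keeps only the complete first sentence
```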

Why a relay of different models?

planner_maker is a coordinator: instead of one model writing the whole plan, it returns one tool at a time and routes each stage to the model that's best at it. A single model would carry its own blind spots through every step; a relay lets a fresh perspective correct the previous one.
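The coordinator pattern can be sketched as a function that hands back one step at a time instead of running the whole pipeline. The stage order and tool names are taken from the pipeline listing; the `PlannerStep` shape and the `nextStep` function are illustrative assumptions, not the planner's real interface.

```typescript
// Stage order and tool names as listed for planner_maker.
const STAGES = [
  { stage: "Research", tool: "grok_search" },
  { stage: "Analysis", tool: "qwen_coder" },
  { stage: "Critique", tool: "openai_reason" },
  { stage: "Synthesis", tool: "kimi_thinking" },
  { stage: "Validate", tool: "gemini_judge" },
] as const;

// Hypothetical payload returned to the caller after each step.
interface PlannerStep {
  done: boolean;
  stage?: string;
  nextTool?: string;
  context: string;
}

// Returns only the next tool to run; the caller executes it and calls again,
// which is what keeps the flow interruptible between stages.
function nextStep(completed: number, carriedContext: string): PlannerStep {
  const current = STAGES[completed];
  if (!current) return { done: true, context: carriedContext };
  return { done: false, stage: current.stage, nextTool: current.tool, context: carriedContext };
}

console.log(nextStep(0, "task: add rate limiting").nextTool);
```

Because each call names a single tool, the client can show progress, pause, or abandon the plan between any two stages.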

EXECUTION · planner_runner · verification gates

Gemini · right thing? → 10% → Grok · drift → 25% → GPT · strategy → 50% → Qwen · alignment → 80% → Kimi · remaining → 100% → GPT+Gemini · verdict

Evidence fed to gates: git diff · test results · changed files

Gemini “Sherlock”

Six deduction questions — are we even building the right thing?

10%

Grok

Early drift detection — kill a wrong approach before it spreads.

25%

GPT

Strategy validation — and can amend the plan if needed.

50%

Qwen

Goal alignment with a genuinely different reasoning style.

80%

Kimi

Decomposes the remaining work so nothing is left dangling.

100%

GPT + Gemini (dual)

Two-judge final verdict + Reflexion Lite against the evidence.
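One way to read the gate schedule: the runner tracks progress as a percentage and fires a gate whenever a checkpoint is crossed. This is a minimal sketch, assuming progress is reported as a number from 0 to 100. The `gatesDue` helper is hypothetical and deliberately says nothing about which model sits at which checkpoint.

```typescript
// Checkpoint percentages as shown in the gate timeline.
const CHECKPOINTS = [10, 25, 50, 80, 100];

// Returns every checkpoint crossed since the last progress report, so a
// large jump in progress still triggers each gate it passed over.
function gatesDue(previousPct: number, currentPct: number): number[] {
  return CHECKPOINTS.filter((c) => c > previousPct && c <= currentPct);
}

console.log(gatesDue(0, 30)); // checkpoints 10 and 25 were crossed
```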

Decision-Making Framework

How Seven Minds Think Together

A structured reasoning pipeline — ground in data, decompose, explore alternatives, stress test for holes, then judge.

01 · Ground Truth

Search real-time data from 4 providers. No thinking starts without facts.

Perplexity · Grok Search · Gemini Search · OpenAI Search

02 · Break Down

Decompose into atomic parts. Map dependencies, constraints, and execution order.

Kimi K2.7-Code · Qwen 235B · Qwen Algo

03 · Explore Paths

Generate alternative approaches from different training data and perspectives.

GPT-5.5 · Gemini · Grok · MiniMax

04 · Stress Test

Attack assumptions. Find holes, blind spots, and failure modes in every path.

GPT-5.5 Critic · Qwen Reason

05 · Judge

Synthesize the best elements from every model. Resolve conflicts. Score everything. Not 10? Here's why — and how to fix it.

Gemini 3.1 Pro · Kimi K2.7-Code

Confidence: 91%

Code Quality: 8.4/10

-1.6 — Cyclomatic complexity >10 in handler

Fix: Extract validation to utility class

Security: 7/10

-3 — No rate limiting on auth endpoints

Fix: Add express-rate-limit, 5 req/min on /login

Performance: 9/10

-1 — O(n²) nested loop on unindexed array

Fix: Use HashMap for O(1) lookup

Every deduction comes with a reason and a fix. No mystery scores.
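The scorecard arithmetic is simple enough to show directly: each category starts at 10 and loses the listed points. The `Deduction` shape and `score` function below are an assumed representation for illustration; the numbers are the ones from the sample scorecard.

```typescript
// Hypothetical shape for one explained deduction.
interface Deduction {
  points: number;
  reason: string;
  fix: string;
}

// Score = max minus the sum of deductions, rounded to one decimal place
// to avoid floating-point noise such as 8.399999999999999.
function score(deductions: Deduction[], max = 10): number {
  const lost = deductions.reduce((sum, d) => sum + d.points, 0);
  return Math.round((max - lost) * 10) / 10;
}

const codeQuality: Deduction[] = [
  { points: 1.6, reason: "Cyclomatic complexity >10 in handler", fix: "Extract validation to utility class" },
];
const security: Deduction[] = [
  { points: 3, reason: "No rate limiting on auth endpoints", fix: "Add express-rate-limit, 5 req/min on /login" },
];

console.log(score(codeQuality), score(security)); // 8.4 7
```

Keeping the reason and fix next to the points means a score can never be reported without its explanation.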

Why these models

Each model was chosen for a specific strength. Different training data, different benchmarks, different blind spots.

98%

Qwen 235B

HMMT — Harvard-MIT math tournament. Proof-based, multi-step olympiad problems that PhD students struggle with.

95%

Gemini 3.1 Pro

AIME: American Invitational Mathematics Examination. A top US math competition and a qualifying round for the USA Mathematical Olympiad.

93.6%

GPT-5.5

GPQA Diamond — PhD-level science questions written by domain experts. Tests deep reasoning, not memorization.

Kimi K2.7-Code

SWE-bench Pro leader — long-horizon, multi-file coding on real GitHub issues. 1T-param MoE, open-weights.

MiniMax M3

Token context — MSA sparse attention (~1/20 compute at 1M). Embedded SCoT, reflexion, and ReAct techniques with per-task temperatures.

91.4

Qwen3-235B-Thinking

LiveCodeBench — O(1)-first algorithm analysis. Codeforces 2056 Elo. Thinking mode of the 235B MoE.

Real Workflow Output

5 Models. One Answer.

Real GPT-5 → GPT-5.5 migration analysis, ~3 minutes

Query

"Should I migrate from GPT-5 to GPT-5.5? Differences, breaking changes, migration steps."

Version Discovery · Perplexity

GPT-5.5 released Nov 12 — automatic migration, backward compatible

Feature Comparison · Grok

2 modes, 8 personalities, 25% better coding, 15% better factuality

Migration Impact · GPT-5.5

No breaking changes — backward compatible until Q1 2026

Performance Analysis · Gemini

30% latency reduction, 10% fewer hallucinations

Final Recommendation · Kimi K2.7-Code

Low-risk upgrade — enable in staging, test 1 week, then production

Verdict: Safe to upgrade

No breaking changes, backward compatible, easy rollback

92%


Tailored surfaces

Six profiles

Load only the tools you need. Set TACHIBOT_PROFILE to scale the surface from a lean 12 to the full 51.

minimal

13/ 51 tools

Core reasoning, one strong model per job.

code_focus

42/ 51 tools

The full coding & analysis suite.

research_power

35/ 51 tools

Search, research & multi-source synthesis.

balanced

53/ 51 tools

All major tools across every capability.

heavy_coding

57/ 51 tools

Code, testing, architecture & decomposition.

full

64/ 51 tools

Everything enabled — maximum capability (default).
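Profile selection can be sketched as a lookup on the environment variable. The six profile names and the `full` default come from the text. The `resolveProfile` function, the case-insensitive matching, and the fallback to `full` for unrecognized values are assumptions, not documented behavior.

```typescript
// Profile names as listed above.
const PROFILES = ["minimal", "code_focus", "research_power", "balanced", "heavy_coding", "full"] as const;
type Profile = (typeof PROFILES)[number];

// Takes the environment as a plain object so the sketch has no Node dependency;
// in a Node process you would pass process.env.
function resolveProfile(env: Record<string, string | undefined>): Profile {
  const raw = env["TACHIBOT_PROFILE"]?.trim().toLowerCase();
  const match = PROFILES.find((p) => p === raw);
  // "full" is described as the default; using it for unknown values is an assumption.
  return match ?? "full";
}

console.log(resolveProfile({ TACHIBOT_PROFILE: "code_focus" }));
```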

Inside the engine

How a tool call flows

From your prompt to a frontier model and back, every request passes through the same six layers. The request flows down; the answer streams back up.

MCP Client

Claude Code · stdio (Model Context Protocol)

You ask in natural language; Claude emits a structured tool call and streams the result back to you.

FastMCP Server

server.ts · safeAddTool()

Registers tools by active profile + available API keys, then wraps every call with input validation, usage/cost tracking and a heartbeat.

Tool

src/tools/*.ts · execute()

The tool's own logic runs — e.g. jury, planner_maker, grok_reason. Simple tools call a provider directly; orchestrating tools drop into the next layer.

Orchestration

ToolExecutionService · ModelProviderRegistry · ToolRouter · WorkflowEngine

Resolves model aliases → real tools, builds parameters, routes by intent, and drives multi-step / multi-round flows.

Provider call

callGrok · callOpenAI · callGemini · callOpenRouter · callPerplexity

Normalizes the request, tries the OpenRouter gateway first, falls back to the direct API, applies per-model timeouts & reasoning effort.

Model API

xAI · OpenAI · Google · OpenRouter · Perplexity

The actual frontier model runs and returns its answer — which travels back up the stack to you.
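The provider layer's gateway-first behavior can be illustrated with a small wrapper. This is a sketch, not the real `callGrok` or `callOpenRouter` code: `withTimeout`, `callWithFallback`, and the `Call` type are hypothetical names, and the real layer may retry, log, or pick timeouts per model in ways not shown here.

```typescript
// Any provider call, reduced to prompt in, text out.
type Call = (prompt: string) => Promise<string>;

// Rejects if `p` does not settle within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Try the gateway first; on any failure or timeout, fall back to the direct API.
async function callWithFallback(prompt: string, gateway: Call, direct: Call, timeoutMs: number): Promise<string> {
  try {
    return await withTimeout(gateway(prompt), timeoutMs);
  } catch {
    return withTimeout(direct(prompt), timeoutMs);
  }
}

// Stub providers for demonstration: the gateway is down, the direct API answers.
const gatewayDown: Call = async () => { throw new Error("gateway unavailable"); };
const directOk: Call = async (prompt) => `direct:${prompt}`;

callWithFallback("ping", gatewayDown, directOk, 1000).then((answer) => console.log(answer));
```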

Plain language in

Who parses “ask grok & perplexity, then let gemini judge”?

How Claude unpacks the request

“ask grok & perplexity for the latest on the MCP spec, then let gemini judge”

1. “ask grok … latest on the MCP spec” → grok_search

2. “& perplexity” → perplexity_ask

3. “then let gemini judge” → gemini_judge (gets 1 + 2)
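The data flow of those three steps, two independent lookups feeding one judge, can be sketched as follows. In practice Claude issues the tool calls itself, so this is only an illustration of the dependency shape. The `Tool` type, `askThenJudge`, and the stub tools are hypothetical, and running steps 1 and 2 concurrently is an assumption.

```typescript
// Any tool, reduced to text in, text out.
type Tool = (input: string) => Promise<string>;

async function askThenJudge(question: string, sources: Tool[], judge: Tool): Promise<string> {
  // Steps 1 and 2 do not depend on each other, so they can run concurrently.
  const answers = await Promise.all(sources.map((source) => source(question)));
  // Step 3 receives both earlier outputs, numbered so the judge can cite them.
  const brief = answers.map((a, i) => `[${i + 1}] ${a}`).join("\n");
  return judge(`${question}\n\n${brief}`);
}

// Stubs standing in for grok_search, perplexity_ask and gemini_judge.
const grokSearch: Tool = async (q) => `grok result for: ${q}`;
const perplexityAsk: Tool = async (q) => `perplexity result for: ${q}`;
const geminiJudge: Tool = async (input) => `verdict based on ${input.split("\n").length - 2} sources`;

askThenJudge("latest on the MCP spec", [grokSearch, perplexityAsk], geminiJudge)
  .then((verdict) => console.log(verdict));
```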

Under the hood

How it holds together

Registry-driven, SOLID, and obsessive about token economy.

01 · ENTRY

Auto-routing dispatcher

tachi + focus read intent and orchestrate the right tools, modes and models for you.

02 · REGISTRIES

Provider & mode registries

ModelProviderRegistry maps 40+ aliases to tools; FocusModeRegistry adds reasoning modes without touching the core (OCP).

03 · COORDINATOR

One tool per step

The planner returns a single nextTool at a time — fully user-interruptible, with visible progress.

04 · CONTEXT

Distillation discipline

2.5k chars between steps, 6k for synthesis, truncated on clean boundaries — beating lost-in-the-middle recall loss.

05 · WORKFLOWS

YAML state machine

Variable interpolation, dependency resolution, parallel steps, retries and live manifest artifacts.

06 · TRACKING

Cost & usage aware

Every call wrapped by safeAddTool() — token tracking, validation, heartbeat to keep MCP alive.
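The wrapping idea behind `safeAddTool()` can be illustrated with a small higher-order function. This is a sketch under stated assumptions, not the real signature: `wrapTool`, the `Usage` shape, and the `onHeartbeat` callback are hypothetical, and call count plus elapsed time stand in for the token and cost tracking the text describes.

```typescript
interface Usage { calls: number; totalMs: number }
type Execute = (args: Record<string, unknown>) => Promise<string>;

// Wraps a tool's execute() with validation, usage tracking and a heartbeat.
function wrapTool(
  name: string,
  required: string[],
  execute: Execute,
  usage: Map<string, Usage>,
  onHeartbeat: () => void,
  heartbeatMs = 5000,
): Execute {
  return async (args) => {
    // Input validation: fail fast before any provider is called.
    const missing = required.filter((key) => args[key] === undefined);
    if (missing.length > 0) throw new Error(`${name}: missing ${missing.join(", ")}`);

    const started = Date.now();
    // Periodic heartbeat so a long-running call does not look dead to the client.
    const beat = setInterval(onHeartbeat, heartbeatMs);
    try {
      return await execute(args);
    } finally {
      clearInterval(beat);
      const prior = usage.get(name) ?? { calls: 0, totalMs: 0 };
      usage.set(name, { calls: prior.calls + 1, totalMs: prior.totalMs + (Date.now() - started) });
    }
  };
}

const usage = new Map<string, Usage>();
const echo = wrapTool("echo", ["text"], async (a) => String(a["text"]), usage, () => {});
echo({ text: "hello" }).then((out) => console.log(out, usage.get("echo")?.calls));
```

Putting the bookkeeping in one wrapper means individual tools stay free of tracking code, and a failed call is still counted because the update runs in `finally`.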

Two front doors

tachi vs focus

Both orchestrate many models — they differ in how much you decide. One picks the tools for you; the other hands you the controls.

tachi

The concierge

decides WHAT — for you

Describe the goal in plain language. tachi reads your intent, picks the right task mode, and runs a pre-baked tool chain — zero config.

→ Auto-routes by keyword priority (ties resolve to the higher-stakes mode)
→ Each mode is an outcome recipe: generate, then verify
→ One-shot & stateless: returns the answer plus what it ran

Try it

› tachi "debug this null-pointer error"
› tachi "microservices vs monolith for 10M users"
› tachi "which is best: React, Vue or Svelte?"

solve · research · verify · creative · architect · judge

focus

The cockpit

you decide HOW

You pick the reasoning strategy and the panel. focus runs a controlled, multi-round deliberation exactly the way you specify.

→ You choose a process mode: how the models think together
→ Knobs: domain · rounds · models · temperature · ping-pong style
→ Multi-round & stateful: sessions resume via continue_focus

Try it

› /focus architecture-debate Redis vs Memcached
› /focus research React 19 new features
› /focus deep-reasoning scale to 10k connections

deep-reasoning · architecture-debate · debug · code-brainstorm · research · analyze

	tachi	focus
You provide	a query	a query + a chosen mode
Picks the mode	automatically (intent routing)	you do, explicitly
Mode type	task outcomes — solve, judge…	reasoning processes — debate, deep…
Rounds	one-shot recipe	multi-round (default 5), ping-pong
Config knobs	almost none	domain · rounds · models · temp · style
State	stateless	session-based, resumable
Reach for it when	“just handle this — pick the tools”	“reason this way, with these models”

Open source

Built in public. Actively maintained.

TachiBot is open source and actively maintained — new releases ship regularly. If it helps your workflow, a star helps others find it.

Star on GitHub · Report a bug or request a feature