TachiBot

What's New

Changelog and release history for TachiBot MCP. Every release brings new models, tools, and orchestration capabilities.

v2.19.1 (March 21, 2026)

MiniMax M2.7 — Self-Evolving AI

  • MiniMax M2.5 → M2.7 — 2,300B MoE (100B active), 200K context. #1 on Artificial Analysis Intelligence Index
  • SWE-Pro 56.22% — matches GPT-5.3-Codex. Multi-SWE-Bench 52.7% (#1, beats Opus 4.6 and GPT-5.4)
  • Same pricing — $0.30/$1.20 per M tokens. Massive quality leap at zero extra cost

v2.18.0 (March 21, 2026) · Major Release

AI That Proves Its Work

Stop babysitting LLMs. Deploy a pipeline that reads actual files, cross-examines across five models, and demands passing tests before moving forward.

  • Absolute goal alignment — define success criteria once. The engine verifies every step against your exact goals — drift gets caught at step 1, not step 50
  • No blind spots reach production — 5-model rotation cross-examines code: Gemini deduces, Grok detects drift, GPT validates strategy, Qwen cross-checks, Kimi decomposes
  • Hard evidence, not hallucinated progress — checkpoints demand raw git diffs, passing test results, and modified file lists. Zero reliance on paraphrased summaries
  • Never hit a dead end — structured amendment protocol detects drift, proposes revisions with evidence and impact analysis. You approve before it pivots
  • 39 tools operate in reality — every analysis tool reads actual source code from disk via the files parameter. Models judge implementations, not stories about them
  • Your project gets smarter every run — post-completion reflexion saves architectural lessons to your devlog. Knowledge compounds across sessions

v2.17.2 (March 21, 2026)

Files Parameter Rollout + Smart File Reader

  • files on 8 more tools — grok_architect, grok_brainstorm, openai_explain, openai_search, kimi_code, kimi_long_context, gemini_judge, gemini_brainstorm
  • Directory expansion — pass src/tools/ to read all code files in a directory
  • Smart char budget — multi-file reads distribute tokens across files to prevent context overflow
  • 23 of 37 tools now support the files parameter
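
As a rough sketch of how the files parameter and char budget could fit together (the payload shape and the even-split rule are assumptions for illustration, not TachiBot's actual implementation):

```python
# Hypothetical tool-call payload using the `files` parameter.
# A directory entry like "src/tools/" expands to every code file inside it.
call = {
    "tool": "gemini_judge",
    "arguments": {
        "perspectives": "correctness, maintainability",
        "files": ["src/tools/", "README.md"],
    },
}

def distribute_char_budget(paths, total_chars):
    """Split a total character budget evenly across files so that
    multi-file reads cannot overflow the model's context."""
    per_file = total_chars // len(paths)
    return {path: per_file for path in paths}

budget = distribute_char_budget(["a.py", "b.py", "c.py"], 24_000)
```

A real implementation would likely weight the split by file size rather than dividing evenly.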

v2.17.1 (March 21, 2026)

Smart Task Decomposition

  • kimi_decompose readability overhaul — output now uses OVERVIEW / STRUCTURE / DETAILS / RISKS sections
  • Smart decomposition — infers context, constraints, risks, and measurable criteria automatically
  • Reasoning leak fixed — strips Kimi K2.5 chain-of-thought from output
  • Tuned for format adherence — temp 0.3, 4500 tokens, 360s timeout

v2.17.0 (March 21, 2026)

GPT-5.4-mini + Model Cleanup

  • GPT-5.4-mini — new fast coding model (400K context, $0.75/$4.50 per 1M tokens, SWE-Bench 54.4%)
  • Code tasks upgraded — openai_code_review now uses gpt-5.4-mini — 94% of flagship quality, 70% cheaper
  • GPT-5.3 series retired — gpt-5.3-codex and gpt-5.3 removed; coding capabilities absorbed into gpt-5.4
  • Simplified lineup — gpt-5.4 (flagship), gpt-5.4-mini (coding/fast), gpt-5.4-pro (expert)

v2.16.1 (March 6, 2026)

Gemini 3.1 Pro Migration

  • Gemini 3.1 Pro — migrated from gemini-3-pro-preview to gemini-3.1-pro-preview before March 9 retirement
  • 1M context window — enhanced reasoning capabilities with Gemini 3.1 Pro
  • Stale entries removed — cleaned up old display names and pricing for retired model

v2.16.0 (March 6, 2026)

GPT-5.4 Upgrade + Brainstorm Fix

  • GPT-5.4 default — most capable model (Mar 2026), $2.50/$15 per 1M tokens
  • GPT-5.4-pro — expert model with higher compute ($30/$180 per 1M tokens)
  • GPT-5.3-codex — new agentic coding model for code review tasks
  • Gemini 3.1 Flash-Lite — added as fastest/cheapest option in 3.1 series
  • openai_brainstorm fixed — eliminated fragile duplicate API function; now uses shared retry/fallback logic
  • Token limits bumped — GPT-5.4 reasoning tokens eat into output limit; all OpenAI tools now have higher defaults

v2.15.6 (February 26, 2026)

Full Audit: 6 More Fixes + Cost Optimization

  • 6 tools had required enum anti-pattern — usage_stats, openrouter_multi, gemini_judge, planner_maker, planner_runner, create_workflow. All fixed
  • gemini_judge — had zero required params. Made perspectives required as primary content param
  • perplexity_reason downgraded — sonar-pro ($3/$15/M) → sonar-reasoning ($1/$5/M), 3x cheaper
  • perplexity_research removed — sonar-deep-research ($5/$25/M) was burning $12 in 3 days
  • All 51 tools audited — zero remaining required enum violations
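
The shape of the fix can be sketched with two JSON-Schema-style parameter definitions (property names and enum values here are illustrative, not the tools' real schemas):

```python
# Before: the anti-pattern. Clients had to guess a value for a
# required `task` enum, with no free-text parameter to carry content.
broken = {
    "type": "object",
    "properties": {"task": {"enum": ["review", "refactor", "explain"]}},
    "required": ["task"],
}

# After: free-text `query` is the required primary parameter,
# and `task` becomes an optional enum with a default.
fixed = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "task": {"enum": ["review", "refactor", "explain"], "default": "review"},
    },
    "required": ["query"],
}
```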

v2.15.5 (February 26, 2026)

Tool Parameter Fixes + Gemini Stability

  • Fixed parameter validation on qwen_coder, kimi_code, minimax_code — AI clients were misusing required enum task param. Added query as required primary param, made task optional with defaults
  • kimi_long_context — task enum now optional (default: analyze)
  • Gemini 3.1 → 3.0 rollback — Reverted to stable gemini-3-pro-preview (3.1 had timeout/503 issues)
  • Gemini timeout 30s → 90s — Pro models need longer than Flash

v2.15.0 (February 12, 2026)

31 Prompt Techniques + /blueprint Skill + MiniMax M2.5

  • 9 new prompt techniques — reflexion (Shinn 2023), react (Yao 2022), scot (Li 2025, +13.79% HumanEval), pre_mortem, rubber_duck, test_driven, pre_post, bdd_spec, least_to_most. Total: 31 techniques
  • /blueprint skill — Multi-model council → bite-sized TDD implementation plans. 7-step pipeline: Grok search → Qwen+Kimi analysis → GPT pre-mortem → Gemini final TDD output
  • MiniMax M2.5 — SWE-Bench 80.2% (was 72.5%). Embedded SCoT, reflexion, rubber_duck techniques. Per-task temperatures
  • Planner → writing-plans bridge — planner_maker now outputs bite-sized TDD steps (exact files, test-first, commit points)
  • Enhanced skills — /breakdown uses least_to_most + pre_mortem, /judge adds pre-mortem to critique, /decompose adds contracts, /prompt auto-recommends from 30 intents
  • 51 tools across 7 providers, 9 skills for Claude Code

v2.14.7 (February 5, 2026)

Gemini Judge + Multi-Model Jury

  • gemini_judge — Science-backed LLM-as-a-Judge evaluation (arXiv:2411.15594). 4 modes: synthesize, evaluate, rank, resolve
  • jury — Multi-model jury panel. Configurable jurors (grok, openai, qwen, kimi, perplexity, minimax) run in parallel, Gemini synthesizes verdict. Based on "Replacing Judges with Juries" (Cohere, arXiv:2404.18796)
  • Perplexity fix — sonar-pro model ID corrected (was using lightweight sonar by mistake)
  • perplexity_research — Removed in v2.15.6 (cost too high)
  • 51 tools across 7 providers in the full profile
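
A minimal sketch of the jury pattern (the juror functions here are stand-ins, not TachiBot's real provider clients):

```python
import asyncio

async def juror(name, question):
    # In the real tool this would call a model; here we just echo.
    return f"{name}: verdict on {question!r}"

async def run_jury(question, jurors=("grok", "qwen", "kimi")):
    # Jurors deliberate in parallel...
    opinions = await asyncio.gather(*(juror(j, question) for j in jurors))
    # ...then a synthesizer (Gemini in TachiBot) merges the verdicts.
    return {"opinions": list(opinions), "verdict": " | ".join(opinions)}

result = asyncio.run(run_jury("is this design sound?"))
```

Running jurors concurrently means total latency tracks the slowest juror rather than the sum of all of them.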

v2.14.6 (February 5, 2026)

Qwen3-Coder-Next

  • qwen_coder upgraded — Qwen3-Coder-Next (80B/3B MoE, 262K context, SWE-Bench >70%)
  • 3x cheaper — $0.07/$0.30 per M tokens (was $0.22/$0.88)
  • 2x context — 262K tokens (was 131K)
  • Auto-fallback — Falls back to legacy 480B coder on provider failure
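
The auto-fallback behavior could look something like this sketch (model IDs and the error type are illustrative assumptions):

```python
# Try the new model first; fall back to the legacy coder on failure.
def call_with_fallback(prompt, call, models=("qwen3-coder-next", "qwen3-coder-legacy")):
    last_error = None
    for model in models:
        try:
            return call(model, prompt)
        except RuntimeError as err:  # stand-in for a provider error
            last_error = err
    raise last_error

# A flaky provider stub: the primary model fails, the legacy one succeeds.
def flaky(model, prompt):
    if model == "qwen3-coder-next":
        raise RuntimeError("provider unavailable")
    return f"{model}: ok"

answer = call_with_fallback("sort a list", flaky)
```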

v2.14.5 (February 2, 2026)

Claude Code Integration

  • Tool annotations — All 51 tools now have MCP-standard annotations for better discovery
  • Token overhead reduced — Stripped ANSI formatting, clean plain text output
  • 25K character safety net — Smart truncation prevents Claude Code context overflow
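
A smart-truncation helper might keep the head and tail of oversized output and mark the cut; the head/tail split here is an assumption, not the actual strategy:

```python
def truncate(text, limit=25_000, marker="\n[...truncated...]\n"):
    """Cap output at `limit` chars, preserving the start and end."""
    if len(text) <= limit:
        return text
    keep = limit - len(marker)
    head = keep * 2 // 3          # bias toward the beginning
    tail = keep - head
    return text[:head] + marker + text[-tail:]

short = truncate("small output")
long = truncate("x" * 30_000)
```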

v2.10 (January 28, 2026)

Multi-Model Planner

  • Multi-model council creates verified implementation plans
  • Model roles — Grok searches ground truth, Qwen analyzes feasibility, GPT-5.2 critiques, Gemini scores quality
  • New models — Kimi K2.5 (multimodal + agent swarm), MiniMax M2.5 (SWE-Bench 80.2%)
  • New tools — qwen_reason, minimax_code, minimax_agent, gemini_search
  • Smart routing — Tool routing based on availability, cost, and quality
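
Routing on availability, cost, and quality can be sketched as a weighted score over candidates (the candidate table and weights below are made up for illustration):

```python
CANDIDATES = [
    {"tool": "qwen_coder",   "available": True,  "cost": 0.07, "quality": 0.70},
    {"tool": "minimax_code", "available": True,  "cost": 0.30, "quality": 0.85},
    {"tool": "kimi_code",    "available": False, "cost": 0.15, "quality": 0.75},
]

def route(candidates, cost_weight=0.3, quality_weight=0.7):
    """Pick the best available tool: higher quality raises the score,
    higher cost lowers it, and unavailable tools are skipped."""
    usable = [c for c in candidates if c["available"]]
    return max(usable, key=lambda c: quality_weight * c["quality"] - cost_weight * c["cost"])

best = route(CANDIDATES)
```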

v2.8 (January 20, 2026)

Prompt Techniques

  • FocusExecutionService for clean mode orchestration
  • 22 research-backed techniques — first_principles, tree_of_thoughts, council_of_experts, and more
  • Preview before execute — See enhanced prompts before running them
  • Heartbeat support for long-running operations

v2.7.9 (January 2, 2026)

Search Grounding

  • qwen_algo — O(1)-first algorithm analysis with Qwen3-235B-Thinking (235B MoE, LiveCodeBench 91.4)
  • gemini_search — Google Search grounding with dynamic retrieval
  • Format utilities for consistent output across tools

v2.3 (December 28, 2025)

Enhanced Thinking

  • nextThought with finalJudge — Auto-call judge model when session completes
  • Context aliases — Use "none", "recent", "all" instead of magic numbers
  • Context distillation — Compress 8000+ tokens to ~500 (~16x savings)
  • usage_stats tool for tracking tool usage and costs
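
The context aliases could resolve roughly like this (the alias-to-count mapping is an assumption for illustration):

```python
def resolve_context(alias, history):
    """Map a context alias to a slice of prior thoughts instead of
    forcing callers to pass magic numbers."""
    sizes = {"none": 0, "recent": 5, "all": len(history)}
    count = sizes.get(alias)
    if count is None:
        raise ValueError(f"unknown context alias: {alias!r}")
    return history[-count:] if count else []

history = [f"thought {i}" for i in range(10)]
recent = resolve_context("recent", history)
```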

v2.1 (November 25, 2025)

Gateway Mode

  • OpenRouter Gateway — One API key for all models
  • Unified billing through OpenRouter

v2.0 (October 15, 2025)

Major Rewrite

  • Multi-model orchestration rebuilt from scratch
  • Tool profiles for context control
  • YAML workflow engine with variable interpolation
  • 6 AI providers, 31+ tools (now 51 tools across 7 providers)
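
Variable interpolation in a YAML workflow engine can be sketched in a few lines; the `${var}` syntax is an assumption about TachiBot's template format:

```python
import re

def interpolate(template, variables):
    """Replace ${name} placeholders in a workflow step with values,
    failing loudly on undefined variables."""
    def sub(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined workflow variable: {name}")
        return str(variables[name])
    return re.sub(r"\$\{(\w+)\}", sub, template)

step = interpolate("Review ${file} with ${model}", {"file": "main.py", "model": "gemini"})
```

Raising on undefined variables (rather than silently leaving the placeholder) surfaces workflow typos at the first run.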
