What's New
Changelog and release history for TachiBot MCP. Every release brings new models, tools, and orchestration capabilities.
v2.16.1 (March 6, 2026)
Gemini 3.1 Pro Migration
- Gemini 3.1 Pro — migrated from gemini-3-pro-preview to gemini-3.1-pro-preview before the March 9 retirement
- 1M context window — enhanced reasoning capabilities with Gemini 3.1 Pro
- Stale entries removed — cleaned up old display names and pricing for the retired model
v2.16.0 (March 6, 2026)
GPT-5.4 Upgrade + Brainstorm Fix
- GPT-5.4 default — most capable model (Mar 2026), $2.50/$15 per 1M tokens
- GPT-5.4-pro — expert model with higher compute ($30/$180 per 1M tokens)
- GPT-5.3-codex — new agentic coding model for code review tasks
- Gemini 3.1 Flash-Lite — added as fastest/cheapest option in 3.1 series
- openai_brainstorm fixed — eliminated a fragile duplicate API function; now uses shared retry/fallback logic
- Token limits bumped — GPT-5.4 reasoning tokens eat into the output limit; all OpenAI tools now have higher defaults
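The shared retry/fallback pattern mentioned above can be sketched as a single wrapper that every tool routes through, instead of each tool duplicating its own API call. This is an illustrative sketch, not TachiBot's actual implementation; the function name, provider interface, and error type are assumptions.

```python
import time

def call_with_fallback(providers, prompt, retries=2, backoff=0.5):
    """Try each provider callable in order; retry transient failures
    with exponential backoff before moving to the next provider."""
    last_error = None
    for call in providers:
        for attempt in range(retries + 1):
            try:
                return call(prompt)
            except RuntimeError as err:  # stand-in for transient API errors
                last_error = err
                time.sleep(backoff * (2 ** attempt))
    raise RuntimeError(f"all providers failed: {last_error}")
```

A tool like openai_brainstorm would then pass its preferred model first and a cheaper fallback second, and the retry behavior stays consistent across every tool that uses the wrapper.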
v2.15.6 (February 26, 2026)
Full Audit: 6 More Fixes + Cost Optimization
- 6 tools had the required-enum anti-pattern — usage_stats, openrouter_multi, gemini_judge, planner_maker, planner_runner, create_workflow. All fixed
- gemini_judge — had zero required params; made perspectives required as the primary content param
- perplexity_reason downgraded — sonar-pro ($3/$15 per 1M tokens) → sonar-reasoning ($1/$5 per 1M tokens), 3x cheaper
- perplexity_research removed — sonar-deep-research ($5/$25 per 1M tokens) was burning $12 in 3 days
- All 51 tools audited — zero remaining required-enum violations
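The required-enum anti-pattern above is worth spelling out: when a tool's only required parameter is an enum, AI clients have no field for the actual content, so they stuff the query into the enum and validation fails. A hedged sketch of the schema shape before and after the fix (the parameter names and enum values here are illustrative, not the real TachiBot schemas):

```python
# Anti-pattern: the only required field is an enum, so clients have
# nowhere to put the real request and misuse the enum slot.
before = {
    "type": "object",
    "properties": {"task": {"enum": ["review", "refactor", "explain"]}},
    "required": ["task"],
}

# Fix: a free-text primary param is required; the enum becomes
# optional with a sensible default.
after = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "What to analyze"},
        "task": {"enum": ["review", "refactor", "explain"], "default": "review"},
    },
    "required": ["query"],
}
```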
v2.15.5 (February 26, 2026)
Tool Parameter Fixes + Gemini Stability
- Fixed parameter validation on qwen_coder, kimi_code, minimax_code — AI clients were misusing the required enum task param. Added query as the required primary param; made task optional with defaults
- kimi_long_context — task enum now optional (default: analyze)
- Gemini 3.1 → 3.0 rollback — reverted to stable gemini-3-pro-preview (3.1 had timeout/503 issues)
- Gemini timeout 30s → 90s — Pro models need longer than Flash
v2.15.0 (February 12, 2026)
31 Prompt Techniques + /blueprint Skill + MiniMax M2.5
- 9 new prompt techniques — reflexion (Shinn 2023), react (Yao 2022), scot (Li 2025, +13.79% HumanEval), pre_mortem, rubber_duck, test_driven, pre_post, bdd_spec, least_to_most. Total: 31 techniques
- /blueprint skill — multi-model council → bite-sized TDD implementation plans. 7-step pipeline: Grok search → Qwen+Kimi analysis → GPT pre-mortem → Gemini final TDD output
- MiniMax M2.5 — SWE-Bench 80.2% (was 72.5%). Embedded scot, reflexion, rubber_duck techniques. Per-task temperatures
- Planner → writing-plans bridge — planner_maker now outputs bite-sized TDD steps (exact files, test-first, commit points)
- Enhanced skills — /breakdown uses least_to_most + pre_mortem, /judge adds pre-mortem to critique, /decompose adds contracts, /prompt auto-recommends from 30 intents
- 51 tools across 7 providers, 9 skills for Claude Code
v2.14.7 (February 5, 2026)
Gemini Judge + Multi-Model Jury
- gemini_judge — science-backed LLM-as-a-Judge evaluation (arXiv:2411.15594). 4 modes: synthesize, evaluate, rank, resolve
- jury — multi-model jury panel. Configurable jurors (grok, openai, qwen, kimi, perplexity, minimax) run in parallel; Gemini synthesizes the verdict. Based on "Replacing Judges with Juries" (Cohere, arXiv:2404.18796)
- Perplexity fix — sonar-pro model ID corrected (was using the lightweight sonar by mistake)
- perplexity_research — removed in v2.15.6 (cost too high)
- 51 tools across 7 providers in the full profile
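The jury pattern, jurors queried concurrently and a separate model synthesizing their answers, can be sketched with a thread pool. This is a minimal illustration of the idea, assuming each juror and the synthesizer are plain callables; TachiBot's real juror interface is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor

def run_jury(jurors, question, synthesize):
    """Fan the question out to every juror in parallel, collect their
    verdicts by name, then hand the full set to a synthesizer."""
    with ThreadPoolExecutor(max_workers=len(jurors)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in jurors.items()}
        verdicts = {name: f.result() for name, f in futures.items()}
    return synthesize(verdicts)
```

Running jurors in parallel means panel latency is bounded by the slowest juror rather than the sum of all of them, which is the practical payoff of a jury over sequential judging.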
v2.14.6 (February 5, 2026)
Qwen3-Coder-Next
- qwen_coder upgraded — Qwen3-Coder-Next (80B/3B MoE, 262K context, SWE-Bench >70%)
- 3x cheaper — $0.07/$0.30 per 1M tokens (was $0.22/$0.88)
- 2x context — 262K tokens (was 131K)
- Auto-fallback — Falls back to legacy 480B coder on provider failure
v2.14.5 (February 2, 2026)
Claude Code Integration
- Tool annotations — All 51 tools now have MCP-standard annotations for better discovery
- Token overhead reduced — Stripped ANSI formatting, clean plain text output
- 25K character safety net — Smart truncation prevents Claude Code context overflow
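"Smart truncation" typically means keeping both the head and the tail of oversized output rather than cutting blindly at the limit, since the end of a model response (conclusions, verdicts) is often as important as the start. A sketch under that assumption; the split ratio and marker text are illustrative, not TachiBot's actual values:

```python
def truncate_output(text, limit=25_000, marker="\n…[truncated]…\n"):
    """Cap output at `limit` characters, preserving the head and tail."""
    if len(text) <= limit:
        return text
    keep = limit - len(marker)
    head = keep * 2 // 3          # favor the beginning of the output...
    tail = keep - head            # ...but keep the ending too
    return text[:head] + marker + text[-tail:]
```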
v2.10 (January 28, 2026)
Multi-Model Planner
- Multi-model council creates verified implementation plans
- Model roles — Grok searches ground truth, Qwen analyzes feasibility, GPT-5.2 critiques, Gemini scores quality
- New models — Kimi K2.5 (multimodal + agent swarm), MiniMax M2.5 (SWE-Bench 80.2%)
- New tools — qwen_reason, minimax_code, minimax_agent, gemini_search
- Smart routing — tool routing based on availability, cost, and quality
v2.8 (January 20, 2026)
Prompt Techniques
- FocusExecutionService for clean mode orchestration
- 22 research-backed techniques — first_principles, tree_of_thoughts, council_of_experts, and more
- Preview before execute — see enhanced prompts before running them
- Heartbeat support for long-running operations
v2.7.9 (January 2, 2026)
Search Grounding
- qwen_algo — O(1)-first algorithm analysis with Qwen3-235B-Thinking (235B MoE, LiveCodeBench 91.4)
- gemini_search — Google Search grounding with dynamic retrieval
- Format utilities for consistent output across tools
v2.3 (December 28, 2025)
Enhanced Thinking
- nextThought with finalJudge — Auto-call judge model when session completes
- Context aliases — use "none", "recent", "all" instead of magic numbers
- Context distillation — compress 8000+ tokens to ~500 (~16x savings)
- usage_stats tool for tracking tool usage and costs
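The context aliases above map readable names onto slices of session history so callers never hard-code counts. A minimal sketch of the resolver, assuming "recent" means the last few thoughts; the actual window size in TachiBot may differ:

```python
def resolve_context(alias, history):
    """Map a context alias onto the slice of prior thoughts to include."""
    if alias == "none":
        return []
    if alias == "recent":
        return history[-3:]   # assumed window; the real default may differ
    if alias == "all":
        return list(history)
    raise ValueError(f"unknown context alias: {alias}")
```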
v2.1 (November 25, 2025)
Gateway Mode
- OpenRouter Gateway — One API key for all models
- Unified billing through OpenRouter
v2.0 (October 15, 2025)
Major Rewrite
- Multi-model orchestration rebuilt from scratch
- Tool profiles for context control
- YAML workflow engine with variable interpolation
- 6 AI providers, 31+ tools (now 51 tools across 7 providers)
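Variable interpolation in a YAML workflow engine usually means substituting `${name}` placeholders in step strings with values from a variables map before execution. A sketch of that mechanism under those assumptions; the placeholder syntax shown here is a common convention, not necessarily TachiBot's exact grammar:

```python
import re

def interpolate(template, variables):
    """Replace ${name} placeholders with values from the variables map,
    failing loudly on any undefined variable."""
    def sub(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined workflow variable: {name}")
        return str(variables[name])
    return re.sub(r"\$\{(\w+)\}", sub, template)
```

Failing on undefined variables, rather than silently leaving placeholders in place, catches workflow typos before a step ever reaches a model.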