Kimi K2.7 Code vs MiniMax M3: Open-Source AI Coding Models Compared (June 2026)
Last Updated: June 12, 2026
June 2026 delivered two of the most significant open-source AI model releases this year. Moonshot AI's Kimi K2.7 Code arrived on June 12, while MiniMax dropped M3 on June 1. Both target developers building AI-powered coding tools, but they take fundamentally different approaches. K2.7 Code optimizes for token efficiency on coding tasks. M3 pushes the frontier on context length and multimodality. Here's how they compare.
What Are Kimi K2.7 Code and MiniMax M3?
Kimi K2.7 Code is Moonshot AI's latest open-source coding model (released June 12, 2026), built on a 1 trillion parameter Mixture-of-Experts architecture with 32 billion active parameters. It reduces reasoning token usage by 30% compared to its predecessor K2.6, making long autonomous coding sessions more cost-effective.
MiniMax M3 (released June 1, 2026) is MiniMax's open-weights multimodal foundation model featuring a 1 million token context window and native support for text, image, and video inputs. It uses a proprietary MiniMax Sparse Attention (MSA) architecture that achieves 9× faster prefilling and 15× faster decoding at long context lengths compared to its predecessor.
Head-to-Head: Architecture and Specs
The two models take very different architectural approaches to solving similar problems:
Kimi K2.7 Code:
- 1 trillion total parameters (Mixture-of-Experts)
- 32 billion active parameters per token
- 384 experts, 8 activated per token
- 256K context window
- MLA attention with SwiGLU
- Text-only (coding specialist)
- Modified MIT License
MiniMax M3:
- Open-weights (exact parameter count not disclosed)
- MiniMax Sparse Attention (MSA) architecture
- 1 million token context window (512K guaranteed via API)
- Native multimodal: text + image + video input
- Computer use capabilities (OSWorld: 70.0%)
- Custom commercial license
The key difference: K2.7 Code is a specialist coding model that does one thing extremely efficiently. M3 is a generalist with frontier coding ability plus massive context and multimodality.
Coding Benchmarks: Where Each Model Wins
For developers evaluating these models for coding workflows, the benchmarks tell a clear story:
SWE-Bench Pro (real-world software engineering):
- MiniMax M3: 59.0%
- Kimi K2.7 Code: Not yet reported (K2.6 scored 58.6%)
- GPT-5.5: ~55%
- Claude Opus 4.8: 69.2%
Terminal-Bench 2.x (command-line coding tasks):
- MiniMax M3: 66.0% (Terminal-Bench 2.1)
- Kimi K2.6: 66.7% (Terminal-Bench 2.0)
Kimi Code Bench v2:
- Kimi K2.7 Code: 62.0
- Kimi K2.6: 50.9
Program Bench:
- Kimi K2.7 Code: 53.6
- Kimi K2.6: 48.3
SWE-fficiency (code efficiency metric):
- MiniMax M3: 34.8%
KernelBench Hard (CUDA/kernel optimization):
- MiniMax M3: 28.8%
The verdict: On pure coding benchmarks, both models are competitive with GPT-5.5 but trail Claude Opus 4.8. MiniMax M3 has a slight edge on SWE-Bench Pro, while K2.7 Code shows dramatic improvements over its predecessor on internal coding benchmarks.
Token Efficiency vs Context Length: The Real Trade-Off
This is where the choice between these models becomes practical:
Kimi K2.7 Code's advantage — token efficiency:
- 30% fewer reasoning tokens than K2.6
- Preserve_thinking mode retains full reasoning across multi-turn interactions
- Same deployment infrastructure as K2.5 and K2.6
- OpenAI-compatible API
MiniMax M3's advantage — context and multimodality:
- 1M token context window (4× larger than K2.7 Code)
- 1/20 the per-token compute cost at 1M context vs previous generation
- Native image and video understanding (MMMU-Pro: ~80%)
- BrowseComp: 83.5 (autonomous browsing and research)
- Computer use: 70.0% on OSWorld-Verified
For teams running long autonomous coding sessions where cost is the primary concern, K2.7 Code's 30% token reduction is compelling. For teams that need to process entire codebases, documentation sets, or multimodal inputs alongside coding tasks, M3's 1M context window is the differentiator.
Pricing and Accessibility
Kimi K2.7 Code:
- Available on Hugging Face ( Modified MIT License)
- Kimi platform API
- Same hardware requirements as K2.6 (~64GB+ VRAM for FP16)
- Requires
transformers >= 4.57.1
MiniMax M3:
- Promotional pricing: $0.30 per 1M input tokens, $1.20 per 1M output tokens
- Available via MiniMax API, OpenRouter
- Open-weights with custom commercial license
- MiniMax Code CLI and desktop agent
MiniMax M3 is notably cheap at current promotional pricing — roughly 5-10% of the cost of comparable closed-source models according to VentureBeat reporting.
Agentic Capabilities: Beyond Single-Turn Coding
Both models are designed for multi-step autonomous workflows, but they emphasize different aspects:
K2.7 Code inherits K2.6's agent swarm capabilities (up to 300 sub-agents, 4,000 coordinated steps, 12-hour runs). The token efficiency improvement makes these long runs significantly more affordable.
M3 demonstrated impressive autonomous capabilities in MiniMax's testing:
- Independently reproduced an ICLR 2025 Outstanding Paper over 12 hours (18 commits, 23 experimental figures)
- Optimized an FP8 GEMM CUDA kernel over 24 hours (147 benchmark submissions, 1,959 tool calls, 9.4× speedup)
- Strong BrowseComp performance (83.5) suggests capable research agents
Which Model Should You Choose?
Choose Kimi K2.7 Code if:
- You're already using K2.5 or K2.6 and want a drop-in upgrade
- Token cost optimization is your primary concern
- You need OpenAI API compatibility
- Your coding tasks are text-only
Choose MiniMax M3 if:
- You need to process entire codebases or documentation in a single context
- Your workflow involves images, screenshots, or video alongside code
- You want the cheapest frontier-level coding model available
- You need autonomous computer use or browsing capabilities
Use both if:
- K2.7 Code for high-volume routine coding tasks (cheaper per-token)
- M3 for complex multi-modal tasks requiring massive context (broader capability)
The Bigger Picture: Open-Source Is Closing the Gap
Both releases reinforce a trend that's accelerating in 2026: open-source models are no longer catching up to frontier closed-source models — they're competing directly.
MiniMax M3 matches or exceeds GPT-5.5 on SWE-Bench Pro (59.0% vs ~55%). Kimi K2.7 Code nearly matches GPT-5.5 on multi-language coding benchmarks (35.1 vs 35.5 on MLS Bench Lite). These are gaps measured in single-digit percentages, not the 20-30% gaps we saw a year ago.
For development teams, the practical implication is clear: the cost premium for closed-source coding models is getting harder to justify. Two open-source models released within two weeks of each other now offer frontier-level coding at a fraction of the price.
Frequently Asked Questions
What is the difference between Kimi K2.7 Code and MiniMax M3?
Kimi K2.7 Code is a specialist coding model focused on token efficiency (30% fewer reasoning tokens than K2.6) with a 256K context window. MiniMax M3 is a generalist multimodal model with a 1M context window that also handles images and video. Both are open-source/open-weights and competitive with GPT-5.5 on coding benchmarks.
Which is better for coding: Kimi K2.7 Code or MiniMax M3?
On SWE-Bench Pro, MiniMax M3 scores 59.0% while Kimi K2.6 (K2.7's predecessor) scored 58.6%. The models are roughly matched on pure coding benchmarks. K2.7 Code uses 30% fewer reasoning tokens, making it cheaper for high-volume coding. M3 offers a 1M context window, better for processing entire codebases.
Is MiniMax M3 free to use?
MiniMax M3 is open-weights with promotional API pricing of $0.30 per 1M input tokens and $1.20 per 1M output tokens. The model can also be run locally. Check the license terms for commercial use conditions.
Is Kimi K2.7 Code free to use?
Kimi K2.7 Code is released under a Modified MIT License that permits commercial use with attribution. Weights are available on Hugging Face and can be run on your own infrastructure.
Can MiniMax M3 process images and video?
Yes, MiniMax M3 is natively multimodal, supporting text, image, and video inputs. It scored approximately 80% on MMMU-Pro, comparable to GPT-5.5. This makes it suitable for tasks like understanding UI screenshots alongside code.

