Ornith 1.0: Self-Improving Open-Source Models for Agentic Coding
Ornith 1.0 is a family of MIT-licensed LLMs from DeepReinforce that jointly learn to solve coding tasks and build their own scaffolds. From 9B edge-deployable to 397B MoE rivaling Claude Opus 4.7 — run locally with vLLM, Ollama, or LM Studio.
Jun 25, 2026
Release Date
9B → 397B
Model Sizes
MIT
License
82.4%
SWE-Bench
What Is Ornith 1.0?
Ornith 1.0 is a family of open-source large language models built by DeepReinforce AI specifically for agentic coding. Released on June 25, 2026, Ornith 1.0 spans four parameter sizes — 9B Dense, 31B Dense, 35B MoE, and 397B MoE — all under MIT license with no regional restrictions. The name Ornith comes from the ancient Greek word for bird, and like a bird building its own nest, Ornith 1.0 learns to construct its own scaffolding before solving coding tasks.
The core innovation behind Ornith 1.0 is self-scaffolding reinforcement learning. Traditional coding agents rely on human-designed harnesses — fixed workflows for tool calls, error recovery, and task decomposition. Ornith 1.0 treats the scaffold as a learnable object that co-evolves with the model's policy during RL training. This means Ornith 1.0 generates its own task plans, launches tools, inspects intermediate results, and rewrites failing steps without human intervention.
At flagship scale, Ornith 1.0-397B achieves 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified, surpassing Claude Opus 4.7 and outperforming every other open-source model of comparable size. The smaller Ornith 1.0-35B MoE scores 64.2 on Terminal-Bench 2.1 — beating Qwen 3.5-397B (53.5) with a fraction of the parameters. All Ornith 1.0 variants are available on Hugging Face with FP8, GGUF, and bf16 weights.
How Ornith 1.0 Works
Ornith 1.0 introduces a self-improving training framework that sets it apart from standard coding models:
Self-Improving RL Training
Unlike standard fine-tuning, Ornith 1.0 jointly optimizes both the solution code and the scaffold (task plan, tool calls, error recovery) during reinforcement learning.
Scaffold Co-Evolution
The model learns to construct its own orchestration framework — which tools to call, when to retry, how to decompose complex tasks — rather than relying on human-designed harnesses.
Anti-Reward Hacking
Three layers of safeguards — fixed trust boundary, deterministic monitor, and frozen LLM judge — prevent the model from gaming benchmark scores during RL training.
Reasoning + Tool Calls
Every response starts with a thinking block before the final answer. The models emit well-formed tool calls for agent loops, compatible with any OpenAI-format agent framework.
Ornith 1.0 Use Cases
Ornith 1.0 excels in scenarios where agentic coding agents need to autonomously plan, execute, and recover from errors. Here are the top Ornith 1.0 applications:
Multi-File Refactoring
Ornith 1.0 generates its own task plans, launches tools, inspects intermediate results, and rewrites failing steps — ideal for repository-scale refactoring across dozens of files.
Bug Localization & Fixes
The self-scaffolding approach lets Ornith 1.0 systematically search a codebase, narrow down root causes, and produce test-driven patches with minimal human intervention.
Terminal-Based Agents
Ornith 1.0 is optimized for terminal-native coding agents. Works directly with Claude Code, OpenHands, OpenClaw, and Hermes Agent out of the box.
Edge / Offline Coding
The 9B and 35B models run on consumer hardware — a gaming GPU or MacBook Pro — giving you a fully private, offline AI coding assistant with no API costs.
Ornith 1.0 Benchmark Highlights
Ornith 1.0-397B is the top open-source agentic coding model on major benchmarks. See the full benchmark page for all model sizes.
| Benchmark | Ornith 397B | Qwen 3.5 | Opus 4.7 |
|---|---|---|---|
| Terminal-Bench 2.1 | 77.5 | 53.5 | 70.3 |
| SWE-Bench Verified | 82.4 | 76.4 | 80.8 |
| SWE-Bench Pro | 62.2 | 51.6 | 64.3 |
| SWE-Bench Multilingual | 78.9 | 69.3 | — |
Ornith 1.0 Model Family
Choose the right Ornith 1.0 model for your hardware. Every Ornith 1.0 variant uses the same self-scaffolding RL training. See the full Ornith 1.0 model comparison for detailed specs.
9B Dense
9B
~19 GB (bf16), ~6 GB (Q4)
Best for: Edge deployment, single-GPU setups, fast triage
31B Dense
31B
~62 GB (bf16), ~20 GB (Q4)
Best for: Balanced quality and speed
35B MoE (~3B active/token)
35B
~25 GB (Q5_K_M)
Best for: Best value — faster than 9B, more accurate than 31B
Recommended
397B MoE
397B
~400 GB (bf16), ~200 GB (FP8)
Best for: Maximum accuracy, production agent pipelines
Ornith 1.0 FAQ
Frequently Asked Questions
What is Ornith 1.0?
Who made Ornith 1.0?
What base models is Ornith 1.0 built on?
Which Ornith 1.0 model should I choose?
How much VRAM do I need for Ornith 1.0?
Ready to Run Ornith 1.0?
Get started with Ornith 1.0 locally using vLLM, Ollama, or LM Studio — Ornith 1.0 needs no API keys, is completely free, and keeps your code private.