NVIDIA: Nemotron 3 Ultra Coding Benchmark
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...
Try NVIDIA: Nemotron 3 Ultra in Kilo Code
Experience this model with the most popular open source coding agent. Free to start, pay only for AI usage. Use in popular IDEs like VS Code, JetBrains, command line, or cloud agents.
Access 500+ models including NVIDIA: Nemotron 3 Ultra and many more in Kilo Code
Coding Performance
Coding benchmarks and performance metrics for development tasks
Kilo Bench
- Completion on Terminal Bench 2.0: 19.1%
- Cost per attempt (USD): $101.82
- Benchmark: Terminal Bench 2.0
Official Kilo eval results. Cost is averaged per complete benchmark attempt.
OpenClaw Benchmarks
PinchBench measures how NVIDIA: Nemotron 3 Ultra performs on real OpenClaw agent tasks: multi-step execution, tool use, recovery, latency, and cost.
- Average score: #5 of 50 official models
- Average time: 5 runs, per OpenClaw task
- Average cost: per benchmark run
Category breakdown
Best verified PinchBench v2 run by OpenClaw task family.
Top task results
Highest-scoring benchmark tasks from the same submission.
Autonomous task execution
NVIDIA: Nemotron 3 Ultra shows strong average success across OpenClaw-style benchmark runs, useful for recurring research, browser, and file-based automations.
Tool use and recovery
PinchBench tasks stress multi-step planning, tool calls, and judge-verified completion rather than single prompt coding snippets.
Agent workflow fit
Its average runtime and per-run cost help set expectations for long-running KiloClaw agents and production workflows.
Agentic benchmarks from the PinchBench Leaderboard
Real-World Usage
Real-world usage statistics from the Kilo Code community
Weekly Token Usage
No ranking data available for this model yet.
Real-world metrics from the Kilo Code Leaderboard
Pricing
Cost per 1 million tokens
Example Cost
Analyzing a 10,000 line codebase (≈40k input tokens, 10k output tokens) costs approximately $0.0420
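The example cost follows directly from the per-token prices listed under Technical Details ($0.50 input, $2.20 output, and $0.10 cache read per 1M tokens). The sketch below reproduces that arithmetic. The function name `estimate_cost` and the `cached_tokens` parameter are illustrative, and treating cached tokens as a subset of input tokens billed at the cache-read rate is an assumption, not something this page confirms.

```python
# Prices in USD per 1 million tokens, copied from the Technical Details list.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 2.20
CACHE_READ_PRICE_PER_M = 0.10


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the part of input_tokens that is billed
    at the cache-read price instead of the regular input price.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PRICE_PER_M
        + cached_tokens * CACHE_READ_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # The page's example: about 40k input tokens and 10k output tokens.
    print(f"${estimate_cost(40_000, 10_000):.4f}")
```

With no cached tokens, 40,000 input tokens cost $0.0200 and 10,000 output tokens cost $0.0220, which adds up to the $0.0420 quoted above.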
Coding Capabilities
Features and parameters relevant to coding tasks
Coding Features
Pricing details from OpenRouter
Technical Details
Architecture and implementation specifications
- Model ID: nvidia/nemotron-3-ultra-550b-a55b
- Created: June 4, 2026
- Tokenizer: Other
- Input Modalities: Text
- Context Window: 262,144 tokens
- Max Completion Tokens: 16,384 tokens
- Input Price: $0.50 per 1M tokens
- Output Price: $2.20 per 1M tokens
- Cache Read Price: $0.10 per 1M tokens
- Content Moderation: Disabled
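The context window and completion limit above bound how large a prompt can be. The following is a minimal sketch of that budget check, assuming the prompt and the completion share the 262,144-token window (a common convention, but not stated on this page). The helper names `max_prompt_tokens` and `fits_in_context` are hypothetical.

```python
# Limits copied from the Technical Details list.
CONTEXT_WINDOW = 262_144
MAX_COMPLETION_TOKENS = 16_384


def max_prompt_tokens(reserved_completion: int = MAX_COMPLETION_TOKENS) -> int:
    """Return the largest prompt size that leaves room for the completion.

    Assumption: prompt tokens and completion tokens are counted against
    the same context window.
    """
    if not 0 <= reserved_completion <= MAX_COMPLETION_TOKENS:
        raise ValueError("reserved_completion exceeds the completion limit")
    return CONTEXT_WINDOW - reserved_completion


def fits_in_context(prompt_tokens: int, reserved_completion: int = MAX_COMPLETION_TOKENS) -> bool:
    """Check whether a prompt of the given size fits under the assumed budget."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_completion)


if __name__ == "__main__":
    print(max_prompt_tokens())       # prompt budget with a full-length completion reserved
    print(fits_in_context(40_000))   # the 40k-token example from the Pricing section
```

Under this assumption, reserving the full 16,384 completion tokens leaves 245,760 tokens for the prompt, so the 40k-token codebase example fits with a wide margin.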
Ready to try NVIDIA: Nemotron 3 Ultra?
Install Kilo Code and start using NVIDIA: Nemotron 3 Ultra for your coding projects today. Choose from 500+ AI models with complete freedom.
Install Kilo Code
Get the extension from VS Code Marketplace, JetBrains Plugin Repository, or the CLI.
Open the model selector
Click the model name in the Kilo Code chat panel to open the selector.
Choose your model
Search or browse to find and select your preferred model.
Start coding
Use Code, Ask, Debug, or Plan mode — the model is ready immediately.