Rio3.5: The Free 397B AI Model Released by a City Government — Better Than Qwen3.7?
Rio de Janeiro's city government just released Rio3.5 Open 397B — a free, MIT-licensed AI model that beats Qwen3.7 Plus on coding and math benchmarks. Here is what it is, how it compares, and how to try it.

On June 14, 2026 — one day after the US government suspended Anthropic's Claude Fable 5 and Mythos 5 worldwide — Rio de Janeiro released Rio3.5 Open 397B.
A city government. Not a tech giant. Not a well-funded AI lab. The municipal IT department of Brazil's second-largest city.
And by IplanRIO's own numbers, it beats Alibaba's Qwen3.7 Plus on four of the five major benchmarks reported.
Here is everything you need to know about Rio3.5, why it matters, and how to try it.
What Is Rio3.5 Open 397B?
Rio3.5 Open 397B is a large language model developed by IplanRIO (Empresa Municipal de Informática e Planejamento S.A.), the city of Rio de Janeiro's official technology company. It was released June 14, 2026, on Hugging Face under an MIT license, meaning it is free to use, modify, and deploy commercially.
The technical basics:
- 397 billion total parameters, but only ~17 billion are active for any given token, thanks to a Mixture-of-Experts (MoE) architecture. This makes it far cheaper to run per token than a dense 397B model, although all 397B parameters must still be held in memory.
- Base model: Fine-tuned from Alibaba's Qwen3.5-397B-A17B
- Context window: 262,144 tokens natively (~200,000 words), extendable to approximately 1,010,000 tokens
- Multimodal: Supports both text and images
- Languages: English, Portuguese, Chinese, and 30+ others
- License: MIT — fully open for research and commercial use
- Model size: ~807 GB (requires multiple GPUs to run)
The model card is live at: huggingface.co/prefeitura-rio/Rio-3.5-Open-397B
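As a rough check on the figures above, the sketch below estimates the weight footprint from the parameter count. The bytes-per-parameter values (2 for BF16, 1 for 8-bit, 0.5 for 4-bit) and the use of decimal gigabytes are assumptions, not figures from the model card, and the function names are illustrative.

```python
# Back-of-envelope sizing from the published parameter counts.
# Assumptions (not from the model card): bytes per parameter for each
# precision, and decimal gigabytes (1 GB = 1e9 bytes).

TOTAL_PARAMS = 397e9   # total parameters
ACTIVE_PARAMS = 17e9   # approximate parameters active per token (MoE)

BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_size_gb(params: float, precision: str) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def active_fraction(active: float, total: float) -> float:
    """Share of the parameters used for any single token."""
    return active / total


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_size_gb(TOTAL_PARAMS, precision):.0f} GB")
    print(f"active share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
```

At 2 bytes per parameter this gives about 794 GB, in the same range as the ~807 GB listed above. Only about 4% of the parameters are active for any one token, which lowers compute per token but not the memory needed to hold the weights.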
How Does It Perform? The Benchmark Results
Rio3.5 claims competitive performance against Qwen3.7 Plus — the previous state-of-the-art open-weight model at this scale — across five major benchmarks:

| Benchmark | What It Tests | Rio3.5 Score | Qwen3.7 Plus | Winner |
|---|---|---|---|---|
| Terminal-Bench 2.1 | Agentic terminal tasks | 70.8 | 70.3 | Rio3.5 |
| SWE-Bench Multilingual | Multilingual code fixing | 77.0 | 75.8 | Rio3.5 |
| SWE-Bench Pro | Real software engineering | 58.1 | 57.6 | Rio3.5 |
| IMOAnswerBench | Olympiad-level math | 89.5 | 86.0 | Rio3.5 |
| MMLU-Pro | Broad general knowledge | 88.0 | 88.5 | Qwen3.7 Plus |
These are first-party results from IplanRIO's model card and have not yet been independently verified by the broader AI research community. They should be treated as promising but unconfirmed until external benchmarking occurs. That said, the margins are consistent with what you would expect from a focused post-training effort on an already strong base model.
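To see how large the reported gaps actually are, the short sketch below computes the per-benchmark margin from the table. The scores are copied from the table; the function names are illustrative, and nothing here adds new data.

```python
# Scores copied from the table above (first-party, not independently verified).
# Each entry is (Rio3.5 score, Qwen3.7 Plus score).
SCORES = {
    "Terminal-Bench 2.1": (70.8, 70.3),
    "SWE-Bench Multilingual": (77.0, 75.8),
    "SWE-Bench Pro": (58.1, 57.6),
    "IMOAnswerBench": (89.5, 86.0),
    "MMLU-Pro": (88.0, 88.5),
}


def margins(scores):
    """Return {benchmark: Rio3.5 minus Qwen3.7 Plus}, rounded to one decimal."""
    return {name: round(rio - qwen, 1) for name, (rio, qwen) in scores.items()}


def count_wins(scores):
    """Number of benchmarks where Rio3.5 scores strictly higher."""
    return sum(1 for rio, qwen in scores.values() if rio > qwen)


if __name__ == "__main__":
    for name, diff in margins(SCORES).items():
        print(f"{name}: {diff:+.1f}")
    print("Rio3.5 wins:", count_wins(SCORES), "of", len(SCORES))
```

Three of the four wins are by 1.2 points or less, and only IMOAnswerBench shows a gap of several points (3.5). Whether differences of that size are meaningful depends on run-to-run variance, which the source does not report.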
What Is SwiReasoning?
The headline innovation behind Rio3.5's gains is SwiReasoning — a training-free inference framework developed by IplanRIO.
The core idea: most AI models either always "think out loud" (chain-of-thought, visible reasoning steps) or always reason internally. SwiReasoning dynamically switches between the two approaches based on how complex the current problem is.
When a question is straightforward, it skips the visible reasoning and answers immediately. When a problem requires deep analysis, it switches to explicit chain-of-thought. This is controlled using entropy signals — essentially, how "uncertain" the model is about its next token.
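Because the implementation has not been published, the following is only a toy illustration of the general idea of switching on next-token entropy, not the SwiReasoning algorithm itself. The threshold value, the mode names ("explicit" and "direct"), and all function names are hypothetical.

```python
import math

# Toy illustration only: NOT the SwiReasoning implementation, which is unpublished.
# The threshold and the mode names below are hypothetical.

ENTROPY_THRESHOLD = 1.0  # nats; arbitrary value chosen for this example


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means the model is less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def choose_mode(logits, threshold=ENTROPY_THRESHOLD):
    """Pick explicit reasoning when next-token uncertainty exceeds the threshold."""
    return "explicit" if entropy(softmax(logits)) > threshold else "direct"


if __name__ == "__main__":
    confident = [10.0, 0.0, 0.0, 0.0]  # one token dominates: entropy near zero
    uncertain = [1.0, 1.0, 1.0, 1.0]   # uniform over four tokens: entropy = ln(4)
    print(choose_mode(confident))
    print(choose_mode(uncertain))
```

In this toy setup a peaked distribution falls below the threshold and is answered directly, while a uniform distribution over four tokens (entropy ln 4, about 1.39 nats) triggers explicit reasoning. A real system would need to decide when to measure entropy and how to avoid flipping modes too often, which this sketch does not attempt.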
The claimed results:
- 1.8%–3.1% accuracy improvements across benchmarks
- 57%–79% fewer tokens used on simpler tasks (meaning lower compute cost and faster responses)
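To put the claimed token savings in concrete terms, this small sketch applies the two ends of the claimed range to a response length. The 1,000-token baseline is a made-up example value, and the function name is illustrative.

```python
# Illustrative arithmetic for the claimed 57% to 79% token reduction.
# The baseline token count used below is a hypothetical example value.


def tokens_after_reduction(baseline_tokens: int, reduction_percent: int) -> int:
    """Tokens remaining after removing the given percentage (0 to 100)."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return round(baseline_tokens * (100 - reduction_percent) / 100)


if __name__ == "__main__":
    baseline = 1000  # hypothetical response length in tokens
    for percent in (57, 79):
        remaining = tokens_after_reduction(baseline, percent)
        print(f"{percent}% fewer tokens: {baseline} -> {remaining}")
```

Under these numbers, a 1,000-token response would shrink to 430 tokens at the low end of the claim and 210 tokens at the high end. Since generation cost scales roughly with output length, that is where the lower compute cost and faster responses would come from, if the claim holds.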
The honest caveat: IplanRIO has not yet released the SwiReasoning implementation code. This makes independent testing impossible right now. If the gains hold up under external evaluation, SwiReasoning will be a genuinely significant contribution to open-source AI inference. If they do not, Rio3.5 is still a competitive open-weight model at a size category that was previously only accessible through Alibaba's API.
Why Is a City Government Releasing a Frontier AI Model?
This is the question everyone is asking — and the context is important.
Rio de Janeiro has been running a "Rio AI City" initiative for the past two years, aiming to position the city as a technology hub in Latin America. IplanRIO, the city's tech arm, has been building AI infrastructure for municipal services: document processing, multilingual citizen support, public records, urban planning.
Releasing an open-weight model serves several purposes:
- AI sovereignty. The US government's suspension of Claude Fable 5 just proved that governments dependent on foreign AI APIs can be cut off without notice. Rio3.5 gives Brazil, and any other country, a sovereign alternative.
- Cost. Training a frontier model from scratch costs hundreds of millions of dollars. Post-training an existing open-weight model (Qwen3.5 in this case) to achieve competitive results costs far less. Rio proved that municipal-level compute budgets can produce frontier-class results.
- Language and culture. Portuguese-language AI performance is historically underserved. Rio3.5 is likely fine-tuned with strong Portuguese data, making it immediately valuable for Brazilian government and business use cases.
The timing of the release — one day after the US pulled Fable 5 globally — almost certainly accelerated the announcement, even if the model was ready earlier.
How to Try Rio3.5
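Running the full 807 GB model locally is not practical for most people. Note that 8 × A100 80GB GPUs provide 640 GB of combined memory, less than the 807 GB of weights, so a node of that size can only hold a quantized copy; serving the full-precision weights takes more GPU memory than that. Hardware of this class costs tens of thousands of dollars or more to own, or ~$40–$80/hour to rent from cloud GPU providers.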
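A quick way to sanity-check hardware sizing is to compare the weight footprint with combined GPU memory. The sketch below does that and also totals rental cost; the 20% headroom for KV cache and activations is an assumed rule of thumb, not a measured figure, and the function names are illustrative.

```python
import math


def fits(weights_gb: float, gpu_count: int, gpu_memory_gb: float) -> bool:
    """True if the weights alone fit in the combined GPU memory."""
    return weights_gb <= gpu_count * gpu_memory_gb


def min_gpus(weights_gb: float, gpu_memory_gb: float, headroom: float = 0.2) -> int:
    """Smallest GPU count whose combined memory covers weights plus headroom.

    headroom is an assumed fraction reserved for KV cache and activations.
    """
    return math.ceil(weights_gb * (1 + headroom) / gpu_memory_gb)


def rental_cost(hourly_rate_usd: float, hours: float) -> float:
    """Total rental cost for a given number of hours."""
    return hourly_rate_usd * hours


if __name__ == "__main__":
    print("807 GB on 8 x 80 GB:", fits(807, 8, 80))
    print("80 GB GPUs needed with 20% headroom:", min_gpus(807, 80))
    print("One day at $40-$80/hour:", rental_cost(40, 24), "to", rental_cost(80, 24))
```

By this estimate, 807 GB of weights do not fit in 8 × 80 GB (640 GB), and a full day of rental at the quoted rates comes to roughly $960 to $1,920. Quantized checkpoints or GPUs with more memory per device change the count accordingly.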
Option 1: Cloud API Access (Recommended for Most Users)
The fastest way to try Rio3.5 is through inference platforms that are likely to add it in coming days:
- Together.ai: paste prefeitura-rio/Rio-3.5-Open-397B as the model ID once it is listed
- Replicate: watch for a community deployment
- Hugging Face Inference Endpoints: create a dedicated endpoint using the model ID
For businesses that want to deploy a custom knowledge base or assistant on top of Rio3.5 without managing any infrastructure, tools like CustomGPT (affiliate link) let you connect your own data to powerful AI models through a no-code interface.
Option 2: Python + Transformers (If You Have GPU Access)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prefeitura-rio/Rio-3.5-Open-397B"

# Load the tokenizer, then shard the weights across all visible GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Format the conversation with the model's chat template.
messages = [{"role": "user", "content": "What can you do?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Enable sampling explicitly so temperature and top_p are applied.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95)

# Decode only the newly generated tokens, not the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
Option 3: vLLM Server (Production)
vllm serve prefeitura-rio/Rio-3.5-Open-397B \
  --tensor-parallel-size 8 \
  --max-model-len 262144 \
  --trust-remote-code
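Once the server is running, vLLM exposes an OpenAI-compatible HTTP API. The sketch below builds and sends a chat request using only the Python standard library. The base URL (localhost on port 8000, vLLM's default) and the sampling values are assumptions to adjust for your deployment, and the helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed: vLLM's default host and port
MODEL_ID = "prefeitura-rio/Rio-3.5-Open-397B"


def build_chat_request(prompt, max_tokens=512, temperature=0.6, top_p=0.95):
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }


def send_chat_request(payload, base_url=BASE_URL, timeout=120):
    """POST the payload to the server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server reachable at the assumed address, `send_chat_request(build_chat_request("What can you do?"))` returns the model's reply as a string. Keeping payload construction separate from the network call makes the request easy to inspect or log before sending.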
Not sure what VRAM you need for this? Check our guide to checking VRAM for AI models.
How Rio3.5 Compares to Other Open Models
Rio3.5 sits in a competitive landscape of large open-weight models released in 2026:
| Model | Released by | Parameters | License | Notable strength |
|---|---|---|---|---|
| Rio3.5 Open 397B | IplanRIO / Rio Gov | 397B (17B active) | MIT | Coding + math, multilingual, SwiReasoning |
| Qwen3.5-397B | Alibaba | 397B (17B active) | Apache 2.0 | Strong all-round base |
| DeepSeek-V4-Pro | DeepSeek | 1.6T total | Custom | Massive scale, strong reasoning |
| Llama 4 | Meta | Various | Custom | Large community, ecosystem support |
| Mistral Large 3 | Mistral | Unknown | MRL | European AI, privacy-focused |
For coding agents and multilingual tasks, Rio3.5 is now the leading MIT-licensed option at its parameter count — assuming the benchmarks hold.
What This Means for Open-Source AI
Rio3.5 is the clearest signal yet that frontier AI development is no longer the exclusive domain of Silicon Valley and a handful of Chinese companies.
A municipal government just published a model that, by its own reported numbers, outperforms Alibaba's state-of-the-art open-weight model on four of five benchmarks, using an existing open-source base, a focused post-training effort, and a training-free inference innovation. The total compute cost was almost certainly a fraction of what it would cost to train a comparable model from scratch.
That pattern is going to repeat. Smaller teams, universities, national governments, and regional labs are going to keep pushing the frontier of open-weight AI. Rio proved it works.
The political backdrop — released one day after the US pulled Fable 5 globally — makes the timing hard to ignore. Whether or not that was intentional, the message landed: AI sovereignty is achievable, and it does not require a trillion-dollar company to get there.
Frequently Asked Questions
What is Rio3.5? Rio3.5 Open 397B is a 397-billion-parameter AI model released by IplanRIO, the municipal IT company of Rio de Janeiro. It is fine-tuned from Alibaba's Qwen3.5-397B-A17B base model, released under an MIT license, and claims to beat Qwen3.7 Plus on four of five major benchmarks.
Is Rio3.5 free to use? Yes. Rio3.5 is released under an MIT license, which means it is free for personal, research, and commercial use. You can download it from Hugging Face at no cost.
Can I run Rio3.5 on my laptop or home PC? No. At 807 GB, Rio3.5 is far too large for consumer hardware; the weights alone exceed the 640 GB of combined memory on 8 × A100 80GB GPUs unless the model is quantized. It is designed for server or cloud GPU deployment. To try it, use cloud inference platforms like Together.ai or Replicate once they list the model.
Who made Rio3.5? IplanRIO (Empresa Municipal de Informática e Planejamento S.A.), the official technology and planning company of Rio de Janeiro's city government.
How does Rio3.5 compare to GPT-5.5 or Claude Opus 4.8? Rio3.5's benchmarks show it is competitive with GPT-4-class performance on coding and math tasks. It lags behind the current frontier (GPT-5.5, Claude Fable 5) in general reasoning and instruction-following, but it is fully open-source and free — while the frontier models require paid API access.
What is SwiReasoning? SwiReasoning is a training-free inference technique developed by IplanRIO that dynamically switches between explicit chain-of-thought reasoning and faster latent-space reasoning based on problem complexity. The claimed result is 57%–79% fewer tokens used on simple tasks, improving efficiency without sacrificing accuracy. The implementation code has not yet been released publicly.
Is Rio3.5 safe to use for business? For businesses, Rio3.5's MIT license is highly permissive: it can be used in commercial products as long as the copyright and license notice are retained. However, as a newly released model without independent safety evaluations, you should test it carefully for your use case before deploying in customer-facing applications.
Where can I download Rio3.5? The model is available on Hugging Face at huggingface.co/prefeitura-rio/Rio-3.5-Open-397B.

Alex the Engineer
Founder & AI Architect. Senior software engineer turned AI agency owner. I build massive, scalable AI workflows and share the exact blueprints, financial models, and code I use to generate automated revenue in 2026.