Rio3.5: The Free 397B AI Model Released by a City Government — Better Than Qwen3.7?

On June 14, 2026 — one day after the US government suspended Anthropic's Claude Fable 5 and Mythos 5 worldwide — Rio de Janeiro released Rio3.5 Open 397B.

A city government. Not a tech giant. Not a well-funded AI lab. The municipal IT department of Brazil's second-largest city.

And, by its own reported numbers, it beats Alibaba's Qwen3.7 Plus on four of the five major benchmarks IplanRIO tested.

Here is everything you need to know about Rio3.5, why it matters, and how to try it.

What Is Rio3.5 Open 397B?

Rio3.5 Open 397B is a large language model developed by IplanRIO — Empresa Municipal de Informatica e Planejamento S.A., the city of Rio de Janeiro's official technology company. It was released June 14, 2026, on Hugging Face under an MIT license, meaning it is completely free to use, modify, and deploy commercially.

The technical basics:

397 billion total parameters, but only ~17 billion are active for any given token, thanks to a Mixture-of-Experts (MoE) architecture. Per-token compute is therefore closer to that of a 17B dense model, although all 397B parameters must still be loaded into memory.
Base model: Fine-tuned from Alibaba's Qwen3.5-397B-A17B
Context window: 262,144 tokens natively (~200,000 words), extendable to approximately 1,010,000 tokens
Multimodal: Supports both text and images
Languages: English, Portuguese, Chinese, and 30+ others
License: MIT — fully open for research and commercial use
Model size: ~807 GB (requires multiple GPUs to run)
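
As a quick sanity check on the figures above, the arithmetic can be sketched in a few lines of Python. The numbers are the ones listed in the spec sheet; the variable names are only illustrative.

```python
# Figures taken from the spec list above; names are illustrative.
TOTAL_PARAMS = 397e9        # total parameters
ACTIVE_PARAMS = 17e9        # approximate parameters active per token
NATIVE_CONTEXT = 262_144    # native context window, in tokens
APPROX_WORDS = 200_000      # the article's rough word equivalent

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
words_per_token = APPROX_WORDS / NATIVE_CONTEXT

print(f"Active share of parameters: {active_fraction:.1%}")
print(f"Implied words per token: {words_per_token:.2f}")
print(f"Native context is 2**18 tokens: {NATIVE_CONTEXT == 2**18}")
```

Roughly 4.3% of the weights take part in each token, and the word estimate implies about 0.76 words per token.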

The model card is live at: huggingface.co/prefeitura-rio/Rio-3.5-Open-397B

How Does It Perform? The Benchmark Results

Rio3.5 claims competitive performance against Qwen3.7 Plus — the previous state-of-the-art open-weight model at this scale — across five major benchmarks:

[Figure: Rio3.5 vs Qwen3.7 Plus benchmark comparison chart]

| Benchmark | What It Tests | Rio3.5 Score | Qwen3.7 Plus | Winner |
|---|---|---|---|---|
| Terminal-Bench 2.1 | Agentic terminal tasks | 70.8 | 70.3 | Rio3.5 |
| SWE-Bench Multilingual | Multilingual code fixing | 77.0 | 75.8 | Rio3.5 |
| SWE-Bench Pro | Real software engineering | 58.1 | 57.6 | Rio3.5 |
| IMOAnswerBench | Olympiad-level math | 89.5 | 86.0 | Rio3.5 |
| MMLU-Pro | Broad general knowledge | 88.0 | 88.5 | Qwen3.7 |

These are first-party results from IplanRIO's model card and have not yet been independently verified by the broader AI research community. They should be treated as promising but unconfirmed until external benchmarking occurs. That said, the margins are consistent with what you would expect from a focused post-training effort on an already strong base model.
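
To see how large the margins actually are, here is a small sketch that computes the point difference for each benchmark. The scores are the first-party numbers quoted in this article; the dictionary layout is just one convenient way to hold them.

```python
# Scores as reported in the table above (first-party, unverified).
scores = {
    "Terminal-Bench 2.1": (70.8, 70.3),
    "SWE-Bench Multilingual": (77.0, 75.8),
    "SWE-Bench Pro": (58.1, 57.6),
    "IMOAnswerBench": (89.5, 86.0),
    "MMLU-Pro": (88.0, 88.5),
}

# Margin = Rio3.5 score minus Qwen3.7 Plus score, in points.
margins = {name: round(rio - qwen, 1) for name, (rio, qwen) in scores.items()}
rio_wins = sum(1 for m in margins.values() if m > 0)

for name, margin in margins.items():
    print(f"{name:24s} {margin:+.1f}")
print(f"Rio3.5 leads on {rio_wins} of {len(scores)} benchmarks")
```

Three of the four leads are 1.2 points or less, and only IMOAnswerBench shows a gap above 3 points. Margins that small can plausibly fall within run-to-run variation, which is one more reason to wait for independent results.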

What Is SwiReasoning?

The headline innovation behind Rio3.5's gains is SwiReasoning — a training-free inference framework developed by IplanRIO.

The core idea: most AI models either always "think out loud" (chain-of-thought, visible reasoning steps) or always reason internally. SwiReasoning dynamically switches between the two approaches based on how complex the current problem is.

When a question is straightforward, it skips the visible reasoning and answers immediately. When a problem requires deep analysis, it switches to explicit chain-of-thought. The switch is driven by entropy signals: the entropy of the model's next-token probability distribution, which measures how "uncertain" the model is about what comes next.
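
To make the entropy idea concrete, here is a minimal sketch of an entropy-gated switch. This is not IplanRIO's implementation, which has not been published. The function names, the two mode labels, and the threshold of 1.0 are all assumptions chosen for illustration, and a real system would likely track entropy over many tokens rather than a single step.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means more uncertainty."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def choose_mode(logits, threshold=1.0):
    """Hypothetical rule: reason explicitly only when entropy is high."""
    h = entropy(softmax(logits))
    return "explicit" if h > threshold else "direct"

confident = [10.0, 0.0, 0.0, 0.0]   # one token dominates
uncertain = [1.0, 1.0, 1.0, 1.0]    # all tokens equally likely

print(choose_mode(confident))
print(choose_mode(uncertain))
```

With these toy inputs the confident distribution has near-zero entropy and takes the direct path, while the uniform one has entropy ln(4) ≈ 1.39 and triggers explicit reasoning.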

The claimed results:

1.8%–3.1% accuracy improvements across benchmarks
57%–79% fewer tokens used on simpler tasks (meaning lower compute cost and faster responses)

The honest caveat: IplanRIO has not yet released the SwiReasoning implementation code. This makes independent testing impossible right now. If the gains hold up under external evaluation, SwiReasoning will be a genuinely significant contribution to open-source AI inference. If they do not, Rio3.5 is still a competitive open-weight model at a size category that was previously only accessible through Alibaba's API.

Why Is a City Government Releasing a Frontier AI Model?

This is the question everyone is asking — and the context is important.

Rio de Janeiro has been running a "Rio AI City" initiative for the past two years, aiming to position the city as a technology hub in Latin America. IplanRIO, the city's tech arm, has been building AI infrastructure for municipal services: document processing, multilingual citizen support, public records, urban planning.

Releasing an open-weight model serves several purposes:

AI sovereignty. The US government's suspension of Claude Fable 5 just proved that governments dependent on foreign AI APIs can be cut off without notice. Rio3.5 gives Brazil — and any other country — a sovereign alternative.
Cost. Training a frontier model from scratch costs hundreds of millions of dollars. Post-training an existing open-weight model (Qwen3.5 in this case) to achieve competitive results costs far less. Rio proved that municipal-level compute budgets can produce frontier-class results.
Language and culture. Portuguese has historically been underserved by AI models. Rio3.5 is likely fine-tuned with strong Portuguese data, making it immediately valuable for Brazilian government and business use cases.

The timing of the release — one day after the US pulled Fable 5 globally — almost certainly accelerated the announcement, even if the model was ready earlier.

How to Try Rio3.5

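Running the full 807 GB model locally is not practical for most people. You need a multi-GPU server, and note that 8 × A100 80GB GPUs provide only 640 GB of combined memory, less than the 807 GB of weights, so a node of that size can hold only a quantized copy; the weights as published need more or larger-memory GPUs. Hardware of this class costs tens of thousands of dollars to own or ~$40–$80/hour to rent from cloud GPU providers.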
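
A rough capacity check helps when sizing hardware for a model this large. The sketch below estimates the weight footprint from the parameter count and compares it with GPU memory. The bytes-per-parameter values (2 for 16-bit, 1 for 8-bit) are standard, but the 20% headroom for KV cache and activations is an assumption, and an 8-bit copy of Rio3.5 is hypothetical here.

```python
import math

def weights_gb(params, bytes_per_param):
    """Size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9

def min_gpus(total_gb, gpu_gb, headroom=1.2):
    """Smallest GPU count whose combined memory covers weights plus headroom.

    headroom is an assumed multiplier for KV cache and activations.
    """
    return math.ceil(total_gb * headroom / gpu_gb)

PARAMS = 397e9
bf16_gb = weights_gb(PARAMS, 2)   # 16-bit weights
fp8_gb = weights_gb(PARAMS, 1)    # 8-bit weights (assumes a quantized copy exists)

print(f"16-bit weights: {bf16_gb:.0f} GB, 8-bit weights: {fp8_gb:.0f} GB")
print(f"8 x 80 GB GPUs provide {8 * 80} GB")
print(f"80 GB GPUs needed at 16-bit: {min_gpus(bf16_gb, 80)}")
print(f"80 GB GPUs needed at 8-bit:  {min_gpus(fp8_gb, 80)}")
```

Under these assumptions the 16-bit weights (about 794 GB, close to the ~807 GB download) do not fit in 640 GB, while an 8-bit copy would. Tensor-parallel setups usually also need a GPU count that divides the model's attention heads evenly, so in practice the count gets rounded up to a convenient number such as 8 or 16.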

Option 1: Cloud API Access (Recommended for Most Users)

The fastest way to try Rio3.5 is through inference platforms that are likely to add it in coming days:

Together.ai — paste prefeitura-rio/Rio-3.5-Open-397B as the model ID once it is listed
Replicate — watch for a community deployment
Hugging Face Inference Endpoints — create a dedicated endpoint using the model ID

For businesses that want to deploy a custom knowledge base or assistant on top of Rio3.5 without managing any infrastructure, tools like CustomGPT (affiliate link) let you connect your own data to powerful AI models through a no-code interface.

Option 2: Python + Transformers (If You Have GPU Access)

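# Requires the transformers, torch, and accelerate packages
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prefeitura-rio/Rio-3.5-Open-397B"

# Load the tokenizer, then shard the weights across all visible GPUs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto"     # let accelerate place layers on available devices
)

# Format the conversation with the model's chat template
messages = [{"role": "user", "content": "What can you do?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# do_sample=True makes sure temperature and top_p are actually applied
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95)

# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)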

Option 3: vLLM Server (Production)

vllm serve prefeitura-rio/Rio-3.5-Open-397B \
    --tensor-parallel-size 8 \
    --max-model-len 262144 \
    --trust-remote-code
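
Once the server is up, vLLM exposes an OpenAI-compatible HTTP API, so any HTTP client can talk to it. The sketch below uses only the Python standard library. The address assumes the server runs on the same machine with vLLM's default port of 8000, and the helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Assumed local address; vLLM listens on port 8000 unless told otherwise.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "prefeitura-rio/Rio-3.5-Open-397B"

def build_request(prompt, max_tokens=512):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.6,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt):
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Building the request does not touch the network; ask() does.
request = build_request("What can you do?")
print(request.full_url)
```

With the server running, `ask("What can you do?")` returns the model's reply as a string. If the server is on another host or port, change `BASE_URL` to match.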

Not sure what VRAM you need for this? Check our guide to checking VRAM for AI models.

How Rio3.5 Compares to Other Open Models

Rio3.5 sits in a competitive landscape of large open-weight models released in 2026:

| Model | Released by | Parameters | License | Notable strength |
|---|---|---|---|---|
| Rio3.5 Open 397B | IplanRIO / Rio Gov | 397B (17B active) | MIT | Coding + math, multilingual, SwiReasoning |
| Qwen3.5-397B | Alibaba | 397B (17B active) | Apache 2.0 | Strong all-round base |
| DeepSeek-V4-Pro | DeepSeek | 1.6T total | Custom | Massive scale, strong reasoning |
| Llama 4 | Meta | Various | Custom | Large community, ecosystem support |
| Mistral Large 3 | Mistral | Unknown | MRL | European AI, privacy-focused |

For coding agents and multilingual tasks, Rio3.5 is now the leading MIT-licensed option at its parameter count — assuming the benchmarks hold.

What This Means for Open-Source AI

Rio3.5 is the clearest signal yet that frontier AI development is no longer the exclusive domain of Silicon Valley and a handful of Chinese companies.

A municipal government just published a model that, by its own numbers, outperforms Alibaba's state-of-the-art open-weight model on four of five reported benchmarks, using an existing open-source base, a focused post-training effort, and a training-free inference innovation. The total compute cost was almost certainly a fraction of what it would cost to train a comparable model from scratch.

That pattern is going to repeat. Smaller teams, universities, national governments, and regional labs are going to keep pushing the frontier of open-weight AI. Rio proved it works.

The political backdrop — released one day after the US pulled Fable 5 globally — makes the timing hard to ignore. Whether or not that was intentional, the message landed: AI sovereignty is achievable, and it does not require a trillion-dollar company to get there.

Frequently Asked Questions

What is Rio3.5? Rio3.5 Open 397B is a 397-billion-parameter AI model released by IplanRIO, the municipal IT company of Rio de Janeiro. It is fine-tuned from Alibaba's Qwen3.5-397B-A17B base model, released under an MIT license, and claims to beat Qwen3.7 Plus on four of five major benchmarks.

Is Rio3.5 free to use? Yes. Rio3.5 is released under an MIT license, which means it is free for personal, research, and commercial use. You can download it from Hugging Face at no cost.

Can I run Rio3.5 on my laptop or home PC? No. At 807 GB, Rio3.5 needs a multi-GPU server; even 8 × A100 80GB GPUs (640 GB combined) can hold only a quantized copy of the weights. It is designed for server or cloud GPU deployment. To try it, use cloud inference platforms like Together.ai or Replicate once they list the model.

Who made Rio3.5? IplanRIO — Empresa Municipal de Informatica e Planejamento S.A., the official technology and planning company of Rio de Janeiro's city government.

How does Rio3.5 compare to GPT-5.5 or Claude Opus 4.8? Rio3.5's benchmarks show it is competitive with GPT-4-class performance on coding and math tasks. It lags behind the current frontier (GPT-5.5, Claude Fable 5) in general reasoning and instruction-following, but it is fully open-source and free — while the frontier models require paid API access.

What is SwiReasoning? SwiReasoning is a training-free inference technique developed by IplanRIO that dynamically switches between explicit chain-of-thought reasoning and faster latent-space reasoning based on problem complexity. The claimed result is 57%–79% fewer tokens used on simple tasks, improving efficiency without sacrificing accuracy. The implementation code has not yet been released publicly.

Is Rio3.5 safe to use for business? Rio3.5's MIT license is highly permissive: the model can be used in commercial products as long as the copyright and license notice are preserved. However, as a newly released model without independent safety evaluations, it should be tested carefully for your use case before you deploy it in customer-facing applications.

Where can I download Rio3.5? The model is available on Hugging Face at huggingface.co/prefeitura-rio/Rio-3.5-Open-397B.