Yesterday in AI: 04 June 2026 — NVIDIA Drops 550B Open Weights at 7× Rivals' Speed

NVIDIA's Nemotron 3 Ultra weights hit HuggingFace at 420 t/s; Claude Code v2.1.163 ships MCP session IDs and enterprise fleet controls; Meta launches enterprise AI agent while Muse Spark API slips two months.

By OMC Editorial on 2026-06-05

TL;DR: NVIDIA released Nemotron 3 Ultra weights on June 4, making the 550B open model the fastest US open-weights frontier LLM at 420 t/s; Claude Code v2.1.163 shipped MCP session IDs, enterprise version locks, and auto mode on Bedrock/Vertex/Foundry; Meta unveiled an enterprise AI agent while Muse Spark's developer API slipped two months past its promised date.

---

1️⃣ NVIDIA Drops 550B Open Weights at 7× the Speed of Comparable Models

- What: NVIDIA released Nemotron 3 Ultra weights on June 4 to HuggingFace, OpenRouter, ModelScope, and NVIDIA NIM. The model is a 550B-parameter (55B active) hybrid Mamba-2/Transformer MoE with a 1M-token context window.
- Why it matters: With an Intelligence Index of 48, it tops every US open-weights model; at a 420 t/s median, it runs more than 7× faster than comparable open models, making frontier-class reasoning viable in production agent pipelines.
- Key number: 420.2 tokens/second median across providers vs. a 57.7 t/s median for comparable open-weight models of similar size (420.2 / 57.7 ≈ 7.3×).

NVIDIA announced the model at its Computex 2026 keynote on June 1, and the weights landed three days later, on June 4. The architecture uses LatentMoE, which compresses tokens into a low-rank latent space before routing them to expert networks, allowing 4× more specialist activations per inference dollar than standard MoE designs. It supports a 1M-token context window and ships under the NVIDIA Open Model License, which permits commercial use.

On Artificial Analysis's Intelligence Index, Nemotron 3 Ultra scores 48, ahead of Gemma 4 31B (39) and Nemotron 3 Super (36), though still behind open models from Chinese labs such as Kimi K2.6 (54). The weights are live at nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 on HuggingFace.

📎 HuggingFace Model Card: https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 · Artificial Analysis: https://artificialanalysis.ai/models/nvidia-nemotron-3-ultra-550b-a55b · NVIDIA Blog: https://blogs.nvidia.com/blog/nvidia-gtc-taipei-computex-2026-news/ · ChatForest: https://ch