Podcast Episode

Tiny models punching up, 12-million-token context, and proteins that finally move

May 16, 2026

Today's NewsPodLM digs into Zyphra's surprisingly small ZAYA1-8B reasoning MoE trained on AMD, the launch of SubQ with a 12-million-token context window, a Swiss neural network that finally captures protein motion, an AI model that diagnoses patients better than doctors, humanoid robots reporting to work at Haneda, and a neuro-symbolic architecture that cuts AI training energy by 100x.

Today's episode is heavy on architecture and efficiency, light on the usual scale-everything mantra. Here are the stories we covered.

A tiny reasoning model from Zyphra

Zyphra has released ZAYA1-8B, an 8B-parameter mixture-of-experts reasoning model with only ~760M active parameters per token. Trained entirely on AMD Instinct MI300X hardware, it matches or beats much larger open-weights models on reasoning, math, and code, even edging out Claude Sonnet 4.5 on the HMMT math benchmark. Weights are open under Apache 2.0.
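For listeners who want a feel for why only a fraction of a mixture-of-experts model is active per token, here is a minimal sketch of generic top-k expert routing. It is not ZAYA1-8B's actual router: the expert count, the value of k, and the router scores are all made up for illustration.

```python
import math


def top_k_route(router_logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalise their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(router_logits[i] for i in chosen)
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Hypothetical router scores for 8 experts; only 2 run for this token.
weights = top_k_route([0.1, 2.0, -1.0, 1.5, 0.3, -0.2, 0.0, 0.9], k=2)
print(weights)

# Figures from the episode: ~760M active out of 8B total parameters.
active_fraction = 760e6 / 8e9
print(f"active share per token: {active_fraction:.1%}")
```

Experts that are not selected do no work for that token, which is how a model can store 8B parameters while touching only a small share of them on each step.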

SubQ: a 12-million-token context window

Miami startup Subquadratic launched SubQ, the first commercial subquadratic LLM. It scales linearly with input length using a new attention variant, runs ~52x faster than FlashAttention at 1M tokens, and offers a native 12M-token context. Two products shipped: an API and a whole-repo coding agent.
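To put the linear-scaling claim in perspective, the sketch below compares how cost grows with context length under quadratic and linear scaling. It is back-of-the-envelope arithmetic only: it ignores constant factors, assumes a 1M-token baseline, and says nothing about how SubQ's attention variant is actually implemented.

```python
def relative_cost(n_tokens: int, base_tokens: int, exponent: int) -> float:
    """Cost at n_tokens relative to base_tokens when cost grows as n**exponent."""
    return (n_tokens / base_tokens) ** exponent


BASE = 1_000_000  # assumed baseline context length for the comparison

for n in (1_000_000, 4_000_000, 12_000_000):
    quadratic = relative_cost(n, BASE, 2)  # standard attention: cost ~ n^2
    linear = relative_cost(n, BASE, 1)     # linear-scaling attention: cost ~ n
    print(f"{n:>12,} tokens: quadratic x{quadratic:g}, linear x{linear:g}")
```

Going from 1M to 12M tokens multiplies a quadratic cost by 144 but a linear cost by only 12, which is why very long contexts are hard to serve with standard attention.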

Proteins finally get to move

Researchers at EPFL published a neural network that generates all-atom dynamic models of proteins rather than static snapshots, capturing the side-chain rearrangements AlphaFold tends to miss. The architecture blends diffusion with physics-aware constraints, and the work could have major implications for drug discovery.

An AI reasoning model beats doctors at diagnosis

A Harvard / Beth Israel Deaconess study found that an OpenAI reasoning model outperformed two experienced physicians at diagnosing patients from electronic health records when both had access to the same information. A companion system, DeepRare, hit 64.4% first-attempt accuracy on rare diseases versus 54.6% for specialist physicians.

Humanoid robots report for duty at Haneda

Japan Airlines and GMO have begun a three-year humanoid robot pilot at Tokyo's Haneda Airport, using Unitree-based platforms (~$15,400/unit) for baggage and container handling. It is the first official deployment of its kind in Japan. AGIBOT also open-sourced its AGIBOT World 2026 robotics dataset to accelerate embodied AI training.

Neuro-symbolic AI cuts energy 100x

Researchers unveiled a neuro-symbolic architecture that hit 95% on the Tower of Hanoi planning task (versus 34% for standard neural systems) while using only 1% of the training energy (a 100x reduction) and 5% of the inference energy (a 20x reduction).

Local 4K video generation arrives

Lightricks released LTX-2, an open video model that generates up to 20 seconds of 4K video on consumer GPUs, with built-in audio and multi-keyframe control, and slots directly into ComfyUI.

Microsoft's multi-agent bug hunter

Microsoft unveiled MDASH, a multi-model agentic security system using 100+ specialised agents that just topped a leading industry benchmark for vulnerability discovery and remediation.

Google's Gemini Omni surfaces

Early demos of Google's upcoming Gemini Omni suggest a unified video/audio/text architecture that processes all three modalities without intermediate transcription, likely to be revealed in full at I/O.

Published May 16, 2026 at 1:35pm