Up to 3x the speed with no loss of quality—is it too good to be true?
Ars Technica
May 06, 2026 • 15:44
Google’s Gemma 4 open AI models use “speculative decoding” to get up to 3x faster
By Ryan Whitwam