Simon Willison

Gemma 4 audio with MLX

AI-Generated Summary

Google's Gemma 4 model is now available for audio processing on macOS through MLX, a machine learning framework optimized for Apple silicon. The 10.28 GB Gemma 4 E2B variant can be run locally using a command-line recipe that leverages MLX and the mlx-vlm library, enabling users to transcribe audio files directly on their machines without relying on cloud services.
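The article's exact recipe isn't reproduced in this summary, but a local transcription run with mlx-vlm generally follows the shape below. This is a sketch, not the author's command: the model identifier is a placeholder, and the specific flags (`--audio`, `--max-tokens`) should be checked against the mlx-vlm documentation for your installed version.

```shell
# Install mlx-vlm, which pulls in MLX itself (Apple silicon required).
pip install mlx-vlm

# Run the generate entry point against a local audio file.
# Substitute the Hugging Face ID of an MLX-converted Gemma E2B
# checkpoint for the placeholder below.
python -m mlx_vlm.generate \
  --model <gemma-E2B-mlx-model-id> \
  --prompt "Transcribe this audio." \
  --audio recording.wav \
  --max-tokens 500
```

The model weights download on first run (roughly 10 GB for the E2B variant cited above), after which everything executes locally with no network calls.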

This development makes advanced audio understanding capabilities more accessible to developers and researchers working on Apple hardware. Because MLX is designed to run efficiently on Apple silicon, users can process audio locally, preserving privacy and avoiding the latency of cloud-based transcription services.

The availability of this tool matters because it democratizes access to state-of-the-art audio processing technology, allowing individuals and smaller organizations to implement sophisticated audio transcription and analysis workflows on consumer-grade hardware. The local-first approach also addresses growing concerns about data privacy while reducing dependency on external API services.

Key Takeaways

  • Google's Gemma 4 model is now available for audio processing on macOS through MLX, a machine learning framework optimized for Apple silicon.
  • The 10.28 GB Gemma 4 E2B variant can be run locally using a command-line recipe that leverages MLX and the mlx-vlm library, enabling users to transcribe audio files directly on their machines without relying on cloud services.
  • This development makes advanced audio understanding capabilities more accessible to developers and researchers working on Apple hardware.
  • Because MLX is designed to run efficiently on Apple silicon, users can process audio locally, preserving privacy and avoiding the latency of cloud-based transcription services.

Read the full article on Simon Willison