
Ollama Integrates Apple's MLX Framework, Boosts Performance on Apple Silicon Macs
Key Takeaways
- Ollama integrates Apple's MLX framework on Apple Silicon to accelerate local AI models.
- MLX enables unified memory usage, boosting LLM processing speed on Macs.
- NVFP4 format support for model compression improves memory efficiency.
MLX Integration
Ollama integrated Apple’s MLX framework to boost performance on Apple Silicon Macs.
“Ollama, one of the best tools for running AI models locally on Mac, has just reached a milestone”
The update allows Ollama to take advantage of unified memory and GPU Neural Accelerators.

The update supports running the 35-billion-parameter variant of Alibaba’s Qwen3.5 model.
Caching and Compression
Ollama 0.19 improves caching by reusing cached context across conversations.
The update adds support for Nvidia’s NVFP4 compression format.

These enhancements aim to make local models more practical.
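As a rough illustration of why a 4-bit format like NVFP4 matters, one can compare the weight-storage footprint of the 35-billion-parameter model at FP16 versus a compressed rate. The ~4.5 bits per weight used here is an assumption for illustration only (NVFP4 stores 4-bit weights plus small per-block scale factors), not a figure from Ollama:

```python
def model_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight-storage footprint in GiB."""
    return params * bits_per_weight / 8 / 1024**3

PARAMS = 35e9  # the Qwen 35B variant mentioned above

fp16 = model_memory_gb(PARAMS, 16)
# ~4.5 effective bits/weight is an assumption: 4-bit weights
# plus per-block scale-factor overhead
nvfp4 = model_memory_gb(PARAMS, 4.5)
print(f"FP16: {fp16:.1f} GiB, NVFP4 (approx.): {nvfp4:.1f} GiB")
```

Back-of-the-envelope numbers like these show why compression is what makes a 35B model plausible on consumer Macs at all.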
Hardware Requirements
Running the preview requires at least 32GB of unified memory.
“Ollama, a runtime system for operating large language models on a local computer, has introduced support for Apple’s open source MLX framework for machine learning”
This memory requirement could limit accessibility for many users.
Ollama is working to support additional models in future updates.
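A quick way to check whether a machine clears the 32GB bar is to query total physical memory via POSIX `sysconf`. This is a generic sketch, not an Ollama feature:

```python
import os

def total_memory_gb() -> float:
    """Return total physical memory in GiB using POSIX sysconf."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    num_pages = os.sysconf("SC_PHYS_PAGES")  # pages of physical memory
    return page_size * num_pages / 1024**3

MIN_GB = 32  # stated minimum for the MLX preview

if total_memory_gb() >= MIN_GB:
    print("Meets the 32GB unified-memory requirement")
else:
    print(f"Only {total_memory_gb():.1f} GiB; below the {MIN_GB} GiB minimum")
```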
Implications
Ollama exemplifies the growing viability of local AI model execution.
The update positions Apple Silicon Macs as capable platforms for AI development.