DeepSeek, in collaboration with academic researchers, has unveiled Engram, a novel conditional memory system designed for large language models.
Engram decouples memory storage from computation, allowing models to store and retrieve knowledge efficiently without overloading GPU memory. This approach significantly reduces reliance on expensive high-bandwidth memory (HBM) while improving reasoning depth and inference efficiency.
By caching knowledge instead of full context, Engram addresses one of the biggest bottlenecks in scaling AI models. The innovation is expected to influence the architecture of next-generation models, including DeepSeek V4, and could reshape how large models balance performance, cost, and scalability.
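What such a decoupled memory might look like in practice is easiest to see in code. The sketch below is a minimal illustration, not DeepSeek's published design: it assumes a hashed n-gram key-value table (all class and method names here are hypothetical) held in ordinary host RAM, so only the handful of rows a batch actually touches is ever copied into the accelerator's high-bandwidth memory.

```python
# Illustrative sketch only: the names and the hashing scheme are
# assumptions, not Engram's actual implementation.
import torch

class ConditionalMemory:
    """A hashed n-gram -> embedding table kept in host RAM.

    Only the rows a given batch needs are copied to the accelerator,
    so the (potentially huge) table never occupies GPU memory."""

    def __init__(self, num_slots: int, dim: int, device: str | None = None):
        self.table = torch.randn(num_slots, dim)  # stays on CPU; learned in practice
        self.num_slots = num_slots
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")

    def _slots(self, token_ids: torch.Tensor, n: int = 2) -> torch.Tensor:
        # Hash each consecutive n-gram of token ids to a table slot.
        windows = token_ids.unfold(-1, n, 1)                 # (batch, seq-n+1, n)
        weights = torch.tensor([31 ** i for i in range(n)])  # toy polynomial hash
        return (windows * weights).sum(-1) % self.num_slots

    def lookup(self, token_ids: torch.Tensor) -> torch.Tensor:
        slots = self._slots(token_ids)
        unique, inverse = torch.unique(slots, return_inverse=True)
        # Only the unique rows cross to the device, never the whole table:
        gathered = self.table[unique].to(self.device)
        return gathered[inverse]                             # (batch, seq-n+1, dim)

mem = ConditionalMemory(num_slots=100_000, dim=256)
ids = torch.randint(0, 32_000, (4, 128))  # a batch of token id sequences
vecs = mem.lookup(ids)                    # (4, 127, 256), fetched on demand
```

The point of the sketch is the division of labor: the lookup table scales with how much knowledge is stored, while GPU memory usage scales only with the batch being processed, which is the decoupling the announcement describes.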
