AMD's Mext buy shows how AI could solve the RAM shortage it created
Running low on memory, can't afford more? The House of Zen's latest acquisition puts an AI spin on flash-based memory expansion
Tobias Mann, Systems editor. Published Tue (UTC)
With no end in sight to the memory crunch, AMD thinks that AI, the main cause of the shortage, could be part of the solution.
This week, the House of Zen acquired predictive memory startup Mext for an undisclosed sum, setting the stage for a world where bots decide which data to put into RAM and which to store in less-expensive flash.
Founded in 2023, Mext built a proactive memory platform that uses machine learning algorithms and learned heuristics to offload "cold" memory to flash storage and, based on data access patterns, restore it before it's needed again.
Modern flash arrays are already approaching main memory in terms of aggregate bandwidth, but swapping to disk still imposes a stiff latency penalty. Mext claims it can expand the effective memory of a system by 2 to 4x using flash, which gig for gig is still vastly less expensive than DRAM. This flash memory is simply exposed to the operating system like regular memory.
Memory tiering is nothing new and has seen various reincarnations over the years, with some being software-based and others, like Intel Optane persistent memory, using special 3D XPoint memory tech co-developed with Micron.
Mext stands out for its use of machine learning to migrate data between hot memory and cold storage, almost like a branch predictor, something AMD has an awful lot of experience with.
Mext isn't using one model to decide when to shuffle your data. Instead, it uses a mix of heuristics, long short-term memory (LSTM) networks, and modern transformer architectures, depending on which combination renders the best results.
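To make the idea of predictive tiering concrete, here is a minimal Python sketch. It is not Mext's method: in place of the heuristics, LSTMs, and transformers the company describes, it uses a simple first-order "what page usually comes next" counter on top of least-recently-used (LRU) eviction. The `TieredMemory` class, the `run` helper, and the access trace are all hypothetical and exist only for illustration.

```python
from collections import Counter, OrderedDict, defaultdict


class TieredMemory:
    """Toy two-tier store: a small fast tier (DRAM stand-in) in front of a
    large slow tier (flash stand-in). Hypothetical, for illustration only."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()               # pages ordered coldest -> hottest
        self.successors = defaultdict(Counter)  # page -> counts of the next page seen
        self.last_page = None
        self.hits = 0
        self.misses = 0

    def _promote(self, page):
        # Move a page into the fast tier, demoting the coldest page if it is full.
        if page in self.fast:
            self.fast.move_to_end(page)
            return
        if len(self.fast) >= self.fast_capacity:
            self.fast.popitem(last=False)       # evict the least recently used page
        self.fast[page] = None

    def access(self, page, prefetch=True):
        # A hit means the page was already in the fast tier when requested.
        if page in self.fast:
            self.hits += 1
        else:
            self.misses += 1
        self._promote(page)

        # Learn: record that `page` followed the previously accessed page.
        if self.last_page is not None:
            self.successors[self.last_page][page] += 1
        self.last_page = page

        # Predict: pull in the page that most often followed this one.
        followers = self.successors.get(page)
        if prefetch and followers:
            predicted, _count = followers.most_common(1)[0]
            self._promote(predicted)


def run(trace, fast_capacity, prefetch):
    mem = TieredMemory(fast_capacity)
    for page in trace:
        mem.access(page, prefetch=prefetch)
    return mem


# Assumed workload: four pages touched in a loop, with room for only two in the fast tier.
trace = [0, 1, 2, 3] * 50
baseline = run(trace, fast_capacity=2, prefetch=False)
predictive = run(trace, fast_capacity=2, prefetch=True)
print("LRU only:      ", baseline.hits, "hits,", baseline.misses, "misses")
print("LRU + prefetch:", predictive.hits, "hits,", predictive.misses, "misses")
```

On this perfectly cyclic trace, plain LRU misses on every access, while the predictor stops missing once it has seen one full cycle. Real access patterns are far less regular, so treat this as a picture of the mechanism and not a performance claim.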
“This approach has the potential to reduce infrastructure costs, improve resource utilization, and help customers more effectively scale general-purpose and AI workloads,” Dan McNamara, SVP of AMD’s compute and enterprise AI biz, wrote in a blog post this week.
Beyond enterprise applications, the technology could have implications for AI serving. Modern mixture of experts (MoE) models are, as their name suggests, composed of multiple sub-models. For each token predicted, a different selection of experts may be used. In practice, an LLM may use some experts frequently and others only rarely. We wouldn't be surprised to see AMD use Mext's prediction algorithms to offload infrequently utilized experts from HBM to slower system memory, enabling enterprises to take advantage of larger, more capable models with fewer resources.
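The expert-offloading scenario can be sketched in a few lines of Python. This is speculative illustration only: it ranks experts by how often a recorded routing trace selected them and keeps the busiest ones in HBM, which is a plain frequency count and not the predictive approach Mext describes. The function name `plan_expert_placement`, its parameters, and the sample trace are assumptions made up for this example.

```python
from collections import Counter


def plan_expert_placement(routing_trace, num_experts, hbm_slots):
    """Split experts into a hot set (kept in HBM) and a cold set (offloaded
    to system memory), based on how often each expert was selected.

    routing_trace: iterable of tuples, one per token, listing the expert
                   IDs the router picked for that token (hypothetical data).
    """
    usage = Counter()
    for experts_for_token in routing_trace:
        usage.update(experts_for_token)

    # Rank by descending usage; break ties by expert ID so the result is stable.
    ranked = sorted(range(num_experts), key=lambda e: (-usage[e], e))
    hot = set(ranked[:hbm_slots])
    cold = set(ranked[hbm_slots:])

    # Fraction of expert lookups in the trace that the hot set would have served.
    total = sum(usage.values())
    hbm_hit_rate = sum(usage[e] for e in hot) / total if total else 0.0
    return hot, cold, hbm_hit_rate


# Assumed trace: five tokens, two experts chosen per token, four experts in total.
routing_trace = [(0, 1), (0, 2), (0, 1), (1, 3), (0, 1)]
hot, cold, hbm_hit_rate = plan_expert_placement(routing_trace, num_experts=4, hbm_slots=2)
print("keep in HBM:", sorted(hot))
print("offload:    ", sorted(cold))
print("share of lookups served from HBM:", hbm_hit_rate)
```

A static ranking like this only helps when expert usage is skewed and stable. A predictor of the kind the article describes would presumably try to anticipate which cold experts are about to be needed and fetch them ahead of time.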
Original story by The Register