WEKA Maximizes Token Output With Lower Cost Per Token on NVIDIA BlueField-4 STX
EQS-News: WEKA
/ Key word(s): Product Launch/Miscellaneous
NeuralMesh and Augmented Memory Grid Integration with NVIDIA STX Increases Token Production by 6.5x in the Same GPU Footprint, Slashing Cost of Inference for AI-Driven Organizations

SAN JOSE, Calif. and CAMPBELL, Calif., March 16, 2026 /PRNewswire/ -- From GTC 2026: WEKA, the AI storage and memory systems company, today announced the integration of its NeuralMesh™ software with the NVIDIA STX reference architecture. WEKA's Augmented Memory Grid™ memory extension technology, running on NeuralMesh, will support NVIDIA STX to bring high-throughput context memory storage to agentic AI factories, making long-context reasoning seamless across sessions, tools, and tasks.

Leveraging NVIDIA Vera Rubin NVL72, NVIDIA BlueField-4, and NVIDIA Spectrum-X Ethernet, the NeuralMesh solution based on NVIDIA STX will deliver an estimated 4-10x more tokens per second for context memory while supporting at least 320 GB/s of read and 150 GB/s of write throughput for AI workloads, more than double the throughput of conventional AI storage platforms.

Solving the Inference Cost Problem with Shared KV Cache Infrastructure

Context Memory Storage: The Foundation of Agentic AI Factories

Leading AI innovators and cloud providers, such as Firmus, are already transforming their inference economics with Augmented Memory Grid on NeuralMesh.

"Real-world AI doesn't run in a lab: it has power constraints, cooling limits, and relentless workload demand. Firmus is built for exactly that. Paired with NVIDIA AI infrastructure, WEKA Augmented Memory Grid delivers up to 6.5x higher tokens per second and 4x faster TTFT at scale, proving we can get more performance from the same GPU footprint. With NeuralMesh and Augmented Memory Grid integrated into our NVIDIA-aligned AI Factory and NVIDIA STX reference architecture, we'll be able to deliver the fastest context memory network for predictable and efficient inference at scale," said Daniel Kearney, Chief Technology Officer at Firmus.

NeuralMesh and NVIDIA STX: Purpose-Built for Agentic AI

"With coding LLMs advancing, we're seeing unprecedented adoption of agentic AI use cases for software engineering, where productivity increases by 100-1000x. As coding assistants make repeated calls against largely unchanged codebases and prompts, WEKA's Augmented Memory Grid reuses cached context instead of forcing redundant prefill, even as context windows grow to incredible lengths. This provides a major boost in response times and greatly increases the number of concurrent users running on the same infrastructure," said Liran Zvibel, co-founder and CEO at WEKA.

"WEKA first identified this need for context memory storage more than a year ago and launched Augmented Memory Grid at GTC 2025. Now, NVIDIA STX opens the door to organizations running their storage and memory extension infrastructure on state-of-the-art NVIDIA Vera Rubin architecture, including NVIDIA BlueField-4 and NVIDIA Spectrum-X Ethernet. Running Augmented Memory Grid on NeuralMesh for NVIDIA STX delivers extreme performance and efficiency that translates directly to game-changing AI economics."

Availability

WEKA's Augmented Memory Grid is commercially available with NeuralMesh today.

Organizations that don't address the memory wall today will find it harder and more expensive to scale tomorrow. As agentic workloads grow and context windows expand, DRAM-only architectures face a compounding cost problem: each additional concurrent user or session increases recomputation overhead, GPU idle time, and operational cost. Organizations that architect for persistent KV cache now will have a structural cost and performance advantage over those that wait.

For more information about NeuralMesh, visit weka.io/NeuralMesh. Organizations can learn more at weka.io/nvidia or visit WEKA at GTC 2026, booth #1034.

About WEKA

WEKA and the W logo are registered trademarks of WekaIO, Inc. Other trade names herein may be trademarks of their respective owners.
Photo - https://mma.prnewswire.com/media/2934399/WEKA_and_NVIDIA.jpg
16.03.2026 CET/CEST Dissemination of Corporate News, transmitted by EQS News - a service of EQS Group.



