FarmGPU Blog

The Neocloud Storage Imperative: Architecting the Data Foundation for the AI Factory

A new class of cloud provider, the "neocloud," has emerged to serve the voracious computational demands of the artificial intelligence (AI) revolution.[1] Unlike traditional hyperscalers such as AWS, Azure, and Google Cloud, which offer a vast "supermarket" of general-purpose services, neoclouds are highly specialized "delicatessens."[2] Their singular focus is delivering GPU-as-a-Service (GPUaaS) with maximum performance, flexibility, and cost-efficiency for AI and High-Performance Computing (HPC) workloads.[2] Providers like CoreWeave, Lambda Labs, and Crusoe Energy have built their businesses on this purpose-built model, stripping away extraneous services to concentrate on delivering raw, scalable GPU power.[2]

This GPU-centric paradigm, however, introduces a critical shift in infrastructure dynamics. As computational power becomes abundant, the primary performance bottleneck moves from the processor to the data pipeline. The storage subsystem is no longer a passive repository but an active, performance-critical component of the AI factory. Its architecture directly dictates GPU utilization, job completion times, and, ultimately, the economic viability of the entire neocloud model.[1] An underperforming storage layer leaves multi-million-dollar GPU clusters idle, negating the very value proposition of the neocloud.[7]
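The economics of storage-induced idle time are easy to quantify. A minimal back-of-envelope sketch in Python, using assumed figures for cluster size, GPU pricing, and the utilization gap (all numbers are illustrative, not vendor data):

```python
# Back-of-envelope cost of storage-induced GPU idle time.
# All figures below are illustrative assumptions, not vendor data.

def idle_cost_per_year(num_gpus: int,
                       gpu_hourly_cost: float,
                       utilization_fast_storage: float,
                       utilization_slow_storage: float) -> float:
    """Annual value lost to GPUs stalled waiting on I/O."""
    hours_per_year = 24 * 365
    lost_utilization = utilization_fast_storage - utilization_slow_storage
    return num_gpus * gpu_hourly_cost * hours_per_year * lost_utilization

# Assumed: 1,024 GPUs billed at $2.50/hr; a storage bottleneck drags
# effective utilization from 90% down to 70%.
loss = idle_cost_per_year(1024, 2.50, 0.90, 0.70)
print(f"Annual value lost to storage stalls: ${loss:,.0f}")  # ~$4.5M
```

Even a modest utilization gap compounds into millions of dollars a year at cluster scale, which is why the storage layer is treated as a revenue-critical component rather than a cost center.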

The business model of neoclouds inverts traditional infrastructure pressures. Hyperscalers typically build vast, general-purpose infrastructure and then seek workloads to fill it. In contrast, neoclouds often acquire highly sought-after GPU resources first and then must construct the most efficient infrastructure around them to maximize their return on investment. This inversion places intense pressure on every other component, especially storage and networking, to be a pure performance enabler rather than a source of latency or a cost center.[8] The success of a neocloud hinges on its ability to deliver predictable, peak performance in a demanding, multi-tenant environment—a challenge that falls squarely on the storage architecture.[1] This economic reality is the primary force driving the adoption of advanced, parallel storage platforms over traditional enterprise Network Attached Storage (NAS).

[Figure: NVIDIA's direction for storage]

Section 1: The Neocloud Storage Blueprint

Evaluating storage for the unique demands of neoclouds requires a sophisticated framework that distinguishes between foundational necessities and strategic, performance-defining differentiators. This blueprint moves beyond a simple feature checklist to define the architectural principles that underpin a truly AI-ready data platform.

Table Stakes: The Foundational Requirements

These are the non-negotiable capabilities that any storage platform must possess to be considered viable for a modern neocloud environment.

[Figure: Storage needs for a neocloud — FarmGPU's wishlist]

Best-in-Class: The Strategic Differentiators

These are the advanced capabilities that separate leading AI storage platforms from the rest, enabling maximum performance, efficiency, and future-readiness.

Section 2: The Titans of AI Storage: NVIDIA NCP Partners Checking the Boxes

The NVIDIA Cloud Partner (NCP) program includes a select group of storage vendors whose solutions have been validated to meet the stringent performance and scalability demands of AI infrastructure. These partners are actively delivering on the table stakes and best-in-class features required by neoclouds.

| Feature | VAST Data | WEKA | DDN |
| --- | --- | --- | --- |
| Core Architecture | Disaggregated & Shared-Everything (DASE) | Software-Defined Parallel Filesystem (WekaFS) | Shared Parallel Filesystem (EXAScaler/Lustre) |
| Primary Media | QLC Flash + Storage Class Memory (SCM) | All-NVMe Flash | All-NVMe Flash |
| Filesystem Type | Global, Log-Structured | Distributed, Parallel | Parallel |
| Key Differentiator | Economics of All-Flash at Exabyte Scale | Extreme Low Latency via User-Space I/O | Proven HPC-Scale Throughput & Caching |
| GPU Integration | GDS, DGX SuperPOD Certified | GDS, DGX SuperPOD Certified, Dynamo/NIXL | GDS, DGX SuperPOD Certified |
| Small File Handling | Similarity-Based Global Data Reduction | Fully Distributed Metadata | Lustre Optimizations |
| Caching Strategy | All-Flash (No Caching Tier) | Tiering to S3 Object Storage | "Hot Nodes" Client-Side Caching |
| Deployment Model | Hardware Appliance | Software-Defined / Hardware Appliance | Hardware Appliance |

Section 3: Future Frontiers: Data Orchestration and Storage-Aware AI

While today's leaders deliver immense performance, emerging players are tackling the next major challenge: managing data across a globally distributed landscape and integrating storage even more deeply into the AI compute fabric.

Solving Data Gravity with Global Orchestration

As AI workloads become more distributed, the physical location of data creates "data gravity"—the difficulty of moving massive datasets to compute resources. Emerging players like Hammerspace are addressing this with a software-defined Global Data Environment. Hammerspace creates a single, unified global namespace across existing, heterogeneous storage systems.[80] It leverages Parallel NFS (pNFS), a standards-based protocol built into the Linux kernel, to separate metadata from the data path. This allows Hammerspace to act as an intelligent data orchestration layer, using policies to automatically move data between storage tiers and locations in the background, making data appear local to applications wherever they run.[80]
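The policy-driven placement idea can be sketched in a few lines of Python. This is a toy model only—the tier names, age thresholds, and `DataObject` class are invented for illustration and are not Hammerspace's policy engine or API:

```python
import time
from dataclasses import dataclass, field

# Toy model of policy-driven data placement across storage tiers.
# Tier names and age thresholds are illustrative assumptions.

@dataclass
class DataObject:
    name: str
    size_gb: float
    last_access: float = field(default_factory=time.time)
    tier: str = "object-archive"

def place(obj: DataObject, now: float) -> str:
    """Pick a tier from access recency: hot data lands near the GPUs,
    cold data on cheap object storage, warm data in between."""
    age_hours = (now - obj.last_access) / 3600
    if age_hours < 1:
        obj.tier = "local-nvme"       # flash adjacent to the GPU cluster
    elif age_hours < 24 * 7:
        obj.tier = "shared-flash"     # regional parallel filesystem
    else:
        obj.tier = "object-archive"   # low-cost capacity tier
    return obj.tier

shard = DataObject("train-shard-0042", size_gb=128, last_access=time.time() - 120)
print(place(shard, time.time()))  # read two minutes ago -> "local-nvme"
```

A real orchestrator evaluates such policies continuously and moves the data in the background, so the application only ever sees a single namespace.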

Another emerging player, PEAK:AIO, is carving out a niche by focusing on the mid-scale AI cluster, where traditional enterprise storage can be overly complex and expensive.[83] PEAK:AIO's software-defined approach transforms standard server hardware into a high-performance "AI Data Server," designed from the ground up for the specific needs of AI workloads.[85] The architecture prioritizes simplicity and cost-effectiveness, maintaining the ease of use of standard NFS while turbo-charging it with RDMA and GPUDirect support to deliver ultra-low latency and high bandwidth directly to GPUs.[83] This allows organizations to start small and scale linearly without over-investing in storage, freeing up budget for critical GPU resources.[84] Looking ahead, PEAK:AIO is also addressing next-generation challenges with its "Token Memory Architecture," a dedicated appliance designed to unify KVCache acceleration and GPU memory expansion using CXL memory, treating token history as a memory tier rather than a storage problem.[88]

The Next Frontier: Storage-Aware AI and Infrastructure Offload

The evolution of AI is blurring the lines between storage, memory, and compute, creating a new paradigm of "storage-aware" AI. Techniques like Microsoft's ZeRO-Infinity offload model parameters and optimizer states to fast NVMe SSDs, using them as an extended memory tier.[43] In LLM inference, the Key-Value (KV) Cache is a major consumer of precious GPU VRAM; emerging solutions like vLLM and NVIDIA's Dynamo inference server can intelligently offload portions of this cache to high-speed networked storage.[51] This creates a new, extremely latency-sensitive workload where the storage system must serve cache blocks with memory-like speed. This future is enabled by DPUs, which are designed to manage these complex data movements efficiently, fetching data from networked storage and placing it directly into GPU memory via GDS without involving the host CPU.[82] This fundamental shift elevates the importance of ultra-low-latency, parallel storage from a "performance optimization" to an "enabling technology" for the next generation of AI.
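The offload pattern itself is simple to sketch: when the VRAM budget is exceeded, the least-recently-used cache blocks spill to a slower networked tier and are fetched back on demand. The class below is a hypothetical two-tier model for illustration—production systems like vLLM and Dynamo manage real GPU memory at far finer granularity:

```python
from collections import OrderedDict

class KVCacheWithOffload:
    """Toy two-tier KV cache: a small 'VRAM' tier backed by a larger
    'networked storage' tier. Illustrative only—not vLLM's or Dynamo's API."""

    def __init__(self, vram_blocks: int):
        self.vram_blocks = vram_blocks
        self.vram: OrderedDict[int, bytes] = OrderedDict()  # LRU order
        self.storage: dict[int, bytes] = {}                 # offload tier

    def put(self, block_id: int, data: bytes) -> None:
        self.vram[block_id] = data
        self.vram.move_to_end(block_id)
        while len(self.vram) > self.vram_blocks:
            # Evict the least-recently-used block to the storage tier.
            victim, payload = self.vram.popitem(last=False)
            self.storage[victim] = payload

    def get(self, block_id: int) -> bytes:
        if block_id in self.vram:
            self.vram.move_to_end(block_id)
            return self.vram[block_id]
        # Miss in VRAM: fetch from networked storage—the path that
        # must deliver memory-like latency in a real system.
        data = self.storage.pop(block_id)
        self.put(block_id, data)
        return data

cache = KVCacheWithOffload(vram_blocks=2)
for i in range(4):
    cache.put(i, f"kv-block-{i}".encode())
print(sorted(cache.storage))  # oldest blocks spilled -> [0, 1]
```

Every `get` that misses the VRAM tier lands on the storage system, which is exactly why this workload turns cache-block latency into a first-order storage requirement.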

Conclusion: Architecting for Intelligence

The rise of the neocloud signifies a fundamental shift in computing infrastructure, one purpose-built for the age of AI. This analysis demonstrates that while GPUs provide the raw computational force, the storage platform is the critical subsystem that governs the efficiency, scalability, and ultimate profitability of a neocloud's AI factory. Choosing the right storage is not a matter of selecting the "fastest" option on a spec sheet, but of selecting the platform with the right architecture for the unique and demanding workloads of AI.

The blueprint for a best-in-class neocloud storage solution is clear. It begins with a foundation of "table stakes" features—from multi-protocol support and linear scalability to robust data protection and security. Upon this foundation, "best-in-class" differentiators are built: deep hardware integration with technologies like GPUDirect Storage and DPUs, specific optimizations for emerging workloads like RAG and KV Cache offload, and a forward-looking approach to security and sustainability.

Looking forward, the line between storage and memory will continue to blur. The most successful neoclouds will be those that embrace this paradigm, building their infrastructure not on siloed resources, but on an integrated, intelligent data platform where storage actively participates in the entire AI lifecycle. The choice of storage architecture is therefore one of the most critical strategic decisions a neocloud provider can make, a decision that will fundamentally define its performance, its capabilities, and its position in the competitive landscape of AI infrastructure.

Works cited

  1. Understanding Neocloud offering GPU-as-a-Service (GPUaaS) - DriveNets, accessed August 17, 2025, https://drivenets.com/resources/education-center/what-are-neocloud-providers/
  2. What is a Neocloud? - NEXTDC, accessed August 17, 2025, https://www.nextdc.com/blog/what-is-a-neo-cloud
  3. NeoClouds: The Next Generation of AI Infrastructure - Voltage Park, accessed August 17, 2025, https://www.voltagepark.com/blog/neoclouds-the-next-generation-of-ai-infrastructure
  4. Neocloud Providers: Powering the Next Generation of AI Workloads - Rafay, accessed August 17, 2025, https://rafay.co/ai-and-cloud-native-blog/neocloud-providers-next-generation-ai-workloads/
  5. What Are Neoclouds and Why Does AI Need Them? - RTInsights, accessed August 17, 2025, https://www.rtinsights.com/what-are-neoclouds-and-why-does-ai-need-them/
  6. Why Storage Is the Unsung Hero for AI, accessed August 17, 2025, https://blog.purestorage.com/perspectives/why-storage-is-the-unsung-hero-for-ai/
  7. DDN: Data Intelligence Platform Built for AI, accessed August 17, 2025, https://www.ddn.com/
  8. How to Build an AI-Ready Infrastructure, like a Neocloud - NEXTDC, accessed August 17, 2025, https://www.nextdc.com/blog/how-to-build-like-a-neo-cloud
  9. NFS vs SMB - Difference Between File Access Storage Protocols - AWS, accessed August 17, 2025, https://aws.amazon.com/compare/the-difference-between-nfs-smb/
  10. Virtual Machine Storage – File vs Block [Part 1]: SMB & NFS vs iSCSI & NVMe-oF - StarWind, accessed August 17, 2025, https://www.starwindsoftware.com/blog/virtual-machine-storage-file-vs-block-part-1/
  11. Unsung hero of AI World: Storage - Medium, accessed August 17, 2025, https://medium.com/@lazygeek78/unsung-hero-of-ai-world-storage-7cb5f342db81
  12. Optimizing AI: Meeting Unstructured Storage Demands Efficiently, accessed August 17, 2025, https://infohub.delltechnologies.com/sv-se/p/optimizing-ai-meeting-unstructured-storage-demands-efficiently/
  13. Why Auto-Tiering is Essential for AI Solutions: Optimizing Data Storage from Training to Long-Term Archiving - insideAI News, accessed August 17, 2025, https://insideainews.com/2024/11/11/why-auto-tiering-is-essential-for-ai-solutions-optimizing-data-storage-from-training-to-long-term-archiving/
  14. On-prem, cloud, or hybrid? Choosing the right storage strategy for AI workloads - Wasabi, accessed August 17, 2025, https://wasabi.com/blog/industry/storage-strategy-for-ai-workloads
  15. Understanding Erasure Coding And Its Difference With RAID - StoneFly, Inc., accessed August 17, 2025, https://stonefly.com/blog/understanding-erasure-coding/
  16. What is erasure coding and how does it differ from RAID? : r/DataHoarder - Reddit, accessed August 17, 2025, https://www.reddit.com/r/DataHoarder/comments/630zd6/what_is_erasure_coding_and_how_does_it_differ/
  17. What is Erasure Coding? - Supermicro, accessed August 17, 2025, https://www.supermicro.com/en/glossary/erasure-coding
  18. Snapshot vs Clone in Storage - Simplyblock, accessed August 17, 2025, https://www.simplyblock.io/glossary/snapshot-vs-clone-in-storage/
  19. Snapshots or Clones for Data Protection? - Verge.io, accessed August 17, 2025, https://www.verge.io/blog/storage/snapshots-or-clones-for-data-protection/
  20. Manage and increase quotas for resources - Azure AI Foundry - Microsoft Learn, accessed August 17, 2025, https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/quota
  21. Data deduplication vs. data compression - Lytics CDP, accessed August 17, 2025, https://www.lytics.com/blog/data-deduplication-vs-data-compression/
  22. Effective Deduplication & Compression - DataCore Software, accessed August 17, 2025, https://www.datacore.com/products/sansymphony/deduplication-compression/
  23. Deduplication, data compression, data compaction, and storage efficiency - NetApp, accessed August 17, 2025, https://docs.netapp.com/us-en/ontap/volumes/deduplication-data-compression-efficiency-concept.html
  24. Storage Encryption & Disk Encryption – Cyber Resilience | NetApp, accessed August 17, 2025, https://www.netapp.com/cyber-resilience/storage-encryption/
  25. Encryption in transit for Google Cloud | Security, accessed August 17, 2025, https://cloud.google.com/docs/security/encryption-in-transit
  26. Azure Data Encryption-at-Rest - Microsoft Learn, accessed August 17, 2025, https://learn.microsoft.com/en-us/azure/security/fundamentals/encryption-atrest
  27. Media sanitization guidelines | Internal Revenue Service, accessed August 17, 2025, https://www.irs.gov/privacy-disclosure/media-sanitization-guidelines
  28. What is Media Sanitization - ITAD Services Company TechReset, accessed August 17, 2025, https://techreset.com/itad-guides/what-is-media-sanitization/
  29. www.irs.gov, accessed August 17, 2025, https://www.irs.gov/privacy-disclosure/media-sanitization-guidelines#:~:text=Destruction%20of%20media%20is%20the,%2C%20pulverizing%2C%20shredding%20and%20melting.
  30. Azure Storage REST API Reference - Microsoft Learn, accessed August 17, 2025, https://learn.microsoft.com/en-us/rest/api/storageservices/
  31. Using Apigee API management for AI | Google Cloud Blog, accessed August 17, 2025, https://cloud.google.com/blog/products/api-management/using-apigee-api-management-for-ai
  32. What is AI observability? - Dynatrace, accessed August 17, 2025, https://www.dynatrace.com/knowledge-base/ai-observability/
  33. SMART + NVMe status | Grafana Labs, accessed August 17, 2025, https://grafana.com/grafana/dashboards/16514-smart-nvme-status/
  34. How to Check & Monitor NVMe SSD Drive Health, accessed August 17, 2025, https://ulink-da.com/how-to-check-nvme-drive-health/
  35. Checkpointing in AI workloads: A primer for trustworthy AI. - Seagate Technology, accessed August 17, 2025, https://www.seagate.com/blog/checkpointing-in-ai-workload-a-primer-for-trustworthy-ai/
  36. Simplify Enterprise AI with Pure and NVIDIA DGX SuperPOD - Pure Storage Blog, accessed August 17, 2025, https://blog.purestorage.com/products/simplify-enterprise-ai-with-certified-pure-storage-and-nvidia-dgx-superpod/
  37. DPU (Data Processing Unit) glossary - ASUS Servers, accessed August 17, 2025, https://servers.asus.com/glossary/DPU
  38. NVIDIA Bluefield Data Processing Unit | DPU - ASBIS solutions, accessed August 17, 2025, https://solutions.asbis.com/products/storage/bluefield-data-processing-units-dpu-
  39. NVIDIA BlueField Networking Platform, accessed August 17, 2025, https://www.nvidia.com/en-us/networking/products/data-processing-unit/
  40. 1. Overview Guide — GPUDirect Storage Overview Guide - NVIDIA Docs Hub, accessed August 17, 2025, https://docs.nvidia.com/gpudirect-storage/overview-guide/index.html
  41. NVIDIA GPUDirect Storage: 4 Key Features, Ecosystem & Use Cases - Cloudian, accessed August 17, 2025, https://cloudian.com/guides/data-security/nvidia-gpudirect-storage-4-key-features-ecosystem-use-cases/
  42. What is GPUDirect Storage? | WEKA, accessed August 17, 2025, https://www.weka.io/learn/glossary/gpu/what-is-gpudirect-storage/
  43. MemAscend: System Memory Optimization for SSD-Offloaded LLM Fine-Tuning - arXiv, accessed August 17, 2025, https://arxiv.org/html/2505.23254v1
  44. AI TOP 100E SSD 2TB Key Features | SSD - GIGABYTE Global, accessed August 17, 2025, https://www.gigabyte.com/SSD/AI100E2TB
  45. milvus.io, accessed August 17, 2025, https://milvus.io/ai-quick-reference/what-are-the-hardware-requirements-for-hosting-a-legal-vector-db#:~:text=Storage%20and%20networking%20are%20equally,on%20vector%20dimensions%20and%20metadata.
  46. What is Retrieval-Augmented Generation (RAG)? - Google Cloud, accessed August 17, 2025, https://cloud.google.com/use-cases/retrieval-augmented-generation
  47. AI Storage is Object Storage - MinIO, accessed August 17, 2025, https://www.min.io/solutions/object-storage-for-ai
  48. MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license. - GitHub, accessed August 17, 2025, https://github.com/minio/minio
  49. Parallel file systems for HPC workloads | Cloud Architecture Center, accessed August 17, 2025, https://cloud.google.com/architecture/parallel-file-systems-for-hpc
  50. Announcing the MLPerf Storage v2.0 Checkpointing Workload - MLCommons, accessed August 17, 2025, https://mlcommons.org/2025/08/storage-2-checkpointing/
  51. SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs - arXiv, accessed August 17, 2025, https://arxiv.org/html/2503.16163v1
  52. KV cache strategies - Hugging Face, accessed August 17, 2025, https://huggingface.co/docs/transformers/kv_cache
  53. Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS - AWS, accessed August 17, 2025, https://aws.amazon.com/blogs/machine-learning/accelerate-generative-ai-inference-with-nvidia-dynamo-and-amazon-eks/
  54. What is vLLM? - Red Hat, accessed August 17, 2025, https://www.redhat.com/en/topics/ai/what-is-vllm
  55. Dynamo Inference Framework - NVIDIA Developer, accessed August 17, 2025, https://developer.nvidia.com/dynamo
  56. What Is Confidential Computing? Defined and Explained - Fortinet, accessed August 17, 2025, https://www.fortinet.com/resources/cyberglossary/confidential-computing
  57. Confidential computing - Wikipedia, accessed August 17, 2025, https://en.wikipedia.org/wiki/Confidential_computing
  58. Hardware Root of Trust: Everything you need to know - Rambus, accessed August 17, 2025, https://www.rambus.com/blogs/hardware-root-of-trust/
  59. FAQs: What is Root of Trust? - Thales CPL, accessed August 17, 2025, https://cpl.thalesgroup.com/faq/hardware-security-modules/what-root-trust
  60. Best Data Center Infrastructure Management Tools Reviews 2025 | Gartner Peer Insights, accessed August 17, 2025, https://www.gartner.com/reviews/market/data-center-infrastructure-management-tools
  61. What Is Data Center Infrastructure Management (DCIM)? - Pure Storage, accessed August 17, 2025, https://www.purestorage.com/knowledge/what-is-data-center-infrastructure-management.html
  62. Delta Lake table optimization and V-Order - Microsoft Fabric, accessed August 17, 2025, https://learn.microsoft.com/en-us/fabric/data-engineering/delta-optimization-and-v-order
  63. Power Allocation and Capacity Optimization Configuration of Hybrid Energy Storage Systems in Microgrids Using RW-GWO-VMD - MDPI, accessed August 17, 2025, https://www.mdpi.com/1996-1073/18/16/4215
  64. Scope 1, 2, and 3 Emissions Explained | CarbonNeutral, accessed August 17, 2025, https://www.carbonneutral.com/news/scope-1-2-3-emissions-explained
  65. What Are Scope 1, 2 and 3 Emissions? - IBM, accessed August 17, 2025, https://www.ibm.com/think/topics/scope-1-2-3-emissions
  66. Hardware harvesting at Google: Reducing waste and emissions | Google Cloud Blog, accessed August 17, 2025, https://cloud.google.com/blog/topics/sustainability/hardware-harvesting-at-google-reducing-waste-and-emissions
  67. DASE (Disaggregated and Shared Everything) | Continuum Labs, accessed August 17, 2025, https://training.continuumlabs.ai/infrastructure/vast-data-platform/dase-disaggregated-and-shared-everything
  68. VAST Data Achieves NVIDIA DGX SuperPOD Certification | Inside HPC & AI News, accessed August 17, 2025, https://insidehpc.com/2023/05/vast-data-achieves-nvidia-dgx-superpod-certification/
  69. ACCELERATING A.I. WORKLOADS AT LIGHTSPEED - Military Expos, accessed August 17, 2025, https://www.militaryexpos.com/wp-content/uploads/2021/04/VAST-Data_NVIDIA_GPU-Reference-Architecture.pdf
  70. WEKA Architecture Key Concepts, accessed August 17, 2025, https://www.weka.io/wp-content/uploads/files/resources/WEKA-Architecture-Key-Concepts.pdf
  71. WEKA Accelerates AI Inference with NVIDIA Dynamo and NVIDIA NIXL, accessed August 17, 2025, https://www.weka.io/blog/ai-ml/weka-accelerates-ai-inference-with-nvidia-dynamo-and-nvidia-nixl/
  72. Fully-validated and optimized AI High-Performance Storage ... - DDN, accessed August 17, 2025, https://www.ddn.com/wp-content/uploads/2024/08/FINAL-DDN-NCP-RA-20240626-A3I-X2-Turbo-WITH-NCP-1.1-GA-1.pdf
  73. NVIDIA DGX SuperPOD H100 - DDN, accessed August 17, 2025, https://www.ddn.com/resources/reference-architectures/nvidia-dgx-superpod-h100/
  74. NVIDIA Technology Partnership - Pure Storage, accessed August 17, 2025, https://www.purestorage.com/partners/technology-alliance-partners/nvidia.html
  75. NetApp Storage Now Validated for NVIDIA DGX SuperPOD, NVIDIA Cloud Partners, and NVIDIA-Certified Systems, accessed August 17, 2025, https://www.netapp.com/newsroom/press-releases/news-rel-20250318-592455/
  76. Powering the next generation of enterprise AI infrastructure | NetApp Blog, accessed August 17, 2025, https://www.netapp.com/blog/next-generation-enterprise-ai-infrastructure/
  77. NVIDIA AI storage solutions - IBM, accessed August 17, 2025, https://www.ibm.com/solutions/storage/nvidia
  78. Artificial Intelligence (AI) Storage Solutions - IBM, accessed August 17, 2025, https://www.ibm.com/solutions/ai-storage
  79. Storage vendors rally behind Nvidia at GTC 2025 - Blocks and Files, accessed August 17, 2025, https://blocksandfiles.com/2025/03/18/nvidia-storage-announcements/
  80. Hammerspace - The Data Platform for AI Anywhere, accessed August 17, 2025, https://hammerspace.com/
  81. pNFS Provides Performance and New Possibilities - HPCwire, accessed August 17, 2025, https://www.hpcwire.com/2024/02/29/pnfs-provides-performance-and-new-possibilities/
  82. DPUs Explained | How They Supercharge AI Infrastructure ⚙️ - YouTube, accessed August 17, 2025, https://www.youtube.com/watch?v=0Mek000MYik
  83. PEAK:AIO Storage - PNY Technologies, accessed August 17, 2025, https://www.pny.com/en-eu/professional/software/peak-aio-storage
  84. AI Data Servers - Storage Reinvented - PEAK:AIO, accessed August 17, 2025, https://www.peakaio.com/ai-data-servers/
  85. PEAK:AIO AI-controlled storage revolution - sysGen GmbH, accessed August 17, 2025, https://www.sysgen.de/en/loesungen/data-storage/peak-aio/
  86. PEAKAIO About us, accessed August 17, 2025, https://www.peakaio.com/about-us/
  87. PEAK:AIO, MONAI, and Solidigm: Revolutionizing Storage for Medical AI, accessed August 17, 2025, https://www.solidigm.com/products/technology/peak-aio-monai-storage-for-medical-ai.html
  88. PEAK:AIO Introduces Token Memory Architecture to Address KVCache and GPU Bottlenecks - HPCwire, accessed August 17, 2025, https://www.hpcwire.com/off-the-wire/peakaio-introduces-token-memory-architecture-to-address-kvcache-and-gpu-bottlenecks/