A Deep Dive into Data Reduction Technologies of Pure Storage

Pure Storage data reduction technologies, including deduplication, compression, and thin provisioning, help businesses optimize storage capacity and reduce costs.

Aanchal Ghatak

Ajay Singh, Chief Product Officer at Pure Storage, discusses the company’s DirectFlash Module (DFM) technology, which he describes as a game-changer in the storage industry. In this conversation, Singh explains how Pure Storage is using DFM to outpace competitors, improve performance, and offer a cost-effective option for enterprises navigating cloud, AI, and ransomware challenges. With a strategic focus on scalability and flexibility, Pure Storage aims to become the market leader by continuing to innovate in flash storage technology.

A Unified Platform for On-Premises and Cloud Storage

Can you elaborate on Pure Storage’s data reduction technologies and how they help customers optimize storage capacity and cost?

Ajay Singh: Data reduction is a core part of any storage system, especially with Flash storage. We use several techniques such as deduplication, compression, and thin provisioning to optimize storage. These allow us to store data more efficiently by reducing redundancy. For example, with these technologies, customers can see reductions ranging from 4-5x, and in some cases, up to 15-16x. This means we can drastically cut the amount of raw storage needed while still maintaining effective capacity.
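To make the idea of deduplication plus compression concrete, here is a minimal Python sketch. It is a toy illustration only and not how Purity implements data reduction: the fixed 4 KiB block size, the use of SHA-256 and zlib, and the function names are all assumptions made for this example. Thin provisioning is not modeled.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen only for illustration


def reduce_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one compressed copy per unique block."""
    store = {}   # block hash -> compressed block (each unique block stored once)
    recipe = []  # ordered list of hashes needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:          # deduplication: skip blocks already stored
            store[digest] = zlib.compress(block)  # compression of the unique block
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the block store and the recipe."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


def reduction_ratio(data: bytes, store) -> float:
    """Logical bytes written divided by compressed bytes actually stored."""
    stored = sum(len(chunk) for chunk in store.values())
    return len(data) / stored
```

Calling `reduce_blocks` on data with many repeated blocks yields a small `store` and a ratio well above 1, while `restore` returns the original bytes unchanged. Real systems also have to account for metadata overhead, which this sketch ignores.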

Our data reduction technologies are built into the Purity operating system, offering enterprise-grade storage solutions that maximize efficiency. These techniques also apply in our Cloud Block Store, which runs on platforms like AWS and Azure. Customers using Cloud Block Store not only get enterprise-grade capabilities but can also see their storage costs reduced by half, thanks to our sophisticated data reduction methods.

How relevant are these technologies for small and medium enterprises in terms of cost?

AS: For small and medium enterprises, data reduction is critical because it allows them to use less raw storage for the same amount of data. A 4-5x reduction means they can significantly lower their costs on storage. For example, instead of buying additional raw storage, they can rely on our optimized systems to stretch their existing capacity.
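The arithmetic behind that claim can be sketched in a few lines of Python. The 100 TB dataset is a hypothetical figure chosen for illustration, and the function name is an assumption; only the 4-5x ratios come from the interview.

```python
def raw_capacity_needed(effective_tb: float, reduction_ratio: float) -> float:
    """Raw capacity (TB) required to hold effective_tb of data at a given reduction ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return effective_tb / reduction_ratio


if __name__ == "__main__":
    dataset_tb = 100  # hypothetical dataset size
    for ratio in (1, 4, 5):
        raw = raw_capacity_needed(dataset_tb, ratio)
        print(f"{ratio}x reduction: {raw:.1f} TB raw for {dataset_tb} TB of data")
```

Under these assumptions, a 4x ratio means buying a quarter of the raw capacity that unreduced storage would need, and a 5x ratio means a fifth.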

Can you share an example of how enterprises have benefited from Pure Storage’s unified approach in terms of improved efficiency, reduced complexity, and enhanced data management?

AS: Our platform provides a single, cloud-like interface, simplifying management for businesses. Underneath, we offer a scale-up platform for latency-sensitive applications, like high-frequency trading, and a scale-out platform for high-throughput needs, like AI and electronic design automation (EDA).

One of the key benefits is our design philosophy focused on simplicity. Our founder’s goal was to make storage as easy to use as an iPhone.

This ethos of simplicity is what drives our innovation—making our technology easy to install, manage, and scale without complex configurations. This reduces both the time and expertise required to manage storage infrastructure, directly benefiting enterprises by lowering operational complexity and costs.

How does Pure Storage’s data platform strategy target Total Cost of Ownership (TCO) savings in both on-premises and cloud storage environments?

AS: TCO includes several components: acquisition cost, maintenance, labor, failure rates, and the lifecycle of the storage system. We've systematically targeted all these areas to lower TCO. While Flash used to be expensive, it's now more affordable, and we've built technologies to maximize its value.

For example, our failure rates are exceptionally low—about 0.15%—compared to traditional HDD and even SSD systems. This means fewer interruptions, lower labor costs, and increased reliability. Additionally, our Evergreen model allows businesses to upgrade their systems without downtime, avoiding costly migrations and minimizing e-waste. This longevity and efficiency are key to driving down TCO over the lifecycle of the storage infrastructure.

Can you explain how Pure Storage’s direct-to-flash technology improves durability compared to SSD and HDD systems?

AS: Most SSDs use a translation layer to make flash storage behave like traditional disk systems. This adds complexity and cost, and it increases failure rates. Our direct-to-flash technology bypasses this translation layer, making our storage more efficient and reliable. For example, where an SSD might need a gigabyte of DRAM per terabyte of storage to handle this translation, our systems require only a megabyte, which is 1,000 times less DRAM.

This not only reduces power consumption but also improves reliability by eliminating the need for complex translation software, which can introduce bugs and failures. As a result, our systems are more durable and have lower return rates compared to SSD-based systems.
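The DRAM comparison quoted above works out as follows. This Python sketch uses decimal units and a hypothetical 48 TB module capacity; the per-terabyte figures (roughly a gigabyte versus a megabyte) are the ones Singh cites, and the names are assumptions made for the example.

```python
GB = 10**9  # bytes, decimal units
MB = 10**6  # bytes, decimal units


def dram_bytes(capacity_tb: int, dram_per_tb_bytes: int) -> int:
    """Total DRAM needed for address translation at a given per-terabyte cost."""
    return capacity_tb * dram_per_tb_bytes


capacity_tb = 48  # hypothetical flash capacity, for illustration only
ssd_dram = dram_bytes(capacity_tb, GB)     # about 1 GB of DRAM per TB, as quoted for SSDs
direct_dram = dram_bytes(capacity_tb, MB)  # about 1 MB of DRAM per TB, as quoted for direct-to-flash
savings_factor = ssd_dram // direct_dram   # independent of capacity_tb
```

The ratio does not depend on the capacity chosen, since both totals scale linearly with it; the factor of 1,000 comes entirely from the gigabyte-to-megabyte difference per terabyte.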

How does Pure Storage balance performance and affordability within its data platform strategy?

AS: At Pure Storage, we offer a range of solutions designed to meet various performance and capacity needs, while maintaining affordability. We have high-performance, low-capacity options for workloads that demand speed but don’t require a lot of storage. We also provide mid-performance, mid-capacity solutions, and low-performance, high-capacity systems for more storage-intensive applications.

These options are all available under the same platform, allowing users to choose what suits their workloads best. With policy-based management, we offer "late binding"—meaning, once you define the application’s needs, it automatically gets mapped to the appropriate system, whether high, mid, or low performance. This flexibility is key, as it mimics a cloud-like environment where you can scale up or down as required.

Under the hood, we utilize scale-up and scale-out engines to address different levels of performance and capacity. This approach ensures that enterprises can optimize both performance and cost based on their specific needs.

What are some key trends and challenges you foresee in the enterprise IT landscape over the next few years?

AS: One of the major trends is the shift toward cloud-like operations, even for on-prem environments. Enterprises increasingly view the cloud as an operating model rather than just a destination. While we’ve virtualized compute and networks using platforms like VMware, Kubernetes, and Nutanix, storage virtualization has lagged behind.

Traditional storage is still siloed—dedicated to specific applications. There's a growing need to virtualize storage, allowing it to become a flexible pool with various service level agreements (SLAs). This would bring storage in line with the way compute and networks are managed, reducing lock-in and complexity.

Another challenge is the rising frequency of ransomware attacks. It’s no longer a question of "if," but "when" an attack will happen. Pure Storage has invested heavily in ransomware resilience, with features like SafeMode snapshots. These snapshots are air-gapped, meaning they can’t be accessed without two-factor authentication, protecting data even if the primary storage is compromised.

We also offer a Ransomware Recovery SLA, which ensures that after an attack, we can deliver new storage arrays within hours and help businesses restore their data quickly. The speed of recovery is critical, especially when dealing with large data sets. By using Flash technology, we can restore petabytes of data much faster than traditional systems.

AI is another big trend, and enterprises are realizing that data silos are a barrier to effective AI training. Instead of moving data to a centralized data lake, it’s more efficient to use the data where it resides. Pure Storage’s flash-based systems provide the extra performance needed for AI workloads, and our platform enables a virtual data lake by virtualizing storage across different environments.

How does Pure Storage's partnership with NVIDIA support customers in adopting AI technologies?

AS: AI is incredibly data-hungry, and Pure Storage has been a leader in flash storage, working closely with NVIDIA for years—even before the rise of generative AI. Our collaboration has produced the AI-ready infrastructure (AIRI) platform, which is a validated reference architecture for running AI workloads.

We’ve been the most widely deployed storage with NVIDIA, supporting both traditional machine learning models and more recent generative AI workloads. Our AIRI partnership ensures seamless integration between NVIDIA’s AI infrastructure and our storage solutions.

We’ve also been certified for NVIDIA’s DGX SuperPOD, which is used for large-scale AI deployments. Beyond that, we’ve conducted global roadshows to help customers learn best practices for AI implementation.

How does Pure Storage enable seamless data mobility between on-premises and cloud environments?

AS: We offer a product called Cloud Block Store, which is essentially our Purity software running in cloud environments like AWS and Azure. This allows customers to move data between their on-prem storage and the cloud as if they were simply transferring data between two local systems.

Cloud Block Store virtualizes storage across multiple clouds and on-prem systems, creating a unified cloud-like experience for customers. This means you’re not locked into any single environment—you can move your data freely between clouds and on-prem systems. Plus, it cuts costs by half compared to traditional cloud storage.

You mentioned Pure Storage's goal to surpass Dell EMC. What strategies are driving this growth?

AS: Over the last few years, we’ve consistently grown 15% faster than the rest of the industry. While our competitors have their own flash offerings, they often rely on SSD-based systems, which don’t match the performance and reliability of Pure’s DirectFlash technology. Our technology leapfrogs traditional SSDs, offering better performance, reliability, and overall cost-efficiency.

A key part of our strategy is the introduction of our E family, which targets lower-cost, high-capacity workloads. This brings flash storage into markets that were previously dominated by spinning disk systems. Just like Tesla’s model strategy—starting with high-end cars and then introducing more affordable models—we’re now able to offer flash storage that’s cheaper than disk-based systems on a total cost of ownership basis. This allows us to capture a larger market share, and we believe we have significant headroom for growth as we continue to replace traditional storage systems.
