Kubernetes is often celebrated for how easily it scales compute. Add nodes, spin up pods, rebalance workloads – compute elasticity feels almost natural. Storage, however, is a very different story. Persistent Volumes (PVs) don’t move as easily. Optimizing storage in Kubernetes means dealing with downtime, migrations, and critical application paths. It’s here that the hidden complexity of Kubernetes really shows up.
When we first introduced Lucidity’s storage optimization for Kubernetes, adoption was slower than expected. Not because customers didn’t see the value – saving on cloud storage bills is compelling – but because the workflows involved were too critical to trust right away. No one wants to risk breaking production databases to chase efficiency. Over time, though, as the demand to manage storage at scale increased, customers started leaning in. Some now want to onboard hundreds of clusters, and that scale forces a new conversation.
1. The Building Blocks
The foundation starts with installing the Lucidity agent alongside the CSI driver. This combination is what allows us to intercept and orchestrate PV operations. The CSI driver doesn’t directly interact with cloud providers or instances – it routes operations to the orchestrator and agent. From the customer’s perspective, this setup is the gateway into optimization. Without it, PVs remain opaque and static.
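As a rough sketch, the installation flow might look like the following. The repository URL, chart names, and namespace here are illustrative assumptions, not the actual Lucidity distribution:

```shell
# Add the (hypothetical) Lucidity chart repository and install the agent
# alongside the CSI driver. All names and URLs here are illustrative.
helm repo add lucidity https://charts.example.com/lucidity
helm repo update

helm install lucidity-agent lucidity/agent \
  --namespace lucidity-system \
  --create-namespace

# The CSI driver does not talk to the cloud provider directly; PV
# operations are routed through it to the orchestrator and agent.
helm install lucidity-csi lucidity/csi-driver \
  --namespace lucidity-system

# Verify both components are running before onboarding any PVs.
kubectl get pods -n lucidity-system
```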

2. Onboarding Persistent Volumes
Once the infrastructure is in place, the first real challenge is onboarding PVs. Unlike compute, these are stateful resources. You’re touching application data, and that changes the conversation entirely.
To build trust, we developed a structured questionnaire for customers:
- What’s your tolerance for downtime?
- Can your applications handle replicas or master-slave adjustments?
- Are you comfortable modifying StatefulSets or recreating PVCs?
These weren’t just technical checks. They were about understanding the operational reality of each customer. A team running MongoDB has different boundaries than one running Prometheus. Framing those questions upfront was the only way to guide adoption.
3. The Downtime Trade-Off
At its core, PV migration involves downtime – typically around 10 minutes, capped at 30. For many applications, that’s acceptable. For others, it’s non-negotiable.
We built zero-downtime paths, but they come with conditions: the app must support multiple replicas, and the customer must perform extra steps during migration. For example, in MongoDB, this means reconfiguring master-slave roles. For Prometheus, it can be simpler.
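For a MongoDB replica set, the extra customer-side steps might look roughly like this sketch. The StatefulSet and pod names are assumptions, and the exact sequence depends on the deployment; `rs.stepDown()` and `rs.status()` are standard MongoDB shell commands:

```shell
# Sketch of the zero-downtime path for a MongoDB replica set.
# Pod names (mongodb-0, etc.) are illustrative.

# 1. Confirm the replica set is healthy before touching any volume.
kubectl exec mongodb-0 -- mongosh --eval 'rs.status().ok'

# 2. If the member whose PV is being migrated is currently the primary,
#    step it down so a secondary takes over writes.
kubectl exec mongodb-0 -- mongosh --eval 'rs.stepDown()'

# 3. Migrate the now-secondary member's PV; the replica set keeps
#    serving traffic from the remaining members. (The migration itself
#    is driven by the orchestrator and is not shown here.)

# 4. Once the member rejoins and catches up, repeat for the next pod.
kubectl exec mongodb-0 -- mongosh --eval 'rs.status().members.map(m => m.stateStr)'
```

The same pattern – drain one replica, migrate, rejoin, repeat – is what makes multiple replicas a hard prerequisite for the zero-downtime path.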
The critical point here: downtime isn’t just an engineering number. It’s a business decision. Different applications, teams, and industries draw the line differently.
4. The Switch Problem
Once migration is complete, there’s one more step: switching PVs. Today, this requires customers to run helm commands manually. For a handful of clusters, this is manageable. For enterprises running hundreds of clusters, it’s not sustainable.
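Today that manual step looks something like the following sketch. The release, chart path, and value key are hypothetical, since the exact helm invocation varies per customer chart:

```shell
# Hypothetical manual PV switch after migration completes. The release
# name, chart path, and persistence.existingClaim key are illustrative,
# not the actual Lucidity interface.
helm upgrade my-app ./my-app-chart \
  --set persistence.existingClaim=pvc-migrated \
  --reuse-values

# Repeated per release, per cluster – workable for a handful of
# clusters, unsustainable across hundreds.
```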
This is the current bottleneck. Customers want storage optimization at scale, and a manual switch flow adds friction. Our immediate goal is to automate this process and extend support for bulk onboarding. That’s where adoption will really accelerate.
5. Adoption Journey
Adoption followed a predictable curve. Early customers hesitated – touching production storage is risky. The complexity wasn’t in the technology itself, but in the trust required to run it. Over time, the economics of cloud storage and the scale of Kubernetes environments made adoption inevitable. Now customers aren’t asking whether to adopt, but how fast they can onboard entire fleets of clusters.
6. Where We’re Headed
The next phase focuses on eliminating friction. Automating the PV switch flow will remove the dependency on manual helm commands. Bulk onboarding will allow customers to manage PVs across hundreds of clusters with confidence.
The long-term vision is straightforward: make storage optimization in Kubernetes feel as seamless as compute scaling. The complexity won’t disappear – but it will be abstracted away so customers can focus on running applications, not managing volumes.
