Which Kubernetes object storage solution?

At work we have a Kubernetes cluster that we deploy our system to, and a devops team that manages it. They are exposing an IBM GPFS/Spectrum Scale mount path (/path) to the cluster nodes for us to use to persist data for some amount of time. For all intents and purposes, this GPFS mount is just "local" storage to us. Let's say it's 1TiB of physical disk space.

We have many developers who may deploy test builds of our app to the test cluster in their own personal namespaces. Let's also say that each deployment dynamically provisions a persistent volume for that deployment to claim, call it a 50GiB slice.

My question is: what is a good object/file storage solution that would facilitate data storage/retrieval via some API, be backed by these per-deployment PVs, automatically "age-off/purge" the data after some amount of time, and be distributed so that its pods can be spread across the cluster nodes for high availability if a node goes down?

I was looking into a distributed MinIO cluster, but you can't disable the erasure coding, and according to the devops team the disk space hit would be too much overhead (GPFS is already handling redundancy and resiliency of the data). I have also looked into Rook + Ceph, OpenIO, and GlusterFS, but all the examples I've seen deal with the devops configuration side of things rather than deploying the solution alongside our system as a "cluster" of pods that uses the PVs and facilitates storage/retrieval with automatic data age-off. The docs also seem to be extremely dense, so to save time I'd rather pick something that I know can handle our use case before diving in completely.

This is my first time having to create a storage "service" on Kubernetes, aside from a simple PV and PVC setup to persist some data, so I apologize for any ambiguity or lack of general knowledge on the matter.
Any feedback or recommendations would be greatly appreciated.
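
To make the "age-off/purge" requirement concrete, here is a minimal sketch of the behaviour being asked for, written against a plain directory such as a mounted PV. It is not how any of the storage systems mentioned above implement expiry; the function name `purge_older_than` and the use of file modification time as the age are assumptions made purely for illustration.

```python
import time
from pathlib import Path


def purge_older_than(root, max_age_seconds, now=None):
    """Delete regular files under `root` whose modification time is older
    than `max_age_seconds`, and return the sorted list of deleted paths.

    Hypothetical helper: "age" here is based on mtime, which is an
    assumption; a real object store would track its own object timestamps.
    """
    now = time.time() if now is None else now
    removed = []
    # Materialize the listing first so deletions do not disturb the walk.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(str(path))
    return sorted(removed)
```

For example, `purge_older_than("/path/bucket-a", 7 * 24 * 3600)` would remove files older than seven days, where `/path/bucket-a` is a made-up directory under the mount. Ideally the storage solution would do the equivalent per bucket, without us running a job like this ourselves.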

Edit: I guess the solution doesn't have to be deployed as part of our system, but the idea is that we'd like to control how the storage gets sliced up without having to go through the devops team (say a new hire joins, so now we split the storage 6 ways instead of 5, etc.). We also want control over managing our "buckets" and how long our data can exist in them before being aged-off.
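
As a rough illustration of the slicing arithmetic, using the 1TiB and 50GiB figures from the example above, the sketch below shows how per-developer quotas shift when the headcount changes. The helper names `slice_size_bytes` and `max_slices` are hypothetical, and an even split is assumed.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def slice_size_bytes(total_bytes, developers):
    """Evenly split `total_bytes` across `developers`, rounding down."""
    if developers < 1:
        raise ValueError("developers must be at least 1")
    return total_bytes // developers


def max_slices(total_bytes, slice_bytes):
    """How many fixed-size slices of `slice_bytes` fit into `total_bytes`."""
    if slice_bytes < 1:
        raise ValueError("slice_bytes must be at least 1")
    return total_bytes // slice_bytes


if __name__ == "__main__":
    # Whole 50 GiB slices that fit in 1 TiB.
    print(max_slices(TIB, 50 * GIB))
    # Per-developer share in whole GiB for 5 and then 6 developers.
    print(slice_size_bytes(TIB, 5) // GIB, slice_size_bytes(TIB, 6) // GIB)
```

With these numbers, 20 whole 50GiB slices fit in 1TiB, and an even split drops from about 204GiB each to about 170GiB each when a sixth developer joins. That is the kind of adjustment we would like to make ourselves.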

submitted by /u/evanlott

