Three steps to Object heaven

Jerome Wendt, President and Founder of analyst firm DCIG, looks at three key features that will help enable Object storage capacity and performance at scale

Many organisations primarily view object storage systems as cost-effective solutions to host their archival and backup data. This mindset makes sense as it represents how many organisations introduced object storage solutions into their environment. However, the use cases for object storage continue to expand.

Organisations generate and capture expanding amounts of data and deploy new applications that process that data to generate new business value. To meet these new demands, object storage solutions must deliver both economical capacity and high performance.

PERFORMANCE: OBJECT'S NEW LOVE
Multiple data sources now push object storage solutions to function as more than archival and backup data stores. These sources include applications that generate log files; machine sensors that capture environmental and performance data; and video surveillance systems.

Organisations still want economical storage solutions on which to store these types of data. However, organisations also want to study and analyse this data, sometimes in real time, to support decisions and take actions.

Waits of multiple seconds or even minutes for object reads to complete represent the norm for many object solutions designed to host archival data. Expecting them to suddenly deliver read response times under a second is unrealistic. They cannot meet these demands because they were never designed to do so.

All object storage solutions face this growing challenge of delivering on the competing demands of economical capacity and sub-second performance at scale. In response, more of them have taken steps to deliver both. The ones best equipped to deliver on these new requirements offer the following three features:

  1. Stores object metadata on flash media
  2. Scales performance independently of capacity
  3. Stores large objects in chunks and processes them in parallel

Feature #1: Object storage metadata hosted on flash media
Organisations should first verify the solution offers the option to store object metadata on flash media (NVMe or SAS SSDs). Each object stored will have metadata associated with it. Due to the millions or billions of objects stored on a solution, object metadata databases will grow large.

These systems can and do host metadata in memory. However, the size of these metadata databases makes this technique impractical at scale. Storing all metadata on flash media accelerates access to the metadata and improves the possibility for sub-second read response times.
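The sizing argument above can be checked with some quick arithmetic. The sketch below is only illustrative: the 1 KiB average metadata record, the six-node cluster, the 128 GiB of RAM per node, and the 50% of RAM available for metadata are all hypothetical figures, not numbers from DCIG or any vendor.

```python
GIB = 1024 ** 3


def metadata_footprint_bytes(object_count: int, bytes_per_object: int) -> int:
    """Total metadata size, assuming a fixed average record size per object."""
    return object_count * bytes_per_object


def fits_in_memory(footprint_bytes: int, nodes: int, ram_per_node_bytes: int,
                   usable_fraction: float = 0.5) -> bool:
    """True if the metadata fits in the share of cluster RAM set aside for it."""
    return footprint_bytes <= nodes * ram_per_node_bytes * usable_fraction


# Hypothetical cluster: 6 nodes, 128 GiB RAM each, 1 KiB of metadata per object.
for count in (10**6, 10**9, 10**10):
    size = metadata_footprint_bytes(count, 1024)
    in_ram = fits_in_memory(size, nodes=6, ram_per_node_bytes=128 * GIB)
    print(f"{count:>14,} objects: {size / GIB:10.1f} GiB of metadata, fits in RAM: {in_ram}")
```

Under these assumptions a million objects fit in memory comfortably, while a billion objects need close to a terabyte of metadata storage, which is where flash becomes the practical home for it.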

This need for sub-second response times explains why Cloudian, Dell EMC, and others recently introduced flash media into their solutions. Others such as Scality have offered the option to store metadata on flash storage for some time.

Feature #2: Scales performance independently of capacity
Storing object metadata on flash media represents only the first part of the key to delivering performance at scale. Object storage solutions typically scale out capacity and performance simultaneously by introducing new server nodes into the cluster. Each server node may contain both flash media and HDDs with fixed amounts of both media types.

Unfortunately, the available performance in the storage solution cluster may not meet application or user expectations. Two ways exist to increase performance.

1. Add more nodes to the cluster. Each node adds more capacity and performance to the cluster. This may improve the situation, though organisations will buy unneeded capacity.

2. Select a solution that frees the organisation to scale performance independently of capacity. Using this architectural approach, an organisation may install new flash media in existing nodes. Alternatively, it may introduce performance-centric nodes that primarily contain flash media and few or no HDDs. This provides the targeted performance boost it needs without paying for unneeded capacity.
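The trade-off between the two options can be sketched with a small calculation. Everything numeric here is a made-up illustration: the use of reads per second as the performance measure, the per-node figures, and the function names are assumptions, not measurements of any product.

```python
import math


def nodes_needed(target_perf: int, current_perf: int, perf_per_node: int) -> int:
    """Nodes required to close a performance shortfall (0 if there is none)."""
    shortfall = max(0, target_perf - current_perf)
    return math.ceil(shortfall / perf_per_node)


def surplus_capacity_tb(nodes_added: int, capacity_per_node_tb: int,
                        capacity_needed_tb: int = 0) -> int:
    """Capacity bought beyond what the organisation actually needs."""
    return max(0, nodes_added * capacity_per_node_tb - capacity_needed_tb)


# Hypothetical cluster: delivers 30,000 reads/s, the application needs 50,000.
current, target = 30_000, 50_000

# Option 1: standard nodes (hypothetical: 5,000 reads/s and 200 TB each).
standard = nodes_needed(target, current, perf_per_node=5_000)
# Option 2: performance-centric nodes (hypothetical: 20,000 reads/s and 10 TB each).
flash = nodes_needed(target, current, perf_per_node=20_000)

print("standard nodes:", standard, "surplus TB:", surplus_capacity_tb(standard, 200))
print("flash nodes:   ", flash, "surplus TB:", surplus_capacity_tb(flash, 10))
```

With these invented figures, the first option closes the gap only by adding several nodes and hundreds of terabytes nobody asked for, while the second closes it with a single node and very little extra capacity.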


Feature #3: Stores large objects in chunks and processes them in parallel
While organisations may one day store their object data on flash media, that day has not yet arrived. In the meantime, organisations will continue to store their object data on HDDs. This may present a performance challenge, especially when storing and reading large objects from HDDs.

Individual objects may grow into the hundreds of GBs if not TBs in size. Using a single process to read object data from HDDs on cluster nodes will take significant time to complete. To improve response times, identify solutions that perform two tasks.

  • First, they should break large objects into smaller chunks before writing them to multiple nodes and disks.
  • Second, they should use multiple parallel processes to read back the object data.

Together, these techniques spread large objects across multiple nodes and disks, which enables the solution to both write and read objects back more quickly. This accelerates performance at scale.
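The two tasks can be illustrated with a minimal, in-memory sketch. It is not how any particular product works: the "disks" are plain dictionaries, placement is simple round-robin, and the names `write_object` and `read_object` are hypothetical. Real solutions add erasure coding or replication, which this sketch leaves out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Disk = Dict[str, bytes]          # stand-in for one HDD: chunk key -> chunk data
Layout = List[Tuple[int, str]]   # ordered (disk index, chunk key) pairs


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Break an object into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def write_object(data: bytes, chunk_size: int, disks: List[Disk]) -> Layout:
    """Place chunks round-robin across disks and return the ordered layout."""
    layout: Layout = []
    for n, chunk in enumerate(split_into_chunks(data, chunk_size)):
        disk_index = n % len(disks)
        key = f"chunk-{n}"
        disks[disk_index][key] = chunk
        layout.append((disk_index, key))
    return layout


def read_object(layout: Layout, disks: List[Disk], workers: int = 4) -> bytes:
    """Fetch all chunks with a pool of parallel workers, then reassemble."""
    def fetch(location: Tuple[int, str]) -> bytes:
        disk_index, key = location
        return disks[disk_index][key]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(fetch, layout))  # map() returns results in input order
    return b"".join(chunks)


disks: List[Disk] = [{} for _ in range(3)]
payload = bytes(range(256)) * 10
layout = write_object(payload, chunk_size=100, disks=disks)
print("chunks per disk:", [len(d) for d in disks])
print("round trip ok:", read_object(layout, disks) == payload)
```

In this toy version the reads are dictionary lookups, so the parallelism buys nothing. The point is the shape: when each fetch is a slow HDD or network read on a different node, the fetches overlap instead of queuing behind one another.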

NEWER OBJECT STORAGE SOLUTIONS BETTER ACCOUNT FOR FLASH MEDIA
Object storage solutions that deliver both economical capacity and high performance at scale do exist. However, DCIG knows of only a few solutions that leverage all three features mentioned here to deliver on these enterprise expectations.

Enterprises should be wary of any solutions that have existed for over 10 years. Many of these have added flash media to host metadata and improve performance. It certainly helps, but how well?

Unfortunately, it remains unclear to what extent taking this step alone helps at scale. The early evidence seems to suggest that the gains do not carry over very well as these systems grow.

Those organisations scaling into the petabytes will be better served by identifying and choosing object storage solutions with more modern designs. These newer solutions better account for flash media, scale performance and capacity independently, and parallelise I/O to deliver performance even as data stores scale to multiple petabytes.

More info: www.dcig.com