RAID++ and Storage Pools: Leveraging GPT partitions for Asymmetric Media Logical Volumes. Pt 1.

This is an exploration of addressing Storage problems posed by directly connected or (Ethernet) networked drives, not of SAN-connected managed disks served by Enterprise Storage Arrays.

The Problem

One of the most important features of the Veritas Logical Volume Manager (LVM), circa 1995, was the ~1MB disk label that contained a full copy of the drive/volume's LVM information and allowed drives to be renamed by the system or physically shuffled, intentionally or not.

Today we have a standard, courtesy of UEFI, for GUID Partition Tables (GPT) on Storage Devices, supported by all major Operating Systems. Can this provide similar, or additional, capability?
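To make that concrete: the GPT label is small and fully specified — a 92-byte header at LBA 1, CRC-protected, carrying a disk GUID and the location of the partition entry array. A minimal parsing sketch in Python (field layout per the UEFI spec; the function name is my own):

```python
import struct
import uuid
import zlib

GPT_SIGNATURE = b"EFI PART"
# 92-byte little-endian header at LBA 1, per the UEFI spec.
HEADER_FMT = "<8s4sII4xQQQQ16sQIII"

def parse_gpt_header(lba1: bytes) -> dict:
    """Parse and CRC-verify the GPT header found at LBA 1 of a labelled disk."""
    (sig, rev, hdr_size, hdr_crc, cur_lba, bak_lba,
     first_lba, last_lba, disk_guid, entries_lba,
     n_entries, entry_size, entries_crc) = struct.unpack_from(HEADER_FMT, lba1)
    if sig != GPT_SIGNATURE:
        raise ValueError("not a GPT header")
    # The header CRC32 is computed with the header's own CRC field zeroed.
    zeroed = lba1[:16] + b"\x00\x00\x00\x00" + lba1[20:hdr_size]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != hdr_crc:
        raise ValueError("GPT header CRC mismatch")
    return {
        "disk_guid": str(uuid.UUID(bytes_le=disk_guid)),
        "first_usable_lba": first_lba,
        "last_usable_lba": last_lba,
        "partition_entries_lba": entries_lba,
        "num_entries": n_entries,
    }
```

Because the disk GUID and backup header travel with the media, a drive can be identified after any amount of physical shuffling — the same property the Veritas label provided.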


RAID++ and Storage Pools: We may be asking the wrong questions

The implied contract between Storage Devices, once exclusively HDDs, and systems is a rather weak one.
Storage Devices return blocks of data on a "Best Efforts" basis; failure and error handling is minimal or non-existent.
There's no implicit contract covering the many other components now needed to move data off the Storage Device and into Memory: HBAs, cables, adaptors, switches, etc. The move to Ethernet and larger networks compounds the problem: networks are far from error-free. This matters when routinely moving Exabytes and more: errors and failures are guaranteed over any human-scale observation period.
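A back-of-envelope calculation shows why errors are guaranteed at this scale. The unrecoverable-read-error (URE) rates below are typical vendor quotes, used here only as an assumption:

```python
# Vendor-quoted URE rates (assumption): roughly 1 error per 1e14 bits
# for desktop-class drives, 1 per 1e15 for enterprise-class.
def expected_errors(bytes_read: float, ure_per_bit: float) -> float:
    """Expected unrecoverable read errors for a given volume of reads."""
    return bytes_read * 8 * ure_per_bit

EXABYTE = 1e18  # bytes
# Even with enterprise-class drives, reading one Exabyte is far from error-free:
enterprise = expected_errors(EXABYTE, 1e-15)   # ~8,000 expected errors
desktop = expected_errors(EXABYTE, 1e-14)      # ~80,000 expected errors
```

At Exabyte scale the question is never whether errors occur, only how they are detected and repaired.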

Turning this weak assurance into usable levels of Reliability and Data Durability is currently left to a rather complex set of layers, which can have subtle, undetectable failure modes or, in "Recovery" mode, unusably poor performance and limited or no resilience against additional failures. We need to improve our models to move past current RAID schemes and routinely support thousands of small drives and new Storage Class Memory.
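The fragility of "Recovery" mode is easy to see in the simplest of those layers, RAID-5-style single parity — sketched here as plain XOR over a stripe (a minimal illustration, not any particular implementation):

```python
def xor_parity(blocks: list) -> bytes:
    """RAID-5 style parity: byte-wise XOR of the data blocks in a stripe."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def reconstruct(survivors: list, parity: bytes) -> bytes:
    """Rebuild the single lost block from the surviving blocks plus parity.
    During this rebuild window, a second failure is unrecoverable — the
    'limited or no resilience against additional failures' problem."""
    return xor_parity(survivors + [parity])
```

Reconstruction requires reading every surviving block in the stripe, which is exactly why rebuild performance collapses and why the exposure window grows with drive size.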

Scaling Storage to Petabyte- and Exabyte-sized Pools of mixed technologies needs some new thinking.
Mixed technologies now provide multiple Price-Size-Performance components, requiring very careful analysis to optimise Systems against owner criteria.

There is no one true balance between DRAM, PCI-Flash, SSDs, fast HDD, slow HDD and near-line/off-line HDD, tape or Optical Disk. What there is, is an owner's willingness to pay. Presumably they prefer to pay enough, but not significantly more, for their desired or required "performance", either as "response time" (latency) or throughput. Very few clients can afford, or need, to store everything in DRAM with some sort of backup system. It's the highest-performance and highest-priced solution possible, but is only necessary or desirable for very constrained problems.

DRAM is around $10/GB, Flash and SSD about $1/GB, and HDDs from $0.04 to $0.30/GB for raw disk.
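Those rough figures are enough to show the scale of the trade-off. A sketch of the raw-media cost of a mixed 1 PB pool; the tier split is an illustrative assumption, not a recommendation:

```python
# Rough raw-media $/GB from the figures above.
PRICE_PER_GB = {"dram": 10.00, "flash_ssd": 1.00,
                "fast_hdd": 0.30, "slow_hdd": 0.04}

def pool_cost(tier_gb: dict) -> float:
    """Raw media cost of a mixed pool; ignores enclosures, power, redundancy."""
    return sum(PRICE_PER_GB[tier] * gb for tier, gb in tier_gb.items())

PB = 1_000_000  # GB
# Hypothetical 1 PB pool: a thin DRAM cache, 5% flash, the rest on HDD.
mix = {"dram": 0.001 * PB, "flash_ssd": 0.05 * PB,
       "fast_hdd": 0.15 * PB, "slow_hdd": 0.799 * PB}
# pool_cost(mix) is ~$137k, versus ~$10M for the same Petabyte held entirely
# in DRAM — roughly a 70x difference in raw media cost alone.
```

This is why the owner's willingness to pay, not technology, sets the balance between tiers.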

Here's a possible new contract between Storage Devices and Clients/Systems:
Data is returned Correct, Complete and Verifiable, in whole or part, between the two Endpoints.
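One way to honour that contract is end-to-end verification: a digest is recorded per block at write time, and the client checks every block on read, independently of any in-path component (HBA, cable, switch or drive). A minimal sketch — block size, function names and the choice of SHA-256 are my own assumptions:

```python
import hashlib

BLOCK = 4096  # bytes; an illustrative block size

def store(data: bytes):
    """Split data into blocks and record a digest per block at write time."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    digests = [hashlib.sha256(b).digest() for b in blocks]
    return blocks, digests

def verified_read(blocks, digests) -> bytes:
    """Client-side check: every block must match its write-time digest,
    so data arrives Correct, Complete and Verifiable — or not at all."""
    out = bytearray()
    for i, (block, digest) in enumerate(zip(blocks, digests)):
        if hashlib.sha256(block).digest() != digest:
            raise IOError(f"block {i}: corrupt or incomplete")
        out += block
    return bytes(out)
```

Because verification happens at the Endpoint, no component between the two Endpoints needs to be trusted — silent corruption anywhere in the path is detected at read time.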