Industry Leading Compression translates to Huge Cost Savings
No other feature in a database directly translates to cost savings like the Data Compression feature. Compression can save big money on storage costs, increase data center density, or allow more data to be kept, while simultaneously increasing query performance in cases where I/O is the bottleneck.
Central to RainStor’s unique product capabilities is the ability to compress and de-duplicate large data sets, achieving compression ratios that are typically 40:1 and rise to 100:1 with some data, through the use of four distinct but complementary techniques. With RainStor’s data reduction capabilities, you can significantly reduce overall storage costs, which in turn lowers total cost of ownership:
- Field level de-duplication: the source data is processed on a column-by-column basis, reducing the dataset to a list of the unique values each column holds, together with a frequency count of the number of times each value appears. For typical data, the storage space required after field level de-duplication is a fraction of that of the original data.
- Pattern level de-duplication: to store compressed data in a lossless state, a binary tree is built with pointers that can be used to reconstitute the data in its original form. Pattern level de-duplication builds on field level de-duplication by storing only the unique branches of this tree, again with a frequency count, using exactly the same technique applied at the field level to identify the unique combinations.
- Algorithmic compression: the field and pattern techniques save memory as well as disk space. RainStor also employs algorithmic compression, a series of techniques designed primarily to reduce the amount of disk required for storage.
- Byte level compression: finally, components of the tree are aggressively compressed independently using industry-standard byte compression algorithms tuned to obtain the maximum on-disk compression of the data.
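As a simple illustration of the first technique (a sketch only, not RainStor’s actual implementation; the column data is invented for the example), field level de-duplication amounts to keeping each unique value of a column once, alongside a frequency count:

```python
from collections import Counter

# Hypothetical source column with many repeated values.
city_column = ["London", "Paris", "London", "Berlin", "London",
               "Paris", "London", "Berlin", "London", "Paris"]

# Field level de-duplication: store each unique value once,
# together with a count of how often it appears.
deduplicated = Counter(city_column)

# 10 original entries collapse to 3 stored unique values.
print(len(city_column), "entries ->", len(deduplicated), "unique values")
print(dict(deduplicated))
```

The more a value repeats, the greater the reduction, which is why this technique pays off most on large, repetitive data sets.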
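The final, byte level stage can be sketched with a standard library compressor (zlib here stands in for whichever industry-standard algorithm is used; the serialized tree component is invented for the example):

```python
import zlib

# Hypothetical serialized tree component; repetitive data compresses well.
component = b"London|Paris|London|Berlin|" * 100

# Byte level compression of the component, tuned for maximum compression.
compressed = zlib.compress(component, level=9)

# The round trip is lossless: decompression yields the original bytes.
assert zlib.decompress(compressed) == component
print(len(component), "bytes ->", len(compressed), "bytes on disk")
```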
These de-duplication techniques do not result in any loss of detail; data is not summarized or aggregated in RainStor. Instead RainStor stores each record as a series of pointers to the location of a single instance of a data value, or pattern of data values.
RainStor uses a tree-based structure to store data that links the various instances of the patterns together to establish data records. This means that the original records can be reconstituted at any time. This de-duplication process also means that the bigger the data set, the higher the probability that values and patterns will be repeated, and the greater the level of compression that can be achieved as data is loaded.
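The pointer-based storage and reconstitution described above can be sketched as follows (a simplified flat model of the idea, not RainStor’s tree structure; the two-column data set is invented for the example). Each record is reduced to pointers into per-column dictionaries of unique values, repeated pointer patterns are counted once, and the originals are rebuilt without loss:

```python
from collections import Counter

# Hypothetical two-column data set with repeated records.
records = [("London", "UK"), ("Paris", "FR"), ("London", "UK"),
           ("Berlin", "DE"), ("London", "UK")]

# Per-column dictionaries of unique values (field level de-duplication).
columns = list(zip(*records))
dicts = [sorted(set(col)) for col in columns]
index = [{v: i for i, v in enumerate(d)} for d in dicts]

# Each record becomes a tuple of pointers into those dictionaries.
pointer_rows = [tuple(index[c][v] for c, v in enumerate(row))
                for row in records]

# Pattern level step: unique pointer patterns with frequency counts.
patterns = Counter(pointer_rows)

# Reconstitution: follow the pointers to rebuild the original records.
rebuilt = []
for ptrs, count in patterns.items():
    rebuilt.extend([tuple(dicts[c][i] for c, i in enumerate(ptrs))] * count)

# No detail is lost: the rebuilt records match the originals.
assert sorted(rebuilt) == sorted(records)
```

Note that the five input records are stored as only three unique patterns, and that adding more repeats of the same records would grow the counts, not the stored patterns, which mirrors why larger data sets compress better.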