Pression
1.1.1
Compressor, decompressor, uploader and downloader plugins
This document specifies the transformation of the output of a data compressor into a smaller set of larger output slices. The primary use case is as a backend of the memcached keyv::Map, which has a maximum value size of one megabyte.
The new compression plugin API: C++ Plugin API for CPU compressors
For an input of:
The slicer produces:
For an input of n output slices (see above), the slicer produces the uncompressed data.
namespace pression
{
namespace data
{
class Slicer
{
public:
    struct Result
    {
        uint8_t* data;
        uint32_t size;
    };
    typedef std::vector< Result > Results; //!< Set of result slices
    typedef std::vector< uint32_t > ResultSizes; //!< Remaining slice sizes

    Slicer( const CompressorInfo& compressor );

    // returned pointers are valid until next compress(), delete of
    // input data, or dtor of Slicer called
    Results&& compress( const uint8_t* data, size_t size, uint32_t sliceSize );

    // input: first slice, output: remaining slice sizes
    ResultSizes&& getRemainingSizes( const uint8_t* data, uint32_t size );

    // input: first slice, output: total decompressed data size
    size_t getDecompressedSize( const uint8_t* data, uint32_t size );

    // input: all slices, output: decompressed data
    void decompress( const Results& input, uint8_t* data );
};
}
}
compress() allocates a compressor and compresses the input data. The input is treated as incompressible if the compressed size reported by pression::getDataSize() exceeds the input size minus the header overhead.
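The incompressibility rule can be sketched as a single comparison. This is only an illustration: isIncompressible() is a hypothetical helper, compressedSize stands in for the value reported by pression::getDataSize(), and headerSize is an assumed header overhead in bytes.

```cpp
#include <cstddef>

// Hypothetical helper, not part of the Pression API.
// compressedSize: stand-in for the value reported by pression::getDataSize()
// headerSize:     assumed header overhead in bytes
// Returns true if storing the compressed form would not save space.
bool isIncompressible( const size_t compressedSize, const size_t inputSize,
                       const size_t headerSize )
{
    // Same test as "compressedSize > inputSize - headerSize", rearranged
    // so that the unsigned subtraction cannot underflow.
    return compressedSize + headerSize > inputSize;
}
```

With an assumed header of 100 bytes, a 1000 byte input that compresses to 950 bytes counts as incompressible, while one that compresses to 500 bytes does not.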
Incompressible output is returned as:
Compressible output is returned as:
The first implementation throws if the header size exceeds sliceSize for compressed output, or if a chunk is bigger than a slice.
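The two error conditions above can be illustrated with a minimal packing sketch. It is not the Pression implementation: packChunks() is a hypothetical helper, it works on chunk sizes only, and it assumes that the header occupies the start of the first slice and that chunks are never split across slices.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the Pression API: groups compressor
// output chunks into slices of at most sliceSize bytes.
// headerSize is the assumed number of bytes reserved in the first slice.
// Returns the number of payload bytes carried by each slice.
std::vector< uint32_t > packChunks( const std::vector< uint32_t >& chunkSizes,
                                    const uint32_t headerSize,
                                    const uint32_t sliceSize )
{
    if( headerSize > sliceSize )
        throw std::runtime_error( "header does not fit into one slice" );

    std::vector< uint32_t > slices( 1, 0 );
    uint32_t used = headerSize; // bytes occupied in the current slice
    for( const uint32_t chunk : chunkSizes )
    {
        if( chunk > sliceSize )
            throw std::runtime_error( "chunk is bigger than a slice" );
        if( chunk > sliceSize - used ) // chunk does not fit: open a new slice
        {
            slices.push_back( 0 );
            used = 0;
        }
        slices.back() += chunk;
        used += chunk;
    }
    return slices;
}
```

For three chunks of 400 bytes, an assumed header of 100 bytes and a slice size of 1000 bytes, the first slice carries two chunks and the second slice carries the third.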
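void Keyv::memcached::Plugin::insert( const std::string& key, const void* ptr,
                                      const size_t size )
{
    const auto data = _slicer.compress( static_cast< const uint8_t* >( ptr ),
                                        size, LB_1MB );
    // the first slice is stored under hash(key), slice i under hash(key) + i
    servus::uint128_t hash = servus::make_uint128( key );
    for( const auto& slice : data )
    {
        const std::string sliceKey = hash.getString();
        memcached_set( _instance, sliceKey.c_str(), sliceKey.length(),
                       reinterpret_cast< const char* >( slice.data ),
                       slice.size, (time_t)0, (uint32_t)0 );
        ++hash;
    }
}

std::string Keyv::memcached::Plugin::operator [] ( const std::string& key )
{
    const servus::uint128_t hash = servus::make_uint128( key );
    const std::string firstKey = hash.getString();

    // fetch the first slice, which contains the header
    size_t size = 0;
    uint32_t flags = 0;
    memcached_return_t error = MEMCACHED_SUCCESS;
    pression::data::Slicer::Results slices( 1 );
    slices[0].data = reinterpret_cast< uint8_t* >(
        memcached_get( _instance, firstKey.c_str(), firstKey.length(),
                       &size, &flags, &error ));
    slices[0].size = uint32_t( size );

    // takeValues() is assumed to return the remaining slices as Results
    const auto remaining = _slicer.getRemainingSizes( slices[0].data,
                                                      slices[0].size );
    const auto rest = takeValues( hash, remaining );
    slices.insert( slices.end(), rest.begin(), rest.end( ));

    std::string value( _slicer.getDecompressedSize( slices[0].data,
                                                    slices[0].size ), '\0' );
    _slicer.decompress( slices, reinterpret_cast< uint8_t* >( &value[0] ));
    return value;
}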
Resolution: 4GB, the largest slice size representable by the uint32_t size fields of the API.
It is unlikely that a storage system uses larger slices. Memcached, for example, limits values to one megabyte by default.
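The arithmetic behind this limit can be sketched as follows. Both names are hypothetical and not part of the Pression API, and the slice count ignores any header overhead.

```cpp
#include <cstdint>
#include <limits>

// Largest slice size representable by a uint32_t size field (4 GB minus 1 byte).
constexpr uint64_t maxSliceSize = std::numeric_limits< uint32_t >::max();

// Hypothetical helper: number of slices needed to hold totalSize bytes,
// ignoring header overhead. sliceSize must be non-zero.
constexpr uint64_t sliceCount( const uint64_t totalSize, const uint32_t sliceSize )
{
    return ( totalSize + sliceSize - 1 ) / sliceSize; // rounds up
}
```

With one megabyte slices, ten megabytes of compressed data need ten slices, and a single byte more needs an eleventh.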