Posted to notifications@couchdb.apache.org by gi...@git.apache.org on 2017/08/02 04:06:11 UTC

[GitHub] TojNALvViV opened a new issue #736: support (Zstandard?) dictionary compression

URL: https://github.com/apache/couchdb/issues/736
 
 
   I am using CouchDB to store a large number of relatively small documents, usually between 50 and 150 bytes each. Things work fine, but I noticed that even after running compaction with compression enabled ("snappy"), a backup of my data directory still compresses by a further 4:1.
   
   Looking at the shard files, it seems that CouchDB struggles to compress the documents because each of them is so small.
   
   Would it be possible to support some kind of dictionary compression, where "dictionary" documents could be added to a database? These could then be passed to a compressor which supports dictionary compression, like Zstandard. I imagine this could lead to significant storage savings, especially given that most users store documents with frequently recurring JSON property names, which are currently duplicated a (very) large number of times in storage.
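
   To illustrate the idea, here is a minimal sketch of dictionary compression on a single small document. It uses the preset-dictionary support in Python's standard zlib module as a stand-in for Zstandard, and it is not how CouchDB stores data today. The template document, its field names, and the helper functions are all hypothetical.

```python
import json
import zlib

# Hypothetical "dictionary" document: a template holding the property names
# that recur in every stored document (field names are made up for this sketch).
TEMPLATE = {"type": "reading", "sensor_id": "", "timestamp": "", "value": 0}
ZDICT = json.dumps(TEMPLATE, separators=(",", ":")).encode("utf-8")


def compress_plain(data: bytes) -> bytes:
    """Compress one document on its own, with no shared dictionary."""
    c = zlib.compressobj(level=9)
    return c.compress(data) + c.flush()


def compress_with_dict(data: bytes, zdict: bytes = ZDICT) -> bytes:
    """Compress one document against a preset dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(data) + c.flush()


def decompress_with_dict(blob: bytes, zdict: bytes = ZDICT) -> bytes:
    """Decompress; the same dictionary must be supplied again."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


# A small document of the size described above (under 150 bytes).
doc = json.dumps(
    {
        "type": "reading",
        "sensor_id": "s-17",
        "timestamp": "2017-08-02T04:06:11Z",
        "value": 21.5,
    },
    separators=(",", ":"),
).encode("utf-8")

plain = compress_plain(doc)
with_dict = compress_with_dict(doc)

# Round-trip check, then compare sizes in bytes.
assert decompress_with_dict(with_dict) == doc
print("raw:", len(doc), "plain:", len(plain), "with dictionary:", len(with_dict))
```

   The dictionary has to be available at read time as well, which is why storing it as a document inside the database seems like a natural fit.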
   
   I suppose that eventually it might be possible to automate this sort of compression, but allowing users to specify dictionary/template documents seems like a reasonably small, useful stepping stone along that path.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services