Posted to dev@couchdb.apache.org by "Alexey Loshkarev (JIRA)" <ji...@apache.org> on 2014/09/15 14:05:33 UTC

[jira] [Created] (COUCHDB-2329) Log broken file name on compress/decompress error

Alexey Loshkarev created COUCHDB-2329:
-----------------------------------------

             Summary: Log broken file name on compress/decompress error
                 Key: COUCHDB-2329
                 URL: https://issues.apache.org/jira/browse/COUCHDB-2329
             Project: CouchDB
          Issue Type: Improvement
      Security Level: public (Regular issues)
          Components: Database Core, Logging
            Reporter: Alexey Loshkarev


Hello.

I'm using CouchDB for a fairly large data set: over 50 databases holding more than 500 million documents in total, with about 2 TB on disk. I run it on a 4-node cluster.

As this is real life, hardware errors happen from time to time. Most of them don't affect CouchDB, but some do: CouchDB writes wrong data to disk, or reads garbage back because of disk read errors.

The bad thing is that CouchDB dies the moment it can't decompress data.

The worst thing is that CouchDB doesn't log the name of the broken file, which would help me with this problem. If CouchDB showed me the broken file name, I would delete that file and recreate it via replication from a healthy node.

The ugly thing is that I have to drop the whole node and re-replicate it. But in my situation, 2 TB takes over a month to replicate! So the usual state of my cluster is 3 nodes up and the fourth replicating terabytes of data.

So my proposal is to log the file name when CouchDB fails to decompress data.

Sample message:
[Mon, 15 Sep 2014 11:51:17 GMT] [error] [emulator] Error in process <0.24789.1> with exit value: {function_clause,[{couch_compress,decompress,[<<1952804468 bytes>>],[{file,"couch_compress.erl"},{line,67}]},{couch_file,pread_term,2,[{file,"couch_file.erl"},{line,135}]},{couch_btree,get_node,2,[{file,"couch_btree.erl"},{line,349}]},{couch_btree,modify_node... 
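
To illustrate the kind of behaviour I'm asking for, here is a rough sketch of the pattern. It is written in Python with zlib, not in CouchDB's Erlang, and it is not CouchDB code: the names read_compressed and CorruptFileError, and the log format, are made up for this example. The only point is that the read path knows which file it is reading, so it can attach that name when decompression fails.

```python
import logging
import zlib

log = logging.getLogger("decompress_sketch")


class CorruptFileError(Exception):
    """Decompression failure annotated with the file it came from (hypothetical)."""

    def __init__(self, path, offset, cause):
        super().__init__(
            f"cannot decompress data in {path} at offset {offset}: {cause}"
        )
        self.path = path
        self.offset = offset


def read_compressed(path, offset, length):
    """Read `length` bytes at `offset` from `path` and decompress them.

    On failure, log the file name and offset before re-raising, so the
    operator can tell which file is damaged.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(length)
    try:
        return zlib.decompress(raw)
    except zlib.error as exc:
        # The caller of decompress knows the file; the decompressor does not.
        log.error("decompress failed: file=%s offset=%d: %s", path, offset, exc)
        raise CorruptFileError(path, offset, exc) from exc
```

With something like this in place, the error message would name the single damaged file, and only that database would need to be re-replicated instead of the whole node.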





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)