Posted to notifications@couchdb.apache.org by "Nick Vatamaniuc (JIRA)" <ji...@apache.org> on 2017/02/03 22:58:51 UTC

[jira] [Created] (COUCHDB-3291) Excessively long document IDs prevent replicator from making progress

Nick Vatamaniuc created COUCHDB-3291:
----------------------------------------

             Summary: Excessively long document IDs prevent replicator from making progress
                 Key: COUCHDB-3291
                 URL: https://issues.apache.org/jira/browse/COUCHDB-3291
             Project: CouchDB
          Issue Type: Bug
            Reporter: Nick Vatamaniuc


Currently there is no protection in CouchDB against creating document IDs which are too long, so overly long IDs end up hitting various implicit limits, which usually results in unpredictable failure modes.

One such implicit limit is hit in the replicator code. The replicator usually handles document IDs in bulk-like calls: it gets them via the changes feed, computes revs_diff with a POST, and inserts them with bulk_docs. The one exception is fetching open_revs, where it issues a single GET request per document. That request fails because of a bug / limitation in the HTTP parser: the first GET line of the HTTP request has to fit in the receive buffer of the receiving socket.
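For illustration, a hedged repro sketch from the Erlang shell; the database name, port and ID length below are made up for the sketch and are not taken from the issue:

{code}
%% Fetch a single document the way the replicator does for open_revs, but
%% with a multi-kilobyte ID so the whole GET request line overflows the
%% socket's receive buffer.
application:ensure_all_started(inets).
LongId = lists:duplicate(16384, $x).
Url = "http://127.0.0.1:5984/db1/" ++ LongId ++ "?open_revs=all".
%% With the default recbuf the request line is never parsed, so this call
%% hangs and times out instead of getting a clean error from the server.
httpc:request(get, {Url, []}, [{timeout, 5000}], []).
{code}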

Increasing that buffer allows larger HTTP request lines to pass through. In the configuration it can be set as
{code}
 chttpd.server_options="[...,{recbuf, 32768},...]"
{code}
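
For reference, the same option as it would appear in local.ini (keeping whatever other entries are already in the list):

{code}
[chttpd]
server_options = [{recbuf, 32768}]
{code}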

Steve Vinoski mentions a possible bug in the HTTP packet parser code as well:

http://erlang.org/pipermail/erlang-questions/2011-June/059567.html

Tracing this a bit, I see that a proper mochiweb request is never even created and the request just hangs, which confirms it further. It seems that in the code here:

https://github.com/apache/couchdb-mochiweb/blob/bd6ae7cbb371666a1f68115056f7b30d13765782/src/mochiweb_http.erl#L90

the timeout clause is hit. After adding a catch-all clause, I get the {tcp_error,#Port<0.40682>,emsgsize} message, which we don't handle. That seems like a sane place to return a 413 or similar.
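
As an untested sketch, an extra clause in that receive loop (mochiweb_http:request/2) could look roughly like this, using mochiweb_socket's send/2 and close/1; the exact response body and status handling are my assumption, not an agreed fix:

{code}
%% The emulator delivers {tcp_error, Port, emsgsize} when the request line
%% overflows recbuf; no clause matches it today, so the loop just sits
%% until the timeout fires.
{tcp_error, _, emsgsize} ->
    %% Fail loudly instead of hanging: send a 413 and close the socket.
    mochiweb_socket:send(Socket,
        <<"HTTP/1.1 413 Request Entity Too Large\r\n",
          "Content-Length: 0\r\nConnection: close\r\n\r\n">>),
    mochiweb_socket:close(Socket),
    exit(normal);
{code}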

There are probably multiple ways to address the issue:

 * Increase the mochiweb listener buffer to fit larger doc IDs. However, that is a separate bug, and using the buffer size to control document ID size during replication is not reliable. Moreover, it would allow larger IDs to propagate through the system during replication, and then every future replication source would have to be configured with the same maximum recbuf value.

 * Introduce a validation step in {code} couch_doc:validate_docid {code}. Currently that code doesn't read from config files and is in the hot path, so adding a config read there might reduce performance. If such a check were enabled it would stop new documents with overly long IDs from being created, but we'd have to decide how to handle already existing IDs which are longer than the limit. (A rough sketch of such a check follows this list.)

 * Introduce a validation/bypass step in the replicator. Specifically targeting the replicator might help prevent propagation of large IDs during replication. There is already a similar case of skipping writes of large attachments or large documents (which exceed the request size limit) and bumping {code} doc_write_failures {code}.
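
As a rough illustration of the {code} couch_doc:validate_docid {code} option above, a minimal, hedged sketch of a length guard; the "couchdb" / "max_document_id_length" config key and the 512-byte default are made up for this sketch, and the config:get/3 read is exactly the hot-path cost mentioned above:

{code}
%% Hedged sketch only, not an actual patch: a guard that could run at the
%% top of couch_doc:validate_docid/1.
validate_docid_length(Id) when is_binary(Id) ->
    MaxLen = list_to_integer(
        config:get("couchdb", "max_document_id_length", "512")),
    case byte_size(Id) =< MaxLen of
        true  -> ok;
        false -> throw({bad_request, <<"Document id is too long">>})
    end.
{code}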



