You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by "pgj (via GitHub)" <gi...@apache.org> on 2023/05/01 23:22:13 UTC

[GitHub] [couchdb] pgj opened a new issue, #4560: Inconsistent execution statistics for Mango queries

pgj opened a new issue, #4560:
URL: https://github.com/apache/couchdb/issues/4560

   When requested, the Mango execution statistics do not always reflect their actual values but something that is less than or simply zero, most of the times.  This happens on the latest version of CouchDB (as the time of writing, which is `82aa1625`) and the issue can be reproduced as follows.  The commands below create a simple database, called `test` with 25 documents of a single field, for which an index is defined.
   
   ```shell
   curl -sS -X PUT "$COUCHDB_URL"/test
   for i in $(jot - 1 25); do \
     curl -sS -X POST "$COUCHDB_URL"/test -H "Content-Type: application/json" -d '{"a": '"$i"'}'; done
   curl -sS -X POST -H "Content-Type: application/json" "$COUCHDB_URL"/test/_index \
     -d '{"index": {"fields": ["a"]}, "name": "a", "type": "json"}'
   ```
   
   Then, using a selector like `{"a": {"$lt": 20}}` and `limit` of 1, and with execution statistics enabled, the `total_docs_examined` field becomes zero.  The other `_examined` fields are either not implemented or not affected by queries like that.  It is visible that `results_returned` is properly accounted at the same time.
   
   ```console
   $ curl -sS -X POST -H "Content-Type: application/json" "$COUCHDB_URL"/test/_find \
     -d '{"execution_stats": true, "limit": 1, "selector": {"a": {"$lt": 20}}}' \
     | jq '.execution_stats'
   {
     "total_keys_examined": 0,
     "total_docs_examined": 0,
     "total_quorum_docs_examined": 0,
     "results_returned": 1,
     "execution_time_ms": 1.067
   }
   ```
   
   Note that if either `limit` is increased or the search criteria on the `a` field is changed to find less documents, more realistic data is returned.
   
   ```console
   $ curl -sS -X POST -H "Content-Type: application/json" "$COUCHDB_URL"/test/_find \
     -d '{"execution_stats": true, "limit": 20, "selector": {"a": {"$lt": 20}}}' \
     | jq '.execution_stats'
   {
     "total_keys_examined": 0,
     "total_docs_examined": 20,
     "total_quorum_docs_examined": 0,
     "results_returned": 19,
     "execution_time_ms": 2.378
   }
   ```
   
   or
   
   ```console
   $ curl -sS -X POST -H "Content-Type: application/json" "$COUCHDB_URL"/test/_find \
     -d '{"execution_stats": true, "limit": 1, "selector": {"a": {"$lt": 2}}}' \
     | jq '.execution_stats'
   {
     "total_keys_examined": 0,
     "total_docs_examined": 2,
     "total_quorum_docs_examined": 0,
     "results_returned": 1,
     "execution_time_ms": 0.845
   }
   ```
   
   After some debugging, the source of the issue has been identified as the emission of `stop` in `mango_cursor_view:handle_doc/2` when the limit reaches zero (last clause). 
   
   https://github.com/apache/couchdb/blob/82aa1625ca8b652f7db3dadad741e86239641921/src/mango/src/mango_cursor_view.erl#L483-L497
   
   The `stop` action immediately stops the processing of messages from the shards, including the shard-level statistics that are submitted in response to the `complete` message.
   
   https://github.com/apache/couchdb/blob/82aa1625ca8b652f7db3dadad741e86239641921/src/mango/src/mango_cursor_view.erl#L385-L390
   
   That is why when the limit is too low, there is not enough time to receive and process the related message.  Therefore the execution statistics are not, or only partially (from a set of shards) handled.
   
   I could not yet come up with a satisfying solution but I create this ticket to raise awareness about this bug.  Changing `stop` to `ok` in `mango_cursor_view:handle_doc/2` helps with the consistency but then too many results are sent by the shards which will be known to be discarded already.  This potentially has an impact on the performance, although it has not been measured how much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Inconsistent execution statistics for Mango queries [couchdb]

Posted by "pgj (via GitHub)" <gi...@apache.org>.
pgj closed issue #4560: Inconsistent execution statistics for Mango queries
URL: https://github.com/apache/couchdb/issues/4560


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org