Posted to dev@couchdb.apache.org by Jan Lehnardt <ja...@apache.org> on 2020/07/12 13:37:47 UTC

Fastest way to get a doc _rev

Hey all,

based on a question in our new GitHub Discussion board, I got interested in which is faster: retrieving a doc's _rev with a HEAD request or with an _all_docs?key=docid request. The results might be interesting for folks:

    https://github.com/apache/couchdb/discussions/2996#discussioncomment-36190
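
For readers who want to try both approaches, here is a minimal sketch of pulling the revision out of each response. The helper names rev_from_head and rev_from_all_docs, and the database and document names in the usage comments, are made up for illustration. The _all_docs parser assumes the compact JSON that CouchDB emits; a real script would probably use a proper JSON parser instead.

```shell
# Hypothetical helpers: they only parse text read from stdin and make
# no request themselves.

# Extract the revision from the headers of a HEAD /db/docid response.
# CouchDB sends the current revision as a double-quoted ETag value.
rev_from_head() {
  tr -d '\r' | sed -n 's/^[Ee][Tt][Aa][Gg]: *"\(.*\)"$/\1/p'
}

# Extract the revision from the body of /db/_all_docs?key="docid".
# The matching row carries it as "value":{"rev":"..."}; nothing is
# printed when no row matches.
rev_from_all_docs() {
  sed -n 's/.*"rev":"\([^"]*\)".*/\1/p'
}

# Usage (assumed local server, database "db", document "docid"):
#   curl -sI 'http://127.0.0.1:5984/db/docid' | rev_from_head
#   curl -s 'http://127.0.0.1:5984/db/_all_docs?key="docid"' | rev_from_all_docs
```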

Best
Jan
—

Re: Fastest way to get a doc _rev

Posted by Jan Lehnardt <ja...@apache.org>.
Heya Garren,

great point, I’ve adjusted things to measure from the last line sent in the
request to the end of the response[1], and updated the results (not much of a
difference overall): the numbers go up a little, but the doc size seems to make
little difference.

Note that this is a rather speedy SSD system here (recent Mac mini) with fast CPUs.

I’m running the tests many times to avoid caching artefacts, but that means all
access should be cached. So you’d likely see bigger differences on slower disks
and for uncached requests.

[1]: behold the bashism:

    RESP=`curl -sv --trace-time http://admin:admin@127.0.0.1:5984/benchbulk-1/_all_docs?key=\"00000000000000500000\" 2>&1`
    SENT_TS=$(echo "$RESP" | grep -Eo \(.\{15\}\)\ \>\ Accept | sed -e 's/ > Accept//' | sed -e 's/[:.]//g')
    END_TS=$(echo "$RESP" | grep -Eo \(.\{17\}\)\ ?\ Connection | sed -e 's/\ \*\ Connection//' | sed -e 's/[.:]//g')
    echo $(($END_TS-$SENT_TS))
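
One caveat with the one-liner in [1]: it subtracts the timestamps as plain digit strings, so the result is off whenever a request crosses a minute boundary, and bash reads a value with a leading zero (any time before 10:00) as octal. A minimal sketch of a more robust subtraction, assuming the HH:MM:SS.uuuuuu format that curl --trace-time prints and that both timestamps fall on the same day; ts_to_us and elapsed_us are hypothetical helper names:

```shell
# Convert an HH:MM:SS.uuuuuu timestamp into microseconds since midnight.
# awk splits on ':' and '.' and reads fields such as "08" as decimal 8,
# which sidesteps the shell's octal parsing of leading zeros.
# Assumes the fractional part has exactly six digits.
ts_to_us() {
  echo "$1" | awk -F'[:.]' \
    '{ printf "%.0f\n", (($1 * 60 + $2) * 60 + $3) * 1000000 + $4 }'
}

# Print the microseconds elapsed between two timestamps (start, end).
elapsed_us() {
  start=$(ts_to_us "$1")
  end=$(ts_to_us "$2")
  echo $((end - start))
}

# Usage with the variables from [1], before the ':' and '.' are stripped:
#   elapsed_us "$SENT_TS" "$END_TS"
```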

> On 12. Jul 2020, at 16:58, Garren Smith <ga...@apache.org> wrote:
> 
> Hi Jan,
> 
> That is a really interesting experiment. I was trying to benchmark
> _all_docs recently and I've noticed that it streams the results, so
> it returns the header and the start of the body before it's done any actual
> work. I'm not 100% sure if that is the case when `key=` is used. You might
> have to adjust your benchmark to check for the first `{` to signify the
> start of a document.
> 
> Cheers
> Garren
> 
> On Sun, Jul 12, 2020 at 3:37 PM Jan Lehnardt <ja...@apache.org> wrote:
> 
>> Hey all,
>> 
>> based on a question in our new GitHub Discussion board, I got interested
>> in which is faster: retrieving a doc's _rev with a HEAD request or with an
>> _all_docs?key=docid request. The results might be interesting for folks:
>> 
>> 
>> https://github.com/apache/couchdb/discussions/2996#discussioncomment-36190
>> 
>> Best
>> Jan
>> —


Re: Fastest way to get a doc _rev

Posted by Garren Smith <ga...@apache.org>.
Hi Jan,

That is a really interesting experiment. I was trying to benchmark
_all_docs recently and I've noticed that it streams the results, so
it returns the header and the start of the body before it's done any actual
work. I'm not 100% sure if that is the case when `key=` is used. You might
have to adjust your benchmark to check for the first `{` to signify the
start of a document.

Cheers
Garren

On Sun, Jul 12, 2020 at 3:37 PM Jan Lehnardt <ja...@apache.org> wrote:

> Hey all,
>
> based on a question in our new GitHub Discussion board, I got interested
> in which is faster: retrieving a doc's _rev with a HEAD request or with an
> _all_docs?key=docid request. The results might be interesting for folks:
>
>
> https://github.com/apache/couchdb/discussions/2996#discussioncomment-36190
>
> Best
> Jan
> —
