Posted to oak-issues@jackrabbit.apache.org by "Vikas Saurabh (JIRA)" <ji...@apache.org> on 2017/05/08 08:00:14 UTC

[jira] [Commented] (OAK-6180) Tune cursor batch/limit size

    [ https://issues.apache.org/jira/browse/OAK-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000404#comment-16000404 ] 

Vikas Saurabh commented on OAK-6180:
------------------------------------

bq. Bandwidth is wasted if the MongoDB Java driver fetches way more than requested by Oak.
I think such queries would scan the whole result set anyway (a few that I can recall: getChildren, revGc). Did you encounter queries where the intent was not to scan the whole set matched by the query constraints, but only to get the top few? (Afaiu, "few" here would still be more than 100, though.)
What I'm trying to say is that while we might want to observe the behavior of this tuning, imo such tuning has global effects. Maybe we should also investigate why data that would only be wasted on the client is traveling over the wire in the first place.
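To make concrete what "this tuning" would look like at the driver level, here is a rough sketch only; this is not Oak's actual MongoDocumentStore code, and the database/collection names and the query are made up. It just shows a per-query limit and batch size being set on a cursor with the legacy MongoDB Java driver:
{code:java}
// Illustrative only, not Oak's MongoDocumentStore code: setting a per-query
// limit and batch size with the (legacy) MongoDB Java driver API.
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class CursorTuningSketch {
    public static void main(String[] args) {
        MongoClient mongo = new MongoClient("localhost", 27017);
        try {
            DB db = mongo.getDB("oak");                      // database name is illustrative
            DBCollection nodes = db.getCollection("nodes");  // collection name is illustrative

            // range query on _id, as a stand-in for a typical Oak query
            DBObject query = new BasicDBObject("_id",
                    new BasicDBObject("$gt", "1:/content"));

            DBCursor cursor = nodes.find(query)
                    .limit(1600)      // never fetch more than the caller asked for
                    .batchSize(100);  // keep individual getMore round trips small
            try {
                while (cursor.hasNext()) {
                    DBObject doc = cursor.next();
                    // process document ...
                }
            } finally {
                cursor.close();
            }
        } finally {
            mongo.close();
        }
    }
}
{code}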

> Tune cursor batch/limit size
> ----------------------------
>
>                 Key: OAK-6180
>                 URL: https://issues.apache.org/jira/browse/OAK-6180
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: mongomk
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>             Fix For: 1.8
>
>
> MongoDocumentStore uses the default batch size, which means MongoDB will initially get 100 documents and then as many documents as fit into 4MB. Depending on the document size, the number of documents may be quite high and the risk of running into the 60-second query timeout defined by Oak increases.
> Tuning the batch size (or using a limit) may also be helpful in optimizing the amount of data transferred from MongoDB to Oak. The DocumentNodeStore fetches child nodes in batches as well. The logic there is slightly different. The initial batch size is 100 and every subsequent batch doubles in size until it reaches 1600. Bandwidth is wasted if the MongoDB Java driver fetches way more than requested by Oak.
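
For reference (outside the quoted issue text above), a rough sketch of the doubling-batch child-node fetch described in the description; the fetch stub, names and keys are illustrative, and this is not the actual DocumentNodeStore code:
{code:java}
import java.util.ArrayList;
import java.util.List;

// Sketch of the doubling-batch pattern described in the issue (initial batch of
// 100, doubling up to 1600). The fetch stub and names are illustrative only.
public class DoublingBatchSketch {

    // Stand-in for a call returning up to 'limit' child documents after 'fromKey'.
    static List<String> fetchChildren(String fromKey, int limit) {
        List<String> result = new ArrayList<>();
        // ... would query the DocumentStore here ...
        return result;
    }

    public static void main(String[] args) {
        int batch = 100;            // initial batch size
        final int maxBatch = 1600;  // cap mentioned in the description
        String fromKey = "";        // paging key (illustrative)

        while (true) {
            List<String> docs = fetchChildren(fromKey, batch);
            // process docs ...
            if (docs.size() < batch) {
                break;              // short batch: no more children
            }
            fromKey = docs.get(docs.size() - 1);
            batch = Math.min(batch * 2, maxBatch);  // double until the cap
        }
    }
}
{code}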


