Posted to oak-issues@jackrabbit.apache.org by "Nitin Gupta (Jira)" <ji...@apache.org> on 2022/12/15 03:28:00 UTC

[jira] [Updated] (OAK-9754) Increase default dump threshold for multithreaded download

     [ https://issues.apache.org/jira/browse/OAK-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitin Gupta updated OAK-9754:
-----------------------------
    Fix Version/s: 1.48.0
                       (was: 1.46.0)

> Increase default dump threshold for multithreaded download
> ----------------------------------------------------------
>
>                 Key: OAK-9754
>                 URL: https://issues.apache.org/jira/browse/OAK-9754
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing
>            Reporter: Yu-An Lin
>            Priority: Major
>             Fix For: 1.48.0
>
>
> Looking at the detailed log output of an indexing job that uses Oak with the Multi-Threaded Download Strategy, lots of small files are being created because we have a low dump threshold of 1 MB per file. [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/FlatFileNodeStoreBuilder.java#L91] 
> We should increase the threshold, if possible, to 16 MB. That way we have 16 MB per thread, which with 8 threads is 128 MB in total. This would (hopefully) reduce the number of files from 22'972 to 1'435, which is more reasonable. Also, I don't think it would bring any risk of running out of memory.
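
The arithmetic in the description can be checked with a small sketch. It assumes each thread buffers at most one threshold's worth of data before dumping, and that the total amount of dumped data stays the same when the threshold changes. The class and method names are hypothetical and are not part of the Oak API.

```java
public class DumpThresholdEstimate {
    static final long MB = 1024L * 1024L;

    // Upper bound on buffered data, assuming every thread holds
    // at most one full threshold before it dumps to a file.
    static long peakBufferedBytes(long thresholdBytes, int threads) {
        return thresholdBytes * threads;
    }

    // Rough file count after changing the threshold, assuming the same
    // total data volume and files filled up to the threshold.
    // Integer division rounds down.
    static long estimatedFileCount(long currentFiles,
                                   long currentThresholdBytes,
                                   long newThresholdBytes) {
        return currentFiles * currentThresholdBytes / newThresholdBytes;
    }

    public static void main(String[] args) {
        // 16 MB threshold with 8 threads
        System.out.println(peakBufferedBytes(16 * MB, 8) / MB + " MB");
        // 22'972 files at 1 MB, re-estimated for a 16 MB threshold
        System.out.println(estimatedFileCount(22_972, 1 * MB, 16 * MB));
    }
}
```

Under these assumptions the sketch prints 128 MB and 1435, matching the figures quoted in the description.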



--
This message was sent by Atlassian Jira
(v8.20.10#820010)