Posted to issues@lucene.apache.org by "Andrzej Bialecki (Jira)" <ji...@apache.org> on 2020/07/03 12:00:00 UTC

[jira] [Resolved] (SOLR-14537) Improve performance of ExportWriter

     [ https://issues.apache.org/jira/browse/SOLR-14537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki resolved SOLR-14537.
-------------------------------------
    Resolution: Fixed

> Improve performance of ExportWriter
> -----------------------------------
>
>                 Key: SOLR-14537
>                 URL: https://issues.apache.org/jira/browse/SOLR-14537
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Export Writer
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 8.7
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Retrieving, sorting and writing out documents in {{ExportWriter}} are three aspects of the /export handler that can be further optimized.
> SOLR-14470 introduced some level of caching in {{StringValue}}. Further options for caching and speedups should be explored.
> Currently the sort/retrieve and write operations are done sequentially, but they could be parallelized because they are limited by different resources: sorting and retrieving are bound by index reads and CPU, while writing is bound by the receiving end because it uses blocking IO. Sorting and retrieving the values for the next batch could therefore run in parallel with writing out the current batch of results.
> One possible approach is "double buffering": one buffered batch that is ready (already sorted and retrieved) is written out while the other batch is prepared in a background thread, and when both are done the buffers are swapped. This wouldn't complicate the current code much, and it should give up to 2x higher throughput.
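
The double-buffering idea described in the issue can be illustrated with a minimal, generic Java sketch. This is not the code committed for SOLR-14537 and it uses no Solr APIs: the class name DoubleBufferSketch, the export method, and the prepareBatch and writeBatch callbacks are all hypothetical stand-ins. As a simplification, it hands each prepared batch over through a Future instead of swapping two reused buffers, but the overlap is the same: while the current batch is written, the next one is prepared in a background thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.IntFunction;

public class DoubleBufferSketch {

    /**
     * Writes batches 0..batchCount-1 in order. While batch i is being written
     * on the calling thread, batch i+1 is prepared on a background thread.
     * Both callbacks are hypothetical stand-ins for sort/retrieve and write.
     */
    static <T> void export(int batchCount,
                           IntFunction<List<T>> prepareBatch,
                           Consumer<List<T>> writeBatch)
            throws InterruptedException, ExecutionException {
        if (batchCount <= 0) {
            return;
        }
        ExecutorService filler = Executors.newSingleThreadExecutor();
        try {
            // Prepare the first batch before any writing starts.
            Future<List<T>> pending = filler.submit(() -> prepareBatch.apply(0));
            for (int i = 0; i < batchCount; i++) {
                // Block until the batch being prepared is ready.
                List<T> ready = pending.get();
                if (i + 1 < batchCount) {
                    final int next = i + 1;
                    // Start preparing the next batch in the background.
                    pending = filler.submit(() -> prepareBatch.apply(next));
                }
                // Write the ready batch; this overlaps with the preparation above.
                writeBatch.accept(ready);
            }
        } finally {
            filler.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> out = new ArrayList<>();
        IntFunction<List<Integer>> prepare = i -> {
            List<Integer> batch = new ArrayList<>(Arrays.asList(2 * i + 1, 2 * i));
            Collections.sort(batch); // stands in for sorting and retrieving values
            return batch;
        };
        Consumer<List<Integer>> write = out::addAll; // stands in for blocking IO
        export(3, prepare, write);
        System.out.println(out); // prints [0, 1, 2, 3, 4, 5]
    }
}
```

At most two batches are alive at any time, one being written and one being prepared, so the memory cost is roughly double that of a single buffer. The speedup approaches 2x only when preparing and writing take about the same time; otherwise the slower of the two stages sets the pace.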



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
