Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2016/12/12 04:47:58 UTC

[jira] [Commented] (SOLR-7333) Make the poll queue time configurable and use knowledge that a batch is being processed to poll efficiently

    [ https://issues.apache.org/jira/browse/SOLR-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15740962#comment-15740962 ] 

Mark Miller commented on SOLR-7333:
-----------------------------------

We are trying to improve on this in SOLR-9824 so that we use minimal requests and also avoid needless waits after updates for all cases.

The change in this issue only helped if you used javabin, and only for docs batched in a single request; streaming saw no benefit.

> Make the poll queue time configurable and use knowledge that a batch is being processed to poll efficiently
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7333
>                 URL: https://issues.apache.org/jira/browse/SOLR-7333
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>             Fix For: 5.2, 6.0
>
>         Attachments: SOLR-7333.patch, SOLR-7333.patch
>
>
> {{StreamingSolrClients}} uses {{ConcurrentUpdateSolrServer}} (CUSS) to stream documents from leader to replica. By default it sets the {{pollQueueTime}} for CUSS to 0 so that we don't impose an unnecessary wait when processing single-document updates or the last doc in a batch. The downside is that replicas receive many more update requests than leaders; I've seen replicas receive up to 40x as many update requests as the leader.
> If we're processing a batch of docs, then ideally the poll queue time should be greater than 0 up until the last doc is pulled off the queue. If we're processing a single doc, then the poll queue time should always be 0 as we don't want the thread to wait unnecessarily for another doc that won't come.
> Rather than force indexing applications to provide this optional parameter in an update request, it would be better for server-side code that can detect whether an update request is a single document or a batch of documents to override this value internally. That is, it will be 0 by default, but since {{JavaBinUpdateRequestCodec}} can determine when it has seen the last doc in a batch, it can override the {{pollQueueTime}} to something greater than 0.
> This means that current indexing clients will see a boost when doing batch updates without making any changes on their side.
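
For readers skimming the description above, here is a minimal sketch of the polling rule it proposes: wait for more docs while a batch is still in flight, and stop waiting once the last doc has been seen. This is not the code from the attached patch. The class, the {{QueuedDoc}} type, its {{lastInBatch}} flag, and the 25 ms wait are all hypothetical names and values chosen for illustration; only the idea of a poll queue time that is 0 by default and greater than 0 mid-batch comes from the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Solr's actual classes or API.
final class PollQueueTimeSketch {

    /** A queued update. lastInBatch is an assumed flag marking the final doc of a request. */
    static final class QueuedDoc {
        final String id;
        final boolean lastInBatch;

        QueuedDoc(String id, boolean lastInBatch) {
            this.id = id;
            this.lastInBatch = lastInBatch;
        }
    }

    /** Poll time to use after handling a doc: 0 once the batch is done, else the batch wait. */
    static long nextPollTimeMs(boolean lastInBatch, long batchPollTimeMs) {
        return lastInBatch ? 0L : batchPollTimeMs;
    }

    /** Pulls docs for one outgoing request and returns their ids in order. */
    static List<String> drainOneRequest(BlockingQueue<QueuedDoc> queue, long batchPollTimeMs) {
        List<String> sent = new ArrayList<>();
        long pollTimeMs = 0L; // default: never wait for a doc that may not come
        try {
            QueuedDoc doc;
            while ((doc = queue.poll(pollTimeMs, TimeUnit.MILLISECONDS)) != null) {
                sent.add(doc.id);
                // Mid-batch we are willing to wait for the next doc; after the last one we are not.
                pollTimeMs = nextPollTimeMs(doc.lastInBatch, batchPollTimeMs);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and stop draining
        }
        return sent;
    }

    public static void main(String[] args) {
        BlockingQueue<QueuedDoc> queue = new LinkedBlockingQueue<>();
        queue.add(new QueuedDoc("1", false));
        queue.add(new QueuedDoc("2", false));
        queue.add(new QueuedDoc("3", true));
        System.out.println(drainOneRequest(queue, 25L)); // prints [1, 2, 3]
    }
}
```

With a single-doc update the only doc carries lastInBatch = true, so the poll time stays at 0 and the sender thread returns immediately instead of waiting for a doc that will not arrive.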


