Posted to dev@lucene.apache.org by "Joel Bernstein (JIRA)" <ji...@apache.org> on 2016/03/08 19:07:40 UTC

[jira] [Comment Edited] (SOLR-8709) Account for out-of-order version numbers in the TopicStream

    [ https://issues.apache.org/jira/browse/SOLR-8709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185385#comment-15185385 ] 

Joel Bernstein edited comment on SOLR-8709 at 3/8/16 6:06 PM:
--------------------------------------------------------------

A little more detail on the design. A new "retentionWindow" parameter will be added to the TopicStream to define the window of time for holding processed version numbers. This retentionWindow should be larger than the soft commit window. A TreeMap will be used to hold all the version numbers that have been processed. This TreeMap will be trimmed to hold only version numbers within the retention window. A PostFilter will be added that checks whether a document is within the retentionWindow but is not in the TreeMap. This should catch any out-of-order version numbers. The contents of the TreeMap will be persisted as part of the checkpoint.
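
As a rough illustration of the bookkeeping described above, here is a minimal sketch, not the actual TopicStream code. The class name VersionWindow and its methods are hypothetical, and it assumes that a version number carries a millisecond timestamp in its upper bits (version >>> 20), so that a time cutoff can be turned into a version cutoff. Persisting the map with the checkpoint is left out.

```java
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not part of Solr. */
class VersionWindow {
    // Assumption: the upper bits of a version number hold a millisecond timestamp.
    private static final int TIME_SHIFT = 20;

    private final long retentionWindowMs;

    // Processed version numbers, kept sorted so the oldest can be trimmed in one call.
    private final TreeMap<Long, Boolean> processed = new TreeMap<>();

    VersionWindow(long retentionWindowMs) {
        this.retentionWindowMs = retentionWindowMs;
    }

    /** Smallest version number that could have been assigned at the given time. */
    static long versionForTime(long timeMs) {
        return timeMs << TIME_SHIFT;
    }

    /** Record a processed version, then drop versions that fell out of the window. */
    void markProcessed(long version, long nowMs) {
        processed.put(version, Boolean.TRUE);
        trim(nowMs);
    }

    /** Remove every version older than the retention window. */
    void trim(long nowMs) {
        long cutoff = versionForTime(nowMs - retentionWindowMs);
        processed.headMap(cutoff, false).clear();
    }

    /** The check a filter would make: inside the window but not yet processed. */
    boolean shouldSend(long version, long nowMs) {
        long cutoff = versionForTime(nowMs - retentionWindowMs);
        return version >= cutoff && !processed.containsKey(version);
    }

    int size() {
        return processed.size();
    }
}
```

In this sketch, a document whose version is newer than the cutoff and missing from the map is treated as an out-of-order arrival and sent. Anything older than the cutoff is ignored, which is why the window needs to be larger than the soft commit window.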


was (Author: joel.bernstein):
A little more detail on the design. A new "retentionWindow" parameter will be added to the TopicStream to define the window of time for holding processed version numbers. This retentionWindow should be larger than the soft commit window. A TreeMap will be used to hold all the version numbers that have been processed. This TreeMap will be trimmed to hold only version numbers with the retention window. A PostFilter will be added that checks to see if a document is within the retentionWindow but is not in the TreeMap. This should catch any out of order version numbers. The contents of the TreeMap will be persisted as part of the checkpoint.

> Account for out-of-order version numbers in the TopicStream
> -----------------------------------------------------------
>
>                 Key: SOLR-8709
>                 URL: https://issues.apache.org/jira/browse/SOLR-8709
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>
> Currently the TopicStream can miss documents if version numbers are received out-of-order. The TopicStream sorts on version number so it will only miss out-of-order versions that span commit boundaries.
> In order to resolve this issue we can adopt an approach that keeps a set of the last N version numbers sent for each Topic.  As the documents are scanned we can check for documents within this time window that do not appear in the sent set. These documents can then be sent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
