Posted to dev@lucene.apache.org by "Gus Heck (JIRA)" <ji...@apache.org> on 2018/06/26 18:20:00 UTC

[jira] [Commented] (SOLR-12357) TRA: Pre-emptively create next collection

    [ https://issues.apache.org/jira/browse/SOLR-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524083#comment-16524083 ] 

Gus Heck commented on SOLR-12357:
---------------------------------

This is going to require some rework of MaintainRoutedAliasCmd. Presently the code there can never delete a collection unless it is also creating one. With this feature, deletion would then be delayed by timePartitionSize - preemptiveCreateInterval, which would be significant for long partitions and confusing in general. Also, delete time frames that are not even multiples of the partition size probably already behave somewhat strangely, with old partitions living somewhat longer than they should. I think the maintain command needs to delete if delete is appropriate and create if create is appropriate, independently.
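To illustrate the "independently" part, here is a rough sketch of what I have in mind. It is not the actual MaintainRoutedAliasCmd code: the class name, the method names, and the rule that a partition is deletable once its whole range is older than the retention cutoff are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the real MaintainRoutedAliasCmd. */
class MaintainSketch {

  /**
   * Returns the start times of partitions that are entirely older than
   * (now - retention). This decision does not depend on whether a new
   * partition is being created in the same run.
   */
  static List<Instant> toDelete(List<Instant> partitionStarts, Duration partitionSize,
                                Duration retention, Instant now) {
    Instant cutoff = now.minus(retention);
    List<Instant> result = new ArrayList<>();
    for (Instant start : partitionStarts) {
      // Assumed rule: deletable only when the partition's end is at or before the cutoff.
      if (!start.plus(partitionSize).isAfter(cutoff)) {
        result.add(start);
      }
    }
    return result;
  }

  /** True when the reference time is at or past the end of the newest partition. */
  static boolean needsCreate(Instant newestStart, Duration partitionSize, Instant reference) {
    return !reference.isBefore(newestStart.plus(partitionSize));
  }
}
```

With checks split like this, a maintenance run can delete an expired partition even when no new partition is due.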

Also, it uses Instant.now() to check whether it should create a collection, and it will now need to know the triggering date from the document or be sent an implicit "force create" attribute. The latter option doesn't sound good because I believe we are relying on this command to be idempotent. If more than one client is updating, several documents might be processed (one by each client) before the results of the command take effect, so several instances of the maintain command can be given to the overseer. Synchronization in the overseer should ensure that subsequent instances see the results of the first and then return as a no-op. So I think we need to pass in a "docDate" or maybe "referenceDate".
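A minimal sketch of why a reference date keeps the command idempotent where a "force create" flag would not. Everything here is hypothetical: the class, the in-memory list standing in for the alias's collections, and the synchronized method standing in for overseer serialization are assumptions, not the real implementation.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; an in-memory stand-in for a time routed alias. */
class IdempotentMaintainSketch {
  private final Duration partitionSize;
  private final List<Instant> partitionStarts = new ArrayList<>();

  IdempotentMaintainSketch(Instant firstStart, Duration partitionSize) {
    this.partitionSize = partitionSize;
    partitionStarts.add(firstStart);
  }

  /**
   * Creates partitions until referenceDate falls inside the newest one and
   * returns how many were created. "synchronized" stands in for the overseer
   * processing commands one at a time.
   */
  synchronized int maintain(Instant referenceDate) {
    int created = 0;
    Instant newest = partitionStarts.get(partitionStarts.size() - 1);
    // Compare against current state, so a repeated command finds nothing to do.
    while (!referenceDate.isBefore(newest.plus(partitionSize))) {
      newest = newest.plus(partitionSize);
      partitionStarts.add(newest);
      created++;
    }
    return created;
  }

  synchronized int partitionCount() {
    return partitionStarts.size();
  }
}
```

Because the decision is derived from the date and the current state, a second command carrying the same date returns as a no-op, whereas a "force create" flag would create another collection each time it is replayed.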

> TRA: Pre-emptively create next collection 
> ------------------------------------------
>
>                 Key: SOLR-12357
>                 URL: https://issues.apache.org/jira/browse/SOLR-12357
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: David Smiley
>            Priority: Major
>
> When adding data to a Time Routed Alias (TRA), we sometimes need to create new collections.  Today we only do this synchronously – on-demand when a document is coming in.  But this can add delays as the inbound documents are held up while a collection is created.  And, there may be a problem like a lack of resources (e.g. ample SolrCloud nodes with space) that the policy framework defines.  Such problems could be rectified sooner rather than later, assuming there is log alerting in place (definitely out of scope here).
> Pre-emptive TRA collection creation needs a time window configuration parameter, perhaps named something like "preemptiveCreateWindowMs".  If a document's timestamp is within this time window _from the end time of the head/lead collection_ then the collection can be created pre-emptively.  If no data is being sent to the TRA, no collections will be auto created, nor will it happen if older data is being added.  It may be convenient to effectively limit this time setting to the _smaller_ of this value and the TRA interval window, which I think is a fine limitation.
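
The window check described in the issue could be sketched roughly as follows. This is an illustration under assumptions, not Solr code: the class and method names are made up, and the window is taken to be the half-open range ending at the head collection's end time, clamped to the smaller of the configured window and the TRA interval.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch only; names are not taken from Solr. */
class PreemptiveWindowSketch {

  /**
   * True when docTimestamp lies within the pre-emptive window that ends at
   * the end time of the head (newest) collection.
   */
  static boolean shouldPreemptivelyCreate(Instant docTimestamp, Instant headStart,
                                          Duration interval, Duration preemptiveWindow) {
    // Limit the window to the smaller of the configured value and the interval.
    Duration effective = preemptiveWindow.compareTo(interval) < 0 ? preemptiveWindow : interval;
    Instant headEnd = headStart.plus(interval);
    Instant windowStart = headEnd.minus(effective);
    // Older data (before the window) never triggers pre-emptive creation.
    return !docTimestamp.isBefore(windowStart) && docTimestamp.isBefore(headEnd);
  }
}
```

A document at or past the head collection's end time falls outside this check; that case is the existing synchronous creation path.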


