Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2017/11/01 04:31:02 UTC

[jira] [Updated] (SOLR-11542) Add feature to DistributedURP to route time partitioned collections

     [ https://issues.apache.org/jira/browse/SOLR-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-11542:
--------------------------------
    Attachment: SOLR_11542_time_series_URP.patch

Attached is a patch with my work in progress, including a test (that passes).  There are various nocommits and TODOs.  _I'm super glad to have seen that I don't need to hack DURP!_ (I need to retitle the issue accordingly).  What I did is modify {{DistributedUpdateProcessorFactory}} to conditionally wrap DURP with a new URP that I've tentatively named TimePartitionedUpdateProcessor here.  This URP uses the {{SolrCmdDistributor}} facility used by DURP.

Kudos to [~markrmiller@gmail.com] on {{SolrCmdDistributor}} which I think is very well designed and reusable.

The new URP needs to know which time-partitioned alias the local Solr core belongs to.  To keep this lookup fast, I decided on a core property "timePartitionAliasName" that can be specified on core creation.  It's technically redundant with information in Aliases, but it seems expensive to look oneself up in Aliases, since the core's collection name would be on the value side of one of the aliases.
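
To illustrate the cost difference, here is a minimal sketch. It is not code from the patch: plain maps stand in for Solr's Aliases and for the core's properties, the alias values are assumed to be comma-separated collection lists, and the class and method names are hypothetical. Only the property name "timePartitionAliasName" comes from the description above.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: plain maps stand in for Solr's Aliases and core properties.
final class AliasLookupSketch {

  // Reverse lookup: the collection name sits on the value side, so every alias
  // must be scanned and its (assumed comma-separated) collection list searched.
  static Optional<String> findAliasByScan(Map<String, String> aliases, String collection) {
    for (Map.Entry<String, String> e : aliases.entrySet()) {
      if (Arrays.asList(e.getValue().split(",")).contains(collection)) {
        return Optional.of(e.getKey());
      }
    }
    return Optional.empty();
  }

  // Direct lookup: read the alias name recorded on the core at creation time.
  static Optional<String> findAliasByCoreProperty(Map<String, String> coreProps) {
    return Optional.ofNullable(coreProps.get("timePartitionAliasName"));
  }
}
```

The scan grows with the number of aliases and collections, while the property read is a single map lookup, which is the trade-off behind accepting the redundancy.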

The URP uses SolrCmdDistributor.RetryNode and not StdNode.  It's not quite clear when to use either.

Routing the request to the shard leader that corresponds to the docRouter key is still a TODO; for now it just picks whatever the first leader is.

I want to test this with TolerantUpdateProcessor to ensure that any routing mishap (e.g. a document that is too old) needn't fail the whole request.

> Add feature to DistributedURP to route time partitioned collections
> -------------------------------------------------------------------
>
>                 Key: SOLR-11542
>                 URL: https://issues.apache.org/jira/browse/SOLR-11542
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: David Smiley
>            Priority: Major
>             Fix For: 7.2
>
>         Attachments: SOLR_11542_time_series_URP.patch
>
>
> Assuming we have some time partitioning metadata on an alias (see SOLR-11487 for the metadata facility), we'll then need to route documents to the right collection.  I tentatively propose a helper class to DistributedURP to do this.  Perhaps a separate URP is plausible, though it will take some modifications to DistributedURP.
> The scope of this issue is:
> * decide on some alias metadata names & semantics
> * decide the collection suffix pattern.  Read/write code (needed to route).
> * the routing code
> No new partition creation nor deletion happens in this issue.
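
As a rough illustration of the routing step listed in the scope above, the sketch below picks a collection for a document timestamp. It is an assumption-laden example, not the patch: the class name, the idea of keying partitions by their start time, and the collection names used with it are all hypothetical, and the real suffix pattern is still to be decided in this issue.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch of time-based routing; not the patch's implementation.
final class TimePartitionRouterSketch {

  // Partition start time (inclusive) -> collection name, kept sorted by time.
  private final TreeMap<Instant, String> partitions = new TreeMap<>();

  void addPartition(Instant startInclusive, String collection) {
    partitions.put(startInclusive, collection);
  }

  // Returns the collection whose start time is the latest one at or before docTime.
  // An empty result means the document predates every known partition.
  Optional<String> route(Instant docTime) {
    Map.Entry<Instant, String> entry = partitions.floorEntry(docTime);
    return entry == null ? Optional.empty() : Optional.of(entry.getValue());
  }
}
```

An empty result corresponds to the "too old" document case, which a caller could report as a per-document error instead of failing the whole request.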



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
