Posted to dev@lucene.apache.org by "Gus Heck (JIRA)" <ji...@apache.org> on 2019/05/02 14:21:00 UTC

[jira] [Comment Edited] (SOLR-13420) Allow Routed Aliases to use Collection Properties instead of core properties

    [ https://issues.apache.org/jira/browse/SOLR-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831656#comment-16831656 ] 

Gus Heck edited comment on SOLR-13420 at 5/2/19 2:20 PM:
---------------------------------------------------------

Let me also try to sell the features of this patch a little better:

Before this patch the API is quite confusing (IMHO):
 # Any code that wanted to know the properties of a collection could call zkStateReader.getCollectionProperties(collection), but this was a dangerous and trappy API because every call was a query to ZooKeeper. If a naive user auto-completed that in their IDE without investigating, heavy use of ZooKeeper would ensue.
 # To "do it right" in any code that might get called on a per-doc or per-request basis, one had to cause caching by registering a watcher. At that point getCollectionProperties(collection) magically becomes safe to use, but the watcher pattern probably looks familiar and induces a user who hasn't read the Solr code closely to create their own cache and update it when their watcher is notified. If the caching side effect of watches isn't understood, this will lead to many in-memory copies of collection properties maintained in user code.
 # Registering a watcher also creates a task to be scheduled on a thread (PropsNotification) and induces an extra thread-scheduling lag before the changes can be observed by user code.
 # The code that cares about collection properties needs to have a lifecycle tied to either a collection or some other object with an even more ephemeral life cycle, such as a URP. The user now also has to remember to ensure the watch is unregistered, or there is a leak.
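
To make the pre-patch trap concrete, here is a minimal toy model of the behaviour described in the list above. It is not Solr code: PrePatchPropsModel, FakeStore, registerWatcher and removeWatcher are hypothetical names invented for illustration, and FakeStore merely stands in for ZooKeeper by counting reads. The only thing it is meant to show is that reads are uncached until somebody registers a watcher, and that caching disappears again when the last watcher is removed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only, NOT Solr's ZkStateReader. Every name here is hypothetical.
class PrePatchPropsModel {

    // Stands in for ZooKeeper and counts how often it is queried.
    static class FakeStore {
        int reads = 0;
        final Map<String, Map<String, String>> data = new HashMap<>();

        Map<String, String> read(String collection) {
            reads++;
            Map<String, String> stored = data.get(collection);
            if (stored == null) {
                return new HashMap<>();
            }
            return new HashMap<>(stored);
        }
    }

    private final FakeStore store;
    private final Map<String, Map<String, String>> cache = new HashMap<>();
    private final Map<String, Set<Object>> watchers = new HashMap<>();

    PrePatchPropsModel(FakeStore store) {
        this.store = store;
    }

    // Cached only if a watcher exists; otherwise every call hits the store.
    Map<String, String> getCollectionProperties(String collection) {
        Map<String, String> cached = cache.get(collection);
        return cached != null ? cached : store.read(collection);
    }

    // Registering a watcher has the side effect of turning caching on.
    // (Change notification itself is omitted from this sketch.)
    void registerWatcher(String collection, Object watcher) {
        watchers.computeIfAbsent(collection, c -> new HashSet<>()).add(watcher);
        cache.computeIfAbsent(collection, store::read);
    }

    // Forgetting to call this leaks the watcher and its cache entry.
    void removeWatcher(String collection, Object watcher) {
        Set<Object> set = watchers.get(collection);
        if (set == null) {
            return;
        }
        set.remove(watcher);
        if (set.isEmpty()) {
            watchers.remove(collection);
            cache.remove(collection);
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        Map<String, String> props = new HashMap<>();
        props.put("routedAliasName", "testalias21");
        store.data.put("c1", props);

        PrePatchPropsModel model = new PrePatchPropsModel(store);
        model.getCollectionProperties("c1");
        model.getCollectionProperties("c1");
        System.out.println("reads without watcher: " + store.reads);

        model.registerWatcher("c1", new Object());
        model.getCollectionProperties("c1");
        model.getCollectionProperties("c1");
        System.out.println("reads after registering a watcher: " + store.reads);
    }
}
```

In this sketch nothing in the signature of getCollectionProperties tells the caller which of the two modes they are in, which is the confusion the list describes.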

After this patch:
 # Calls to getCollectionProperties(collection) are always safe to use in any code anywhere. Caching and cleanup are automatic.
 # Code that actually wants to know when a collection property changes, so it can wake up and do something (autoscaling?), still has the option of registering a watcher that will asynchronously send it a notification.
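
As a rough illustration of "caching and cleanup are automatic", the sketch below models a reader whose getCollectionProperties always caches and which evicts entries by itself. It is not the patch's implementation: AutoCachingPropsModel, the loader function, the injected clock and the idle-time eviction policy (ttlMillis) are all assumptions made for the example, and refreshing a cached entry when the stored value changes is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Toy model only, NOT the SOLR-13420 patch. Idle-time eviction is an assumption.
class AutoCachingPropsModel {

    private static final class Entry {
        final Map<String, String> props;
        long lastAccess;

        Entry(Map<String, String> props, long now) {
            this.props = props;
            this.lastAccess = now;
        }
    }

    private final Function<String, Map<String, String>> loader; // stands in for a ZooKeeper read
    private final LongSupplier clock;                           // injected so tests control time
    private final long ttlMillis;                               // hypothetical idle timeout
    private final Map<String, Entry> cache = new HashMap<>();
    int loads = 0;                                              // how often the loader was called

    AutoCachingPropsModel(Function<String, Map<String, String>> loader,
                          LongSupplier clock, long ttlMillis) {
        this.loader = loader;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    // Always safe to call repeatedly: the first call loads, later calls hit the cache.
    Map<String, String> getCollectionProperties(String collection) {
        long now = clock.getAsLong();
        evictIdle(now);
        Entry entry = cache.get(collection);
        if (entry == null) {
            loads++;
            entry = new Entry(loader.apply(collection), now);
            cache.put(collection, entry);
        }
        entry.lastAccess = now;
        return entry.props;
    }

    // Cleanup needs no caller cooperation: entries idle longer than ttlMillis are dropped.
    private void evictIdle(long now) {
        cache.values().removeIf(e -> now - e.lastAccess > ttlMillis);
    }

    int cachedCollections() {
        return cache.size();
    }

    public static void main(String[] args) {
        long[] now = {0L};
        AutoCachingPropsModel model =
            new AutoCachingPropsModel(c -> new HashMap<>(), () -> now[0], 1000L);
        for (int i = 0; i < 3; i++) {
            model.getCollectionProperties("c1");
        }
        System.out.println("loads after three reads: " + model.loads);
    }
}
```

The point of the sketch is the shape of the API rather than the eviction policy: the caller never registers or unregisters anything in order to read properties cheaply.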


> Allow Routed Aliases to use Collection Properties instead of core properties
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-13420
>                 URL: https://issues.apache.org/jira/browse/SOLR-13420
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: master (9.0)
>            Reporter: Gus Heck
>            Assignee: Gus Heck
>            Priority: Major
>         Attachments: SOLR-13420.patch, SOLR-13420.patch, SOLR-13420.patch
>
>
> The current routed alias code relies on a core property named routedAliasName to detect when the Routed Alias wrapper URP should be applied to the Distributed Update Request Processor.
> {code:java}
> #Written by CorePropertiesLocator
> #Sun Mar 03 06:21:14 UTC 2019
> routedAliasName=testalias21
> numShards=2
> collection.configName=_default
> ... etc...
> {code}
> Core properties are not changeable after the core is created, and they are written to the file system for every core. To support a unit test for SOLR-13419 I need to create some legacy formatted collection names and arrange them into a TRA, but this is impossible because I can't change the core property from the test. There's a TODO dating back to the original TRA implementation in the routed alias code to switch to collection properties instead, so this ticket will address that TODO to support the test required for SOLR-13419.
> Back compatibility with legacy core-based TRAs and CRAs will of course be maintained. I also expect that this will facilitate some more nimble handling of routed aliases with future auto-scaling capabilities, such as possibly detaching and archiving collections to cheaper, slower machines rather than deleting them. (Presently such a collection would still attempt to use the routed alias if it received an update, even if it were no longer in the list of collections for the alias.)


