Posted to issues@lucene.apache.org by "Andrzej Bialecki (Jira)" <ji...@apache.org> on 2021/01/04 17:57:00 UTC
[jira] [Commented] (SOLR-15055) Re-implement 'withCollection' and 'maxShardsPerNode'
[ https://issues.apache.org/jira/browse/SOLR-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258370#comment-17258370 ]
Andrzej Bialecki commented on SOLR-15055:
-----------------------------------------
Additional notes on how {{withCollection}} was implemented in 8x.
Let's first establish the naming:
* collection A (primary) is the one that wants the other collection to always be co-located with it, e.g. to implement faster cross-collection joins.
* collection B (secondary) is an auxiliary collection that is used by collection A (primary). In 8x this collection had to be single-sharded.
In 8x collection A can be marked (by setting a collection property) as {{withCollection: B}}. Collection B must already exist. With this constraint in place, every ADDREPLICA command for collection A (including the ones issued during its initial creation) also automatically invokes ADDREPLICA for collection B whenever the target node for the new A replica does not yet host a replica of B's only shard, so that a B replica ends up on the same node as the A replica.
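The co-location rule above can be sketched as follows. This is only a minimal illustration, not the 8x code: the in-memory {{cluster}} map, the {{add_replica}} helper and the order in which the two actions are emitted are all assumptions made for the example.

```python
from collections import defaultdict


def add_replica(cluster, collection, node, with_collection=None):
    """Place a replica of `collection` on `node` (hypothetical helper).

    `cluster` maps node name -> set of collection names hosted on that node.
    If `with_collection` is set and the node has no replica of it yet,
    an extra ADDREPLICA for the secondary collection is emitted as well.
    """
    actions = []
    if with_collection is not None and with_collection not in cluster[node]:
        cluster[node].add(with_collection)
        actions.append(("ADDREPLICA", with_collection, node))
    cluster[node].add(collection)
    actions.append(("ADDREPLICA", collection, node))
    return actions


cluster = defaultdict(set)
cluster["node1"].add("B")  # node1 already hosts a replica of B

# node1 already has B, so only the A replica is added.
print(add_replica(cluster, "A", "node1", with_collection="B"))
# node2 has no B replica, so one is added alongside the A replica.
print(add_replica(cluster, "A", "node2", with_collection="B"))
```

Note that the sketch only ever adds B replicas; nothing in it removes them again.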
This relationship in 8x was always supposed to be 1:1, i.e. a single primary collection could specify at most a single {{withCollection: B}}.
A reverse relationship was also created in collection B using the {{COLOCATED_WITH}} property. This property pointed to collection A and prevented collection B from being deleted while it was in use by collection A.
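A rough sketch of the two properties and the deletion guard, assuming collection properties are kept in a plain dict. The helper names {{set_with_collection}} and {{delete_collection}} are hypothetical, and the guard logic (refuse deletion while the referenced primary still exists) is an assumption based on the description above rather than the actual 8x code.

```python
def set_with_collection(props, primary, secondary):
    """Link `primary` to `secondary` and record the reverse pointer."""
    if secondary not in props:
        raise ValueError(f"collection {secondary!r} must already exist")
    props.setdefault(primary, {})["withCollection"] = secondary
    # Reverse relationship, stored on the secondary collection.
    props[secondary]["COLOCATED_WITH"] = primary


def delete_collection(props, name):
    """Delete `name` unless a still-existing primary points at it."""
    owner = props[name].get("COLOCATED_WITH")
    if owner is not None and owner in props:
        raise RuntimeError(f"{name!r} is in use by collection {owner!r}")
    del props[name]


props = {"B": {}}
set_with_collection(props, "A", "B")

try:
    delete_collection(props, "B")
except RuntimeError as err:
    print("refused:", err)

delete_collection(props, "A")  # removing the primary first ...
delete_collection(props, "B")  # ... makes B deletable
print(sorted(props))
```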
That implementation was not ideal, for several reasons:
* additional replicas of the secondary collection B were never removed when primary replicas were deleted or moved around.
* the code would always add an NRT replica for the B collection; there was no way to request any other replica type.
* AFAIK the placement could fail because the B replica placements bypassed the usual placement policy calculations (including free disk space checks).
* for the same reason the placement of the A replica could be sub-optimal because it didn't consider the combined metrics of A+B replicas (combined replica size, combined number of cores, etc.).
* only a 1:1 relationship was officially supported - if multiple primary collections pointed to the same B collection, the {{COLOCATED_WITH}} property in B would point only to the latest primary collection. This means that users could accidentally bypass B's deletion prevention mechanism if they deleted the latest primary collection while other, previously defined primary collections were still using B.
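The last point is easiest to see with a small model. The sketch below is illustrative only: collection properties are held in a plain dict, and the {{link}} and {{can_delete}} helpers are hypothetical names. It assumes the guard only checks whether the single collection named in {{COLOCATED_WITH}} still exists.

```python
def link(props, primary, secondary):
    """Set withCollection on the primary and overwrite the reverse pointer."""
    props[primary]["withCollection"] = secondary
    # Single-valued property: a second primary silently replaces the first.
    props[secondary]["COLOCATED_WITH"] = primary


def can_delete(props, name):
    """Deletion is allowed unless the referenced primary still exists."""
    owner = props[name].get("COLOCATED_WITH")
    return owner is None or owner not in props


props = {"A1": {}, "A2": {}, "B": {}}
link(props, "A1", "B")
link(props, "A2", "B")           # B now points only at A2

print(can_delete(props, "B"))    # False: A2 still exists
del props["A2"]                  # delete the latest primary
print(can_delete(props, "B"))    # True, although A1 still uses B
print(props["A1"]["withCollection"])
```

Tracking a set of primaries in the secondary collection instead of a single value would be one way to avoid this, though that is a suggestion and not something the 8x code did.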
> Re-implement 'withCollection' and 'maxShardsPerNode'
> ----------------------------------------------------
>
> Key: SOLR-15055
> URL: https://issues.apache.org/jira/browse/SOLR-15055
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
>
> Solr 8x replica placement provided two settings that are very useful in certain scenarios:
> * {{withCollection}} constraint specified that replicas should be placed on the same nodes where replicas of another collection are located. In the 8x implementation this was limited in practice to co-locating single-shard secondary collections used for joins or other lookups from the main collection (which could be multi-sharded).
> * {{maxShardsPerNode}} - this constraint specified the maximum number of replicas per shard that can be placed on the same node. In most scenarios this was set to 1 in order to ensure fault-tolerance (i.e. at most 1 replica of any given shard would be placed on any given node). Changing this constraint to values > 1 would reduce fault-tolerance but may be desired in test setups or as a temporary relief measure.
>
> Both these constraints are collection-specific so they should be configured e.g. as collection properties.