Posted to dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2015/10/19 19:04:05 UTC

[jira] [Commented] (SOLR-6273) Cross Data Center Replication

    [ https://issues.apache.org/jira/browse/SOLR-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963616#comment-14963616 ] 

Shalin Shekhar Mangar commented on SOLR-6273:
---------------------------------------------

I've been playing with this feature for a couple of days and I have a few thoughts on improving it before we merge it into branch_5x.

# I think the configuration should be moved out of solrconfig.xml -- the source collection name is redundant (it is always the one to which the core belongs) and solrconfig.xml is the wrong place to configure peer cluster details. Perhaps the peer cluster details should be in cluster properties and the target collection should live as a collection-level property. All of this should be editable using our config APIs.
# I feel it is too complex to have the user configure things like batch sizes, scheduler delays, etc. Maybe a better way is to stream the transaction log constantly in a single thread and throttle it to a configurable transfer rate. This will also reduce memory requirements by avoiding huge batches, and possibly improve transfer speed as well. See the next point.
# The current CDCR code behaves poorly on bulk loads. I loaded a 600MB file containing 2.7M JSON documents into the source collection in 177 seconds (roughly 15,000 docs/s), but it took more than 6 hours (under about 125 docs/s) to replicate them into the target collection using schedule=1ms and batch size=64. We need to do better than that by default.
# Related to the point above, the current CDCR code is not suitable for bootstrapping a new target cluster. We should look into snapshot replication to speed up the bootstrap process (and maybe even the bulk loads).
# We need better stats/reporting, including transfer rate, latency, etc.
# Each core puts a watch on the current shard's leader node to figure out if it is the current leader and therefore whether it should start the CDCR threads. I think this is not necessary. A similar problem was faced by SOLR-6266, the Couchbase indexer plugin (not committed yet). I think we should have an event handler API for cores to listen for important cluster state events, such as leader changes or state changes, and do away with individual plugins adding listeners on ZK nodes. A better solution may be to have collection-level plugins that can be automatically elected, failed over, etc., but that is a lot of work so I'll defer that for now.
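
To make the throttling idea in the second point concrete, here is a minimal sketch of pacing a single sender thread to a configurable byte rate. Everything in it is an assumption for illustration: TransferThrottle and stream are hypothetical names, not existing Solr classes, and the transaction log reader is stood in for by a plain Iterable<byte[]>.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper (not an existing Solr class): paces one sender thread to a target byte rate. */
final class TransferThrottle {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private final long bytesPerSecond;
    private final long startNanos;
    private long bytesSent;

    TransferThrottle(long bytesPerSecond, long startNanos) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.startNanos = startNanos;
    }

    /**
     * Records that {@code bytes} were just sent and returns how long (in nanoseconds)
     * the caller should sleep so the average rate since start stays at or below the target.
     */
    long record(long bytes, long nowNanos) {
        bytesSent += bytes;
        // Time that should have elapsed for bytesSent bytes at the target rate.
        // Split into whole seconds and a remainder so the multiplication does not overflow.
        long budgetNanos = (bytesSent / bytesPerSecond) * NANOS_PER_SECOND
                + (bytesSent % bytesPerSecond) * NANOS_PER_SECOND / bytesPerSecond;
        long elapsedNanos = nowNanos - startNanos;
        return Math.max(0L, budgetNanos - elapsedNanos);
    }

    /** Sends entries one at a time on the calling thread, sleeping as needed to respect the rate. */
    static void stream(Iterable<byte[]> entries, Consumer<byte[]> sender, long bytesPerSecond)
            throws InterruptedException {
        TransferThrottle throttle = new TransferThrottle(bytesPerSecond, System.nanoTime());
        for (byte[] entry : entries) {
            sender.accept(entry);
            long sleepNanos = throttle.record(entry.length, System.nanoTime());
            if (sleepNanos > 0) {
                TimeUnit.NANOSECONDS.sleep(sleepNanos);
            }
        }
    }
}
```

With this shape the only knob the user sets is the rate; there is no batch size or scheduler delay. Because the pacing is based on the average since start, an idle period lets the sender burst afterwards, so a real implementation would probably want to cap the accumulated credit.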

Thoughts?

> Cross Data Center Replication
> -----------------------------
>
>                 Key: SOLR-6273
>                 URL: https://issues.apache.org/jira/browse/SOLR-6273
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Yonik Seeley
>            Assignee: Erick Erickson
>         Attachments: SOLR-6273-trunk-testfix1.patch, SOLR-6273-trunk-testfix2.patch, SOLR-6273-trunk-testfix3.patch, SOLR-6273-trunk.patch, SOLR-6273-trunk.patch, SOLR-6273.patch, SOLR-6273.patch, SOLR-6273.patch, SOLR-6273.patch
>
>
> This is the master issue for Cross Data Center Replication (CDCR)
> described at a high level here: http://heliosearch.org/solr-cross-data-center-replication/


