Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2016/04/27 21:01:13 UTC

[jira] [Updated] (SOLR-5750) Backup/Restore API for SolrCloud

     [ https://issues.apache.org/jira/browse/SOLR-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-5750:
-------------------------------
    Attachment: SOLR-5750.patch

I pushed changes to the branch and attached a patch.  The changes include:
* test asyncId
* refactored some Snapshooter.create* logic, which included fixing a bug in which a core backup wasn't reserving the IndexCommit
* some small miscellaneous stuff to resolve nocommits

I didn't add a check to the test that core properties make their way to the restored core, as I'm not sure exactly how to do that, but I manually verified they got there.

There are just 2 nocommits to resolve:
# Are we sure we want the parameter "name" for backup & restore to be the backup/snapshot name and not the collection name, and furthermore are we sure we want the collection name to be the parameter "collection"?  I have no strong convictions, but I see that other collection-oriented commands use "name" as the name of the collection.  The requests here extend CollectionSpecificAdminRequest, whose getParams puts the collection name into "name", but we override it... which gave me pause and made me question whether these parameter names are best.  Perhaps the backup name parameter could be "snapshot" or "snapshotName"?  (Note that "snapshotName" shows up in some backup/snapshot-related properties files.)
# [~varunthacker] I don't understand why Overseer.processMessage has case statements for RESTORE & BACKUP that do nothing.  At a minimum there should be comments there explaining why; it sure looks buggy the way it is.  I set breakpoints there and the test never hit them.  I don't understand the differentiation between Overseer.processMessage and OverseerCollectionMessageHandler.processMessage, which seem remarkably similar and redundant.
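
To make the naming question in the first nocommit concrete, here is a minimal sketch of how the request parameters line up as the patch stands: "name" carries the backup/snapshot name and "collection" carries the collection name.  The class and method names are hypothetical and not part of Solr or SolrJ; passing the async ID as "async" and using the /admin/collections path are assumptions, and any other parameters (such as where the backup is stored) are left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not a Solr or SolrJ class.
class BackupRequestSketch {

    // Builds a Collections API style query string for BACKUP or RESTORE.
    static String buildQuery(String action, String backupName, String collection, String asyncId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", action);
        params.put("name", backupName);        // backup/snapshot name, NOT the collection name
        params.put("collection", collection);  // collection name goes here instead of "name"
        if (asyncId != null) {
            params.put("async", asyncId);      // assumption: the async ID is sent as "async"
        }

        // Assumption: requests go to the Collections API admin path.
        StringBuilder sb = new StringBuilder("/admin/collections?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    // URL-encodes a single key or value as UTF-8.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("BACKUP", "nightly", "products", null));
        System.out.println(buildQuery("RESTORE", "nightly", "products_restored", "42"));
    }
}
```

If the backup name parameter were renamed to "snapshot" or "snapshotName", only the key on the "name" line would change, and "name" could then hold the collection name as it does for the other collection-oriented commands.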

[~shalinmangar] if you have time I would love a code review.

Otherwise, I think it's committable.  Tests pass.  If I don't get a code review or further comments for that matter, I'll commit in a couple days.

Propagating createNodeSet, snitch, and rule options can be a follow-on issue.  Using HDFS as a backup location can be another issue too.

> Backup/Restore API for SolrCloud
> --------------------------------
>
>                 Key: SOLR-5750
>                 URL: https://issues.apache.org/jira/browse/SOLR-5750
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Varun Thacker
>             Fix For: 5.2, master
>
>         Attachments: SOLR-5750.patch, SOLR-5750.patch, SOLR-5750.patch, SOLR-5750.patch, SOLR-5750.patch, SOLR-5750.patch, SOLR-5750.patch
>
>
> We should have an easy way to do backups and restores in SolrCloud. The ReplicationHandler supports a backup command which can create snapshots of the index but that is too little.
> The command should be able to backup:
> # Snapshots of all indexes or indexes from the leader or the shards
> # Config set
> # Cluster state
> # Cluster properties
> # Aliases
> # Overseer work queue?
> A restore should be able to completely restore the cloud i.e. no manual steps required other than bringing nodes back up or setting up a new cloud cluster.
> SOLR-5340 will be a part of this issue.


