Posted to dev@lucene.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2016/11/04 14:50:58 UTC

[jira] [Commented] (SOLR-9038) Support snapshot management functionality for a solr collection

    [ https://issues.apache.org/jira/browse/SOLR-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636549#comment-15636549 ] 

ASF subversion and git services commented on SOLR-9038:
-------------------------------------------------------

Commit 1381dd9287a23c950eaaa3c258249a5ebc812f35 in lucene-solr's branch refs/heads/master from markrmiller
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1381dd9 ]

SOLR-9055: Make collection backup/restore extensible.

- Introduced a parameter for the Backup operation to specify index backup strategy.
- Introduced two strategies for backing up index data.
  - One uses the Core Admin API (BACKUPCORE).
  - The other skips the backup of index data altogether. This is useful when
    the index data is copied via an external mechanism in combination with named
    snapshots (please refer to SOLR-9038 for details).
  - In the future we can add further implementations of this interface (e.g. based on HDFS snapshots).
- Added a backup property to record the Solr version. This makes it possible to check the
  compatibility of a backup with the current version during the restore operation. The
  compatibility check itself is not added, since it's unclear what the Solr-level compatibility
  guidelines are. But at least having version information as part of the backup would be very useful.
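
The pluggable-strategy idea in the commit message above can be illustrated with a minimal sketch. This is not Solr's implementation: the function names, the strategy keys ("copy", "none"), the "backup.properties" file name and the "version" key are all hypothetical, chosen only to show metadata (including a version) always being written while the index copy step is swappable.

```python
import shutil
from pathlib import Path


def copy_index_files(index_dir: Path, backup_dir: Path) -> None:
    """Hypothetical strategy: copy the index files into the backup location."""
    shutil.copytree(index_dir, backup_dir / "index")


def skip_index_files(index_dir: Path, backup_dir: Path) -> None:
    """Hypothetical strategy: copy nothing; an external tool moves the index data."""


# Strategy keys are assumptions made for this sketch, not Solr parameter values.
STRATEGIES = {"copy": copy_index_files, "none": skip_index_files}


def backup_collection(index_dir: Path, backup_dir: Path,
                      strategy: str = "copy", version: str = "0.0.0") -> None:
    """Write backup metadata, then delegate index handling to the chosen strategy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    # The version is recorded regardless of strategy, so a later restore
    # could compare it with the running version.
    (backup_dir / "backup.properties").write_text(f"version={version}\n")
    STRATEGIES[strategy](index_dir, backup_dir)
```

Adding a further strategy in this sketch only means registering another function under a new key; the metadata step stays unchanged.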


> Support snapshot management functionality for a solr collection
> ---------------------------------------------------------------
>
>                 Key: SOLR-9038
>                 URL: https://issues.apache.org/jira/browse/SOLR-9038
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Hrishikesh Gadre
>            Assignee: David Smiley
>
> Work is currently underway to implement a backup/restore API for SolrCloud (SOLR-5750). SOLR-5750 is about providing the ability to "copy" index files and collection metadata to a configurable location.
> In addition to this, we should also provide a facility to create "named" snapshots for a Solr collection. Here by "snapshot" I mean configuring the underlying Lucene IndexDeletionPolicy to not delete a specific commit point (e.g. using PersistentSnapshotIndexDeletionPolicy). This should not be confused with SOLR-5340, which implements core-level "backup" functionality.
> The primary motivation for this feature is to decouple recording/preserving a known consistent state of a collection from actually "copying" the relevant files to a physically separate location. This decoupling has a number of advantages:
> - We can use specialized data-copying tools for transferring Solr index files. E.g. in a Hadoop environment, the [distcp|https://hadoop.apache.org/docs/r1.2.1/distcp2.html] tool is typically used to copy files from one location to another. This tool provides various options to configure the degree of parallelism and bandwidth usage, as well as integration with different types and versions of file systems (e.g. AWS S3, Azure Blob store etc.)
> - This separation of concerns would also help Solr focus on its key functionality (i.e. querying and indexing) while delegating the copy operation to tools built for that purpose.
> - Users can decide if and when to copy the data files, as opposed to creating a snapshot. E.g. a user may want to create a snapshot of a collection before making an experimental change (e.g. updating/deleting docs, a schema change etc.). If the experiment is successful, they can delete the snapshot (without having to copy the files). If the experiment fails, they can copy the files associated with the snapshot and restore.
> Note that the Apache Blur project also provides a similar feature: [BLUR-132|https://issues.apache.org/jira/browse/BLUR-132]
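
The "named snapshot" notion in the quoted description (a deletion policy that refuses to delete a specific commit point) can be modelled with a toy sketch. This is not Lucene's IndexDeletionPolicy API; the class and method names below are hypothetical, and commit points are represented as plain integers. It assumes a policy that otherwise keeps only the latest commit.

```python
class SnapshotAwareDeletionPolicy:
    """Toy model: keep the latest commit plus any commit pinned by a named snapshot."""

    def __init__(self):
        self.commits = []    # retained commit generations, oldest first
        self.snapshots = {}  # snapshot name -> pinned commit generation

    def commit(self, generation: int) -> None:
        """Record a new commit point, then drop commits nothing refers to."""
        self.commits.append(generation)
        self._prune()

    def snapshot(self, name: str) -> int:
        """Pin the latest commit under a name so pruning will not delete it."""
        if not self.commits:
            raise ValueError("no commit to snapshot")
        self.snapshots[name] = self.commits[-1]
        return self.commits[-1]

    def release(self, name: str) -> None:
        """Unpin a snapshot; its commit becomes eligible for deletion."""
        del self.snapshots[name]
        self._prune()

    def _prune(self) -> None:
        keep = set(self.snapshots.values()) | {self.commits[-1]}
        self.commits = [g for g in self.commits if g in keep]
```

In this model, taking a snapshot copies no files. It only keeps the pinned commit's files from being deleted, which is what lets an external tool copy them later, or lets the user simply release the snapshot if the copy turns out to be unnecessary.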


