Posted to dev@lucene.apache.org by "Jan Høydahl (JIRA)" <ji...@apache.org> on 2018/06/27 21:22:00 UTC

[jira] [Comment Edited] (SOLR-12523) Collection backup fails whether the location + name directory exists or not exists.

    [ https://issues.apache.org/jira/browse/SOLR-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525628#comment-16525628 ] 

Jan Høydahl edited comment on SOLR-12523 at 6/27/18 9:21 PM:
-------------------------------------------------------------

Yes, you need a shared file system with the same file path across nodes. The error message is not very helpful.

The reference Guide at [https://lucene.apache.org/solr/guide/7_4/collections-api.html#backup] says:
{quote}Backs up Solr collections and associated configurations to a shared filesystem - for example a Network File System.
{quote}
We should probably add a CAUTION box in the refGuide for the shared fs requirement. We should also improve the error messages. Perhaps the {{BackupCmd}} could do a pre-check before starting the backup by writing a file with a random name into the backup root folder and then asking each node to check whether that file exists. If any node cannot see it, we'd abort the backup with a sane error message.

The error message you quoted above will only be printed in case of an issue with a non-shared FS, right? So perhaps a quick fix is to improve that error message and mention the shared FS requirement right there?
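
The pre-check idea above could look roughly like the sketch below. It is only an illustration and not existing Solr code: the class {{SharedFsPreCheck}}, the {{NodeProbe}} interface and the method {{nodesMissingSharedFs}} are hypothetical names, and the remote "does this file exist on your side?" call is simulated here with local temp directories.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of a shared-FS pre-check; not part of Solr. */
public class SharedFsPreCheck {

    /** Hypothetical stand-in for a remote call asking one node whether it sees a file. */
    public interface NodeProbe {
        boolean canSee(String markerFileName) throws IOException;
    }

    /**
     * Writes a randomly named marker file into the backup root, asks every node
     * whether it can see that file, and returns the names of the nodes that cannot.
     * An empty list means the location looks shared across all nodes.
     */
    public static List<String> nodesMissingSharedFs(Path backupRoot, Map<String, NodeProbe> nodes)
            throws IOException {
        String markerName = ".backup-precheck-" + UUID.randomUUID();
        Path marker = Files.createFile(backupRoot.resolve(markerName));
        List<String> failed = new ArrayList<>();
        try {
            for (Map.Entry<String, NodeProbe> entry : nodes.entrySet()) {
                if (!entry.getValue().canSee(markerName)) {
                    failed.add(entry.getKey());
                }
            }
        } finally {
            // Always clean up the marker, whether or not the check passed.
            Files.deleteIfExists(marker);
        }
        return failed;
    }

    public static void main(String[] args) throws IOException {
        // Two temp directories simulate a shared mount and a node-local directory.
        Path shared = Files.createTempDirectory("shared-backups");
        Path local = Files.createTempDirectory("local-backups");

        Map<String, NodeProbe> nodes = new LinkedHashMap<>();
        nodes.put("node1", name -> Files.exists(shared.resolve(name)));
        nodes.put("node2", name -> Files.exists(local.resolve(name)));

        List<String> failed = nodesMissingSharedFs(shared, nodes);
        if (!failed.isEmpty()) {
            System.out.println("Backup aborted: location is not on a file system shared by nodes " + failed);
        }
    }
}
```

In a real implementation the probe would presumably be a request to each node hosting a replica of the collection, and the returned node list would feed an error message that names the shared FS requirement explicitly.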



> Collection backup fails whether the location + name directory exists or not exists.
> -----------------------------------------------------------------------------------
>
>                 Key: SOLR-12523
>                 URL: https://issues.apache.org/jira/browse/SOLR-12523
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Backup/Restore
>    Affects Versions: 7.3.1
>            Reporter: Timothy Potter
>            Priority: Major
>
> So I have a large collection with 4 shards across 2 nodes. When I try to back it up with:
> {code}
> curl "http://localhost:8984/solr/admin/collections?action=BACKUP&name=sigs&collection=foo_signals&async=5&location=backups"
> {code}
> I either get:
> {code}
> "5170256188349065":{
>     "responseHeader":{
>       "status":0,
>       "QTime":0},
>     "STATUS":"failed",
>     "Response":"Failed to backup core=foo_signals_shard1_replica_n2 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///vol1/cloud84/backups/sigs"},
>   "5170256187999044":{
>     "responseHeader":{
>       "status":0,
>       "QTime":0},
>     "STATUS":"failed",
>     "Response":"Failed to backup core=foo_signals_shard3_replica_n10 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///vol1/cloud84/backups/sigs"},
> {code}
> or if I create the directory, then I get:
> {code}
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":2},
>   "Operation backup caused exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: The backup directory already exists: file:///vol1/cloud84/backups/sigs/",
>   "exception":{
>     "msg":"The backup directory already exists: file:///vol1/cloud84/backups/sigs/",
>     "rspCode":400},
>   "status":{
>     "state":"failed",
>     "msg":"found [2] in failed tasks"}}
> {code}
> I'm thinking this has to do with having 2 cores from the same collection on the same node but I can't get a collection with 1 shard on each node to work either:
> {code}
> "ec2-52-90-245-38.compute-1.amazonaws.com:8984_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://ec2-52-90-245-38.compute-1.amazonaws.com:8984/solr: Failed to backup core=system_jobs_history_shard2_replica_n6 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///vol1/cloud84/backups/ugh1"}
> {code}
> What's weird is that replica (system_jobs_history_shard2_replica_n6) is not even on the ec2-52-90-245-38.compute-1.amazonaws.com node! It lives on a different node.


