Posted to solr-user@lucene.apache.org by Koen De Groote <ko...@limecraft.com> on 2020/02/20 16:25:59 UTC

Backups with only 1 machine having access to remote storage?

Hello all,

I've recently set up backups, using Solr 7.6.

My setup has 3 replicas per collection and several collections. Not all
collections or replicas are present on all hosts.

That being said, I run the backup command from 1 particular host and only
that host has access to the mount on which the backup data will be written.

This means that the host writing the backup data doesn't have all the data
on its local filesystem.

Is this a problem?

By which I mean: will data not present on that host be retrieved over the
network?

What happens in this case?

Kind regards,
Koen De Groote

Re: Backups with only 1 machine having access to remote storage?

Posted by Koen De Groote <ko...@limecraft.com>.
Hello Houston,

Indeed, upon reading the documentation again, I now see this text, which I
must have missed before: SolrCloud Backup/Restore requires a shared file
system mounted at the same path on all nodes, or HDFS.

My bad. I think that text could stand to be even more prominent.

Thanks for getting back to me about this.

Kind regards,
Koen De Groote
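
Since the requirement quoted above is a shared file system mounted at the same path on all nodes, a small pre-flight check run on each Solr host can catch a missing mount before a backup is attempted. The following is only a sketch: the function name check_backup_location and the example path are hypothetical, and it verifies only that the path exists and is writable on the local node, not that it is the same shared storage everywhere.

```python
import os
import tempfile


def check_backup_location(path):
    """Return a list of problems found with `path` on this node.

    An empty list means the directory exists and is writable here.
    Hypothetical helper: it cannot prove the mount is shared across nodes.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
        return problems
    try:
        # Create (and automatically delete) a scratch file to prove that
        # the current user can actually write to the backup location.
        with tempfile.NamedTemporaryFile(dir=path, prefix=".solr_backup_probe_"):
            pass
    except OSError as exc:
        problems.append(f"{path} is not writable: {exc}")
    return problems


if __name__ == "__main__":
    # "/mnt/solr_backups" is an assumed example path, not one from this thread.
    for problem in check_backup_location("/mnt/solr_backups"):
        print(problem)
```

Running this as the same user that runs Solr, on every node that hosts a replica, gives a quick indication of whether each node can see the backup path.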



On Thu, Feb 20, 2020 at 7:04 PM Houston Putman <ho...@gmail.com>
wrote:

> In my experience, you need all nodes to have access to the shared
> storage.
> Solr will pick which nodes should write each shard's data, and you do not
> have a lot of control over which nodes are selected.
> This is why the documentation says that the backup must be written to a
> shared file system (such as NFS) or HDFS.
> Solr won't try to retrieve the other replicas over the network.
>
> I think you will actually get an error back when not every node is able to
> see the path where the backup should be written.
> But even if you don't receive an error, the backup will not work.
>
> - Houston

Re: Backups with only 1 machine having access to remote storage?

Posted by Houston Putman <ho...@gmail.com>.
In my experience, you need all nodes to have access to the shared storage.
Solr will pick which nodes should write each shard's data, and you do not
have a lot of control over which nodes are selected.
This is why the documentation says that the backup must be written to a
shared file system (such as NFS) or HDFS.
Solr won't try to retrieve the other replicas over the network.

I think you will actually get an error back when not every node is able to
see the path where the backup should be written.
But even if you don't receive an error, the backup will not work.

- Houston
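
For reference, here is a minimal sketch of how the backup request discussed in this thread can be assembled against the Collections API (action=BACKUP, with the optional async and repository parameters, and action=REQUESTSTATUS to poll an asynchronous request). The helper names, host, collection name, backup name, and path are all assumptions made for illustration. The location must be a path that every node hosting a replica can see.

```python
from urllib.parse import urlencode


def backup_url(base_url, collection, backup_name, location,
               async_id=None, repository=None):
    """Build a Collections API BACKUP URL (hypothetical helper)."""
    params = {
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        # Must be reachable at the same path on all nodes.
        "location": location,
    }
    if repository is not None:
        # Name of a backup repository configured in solr.xml, if any.
        params["repository"] = repository
    if async_id is not None:
        # Run asynchronously; poll the result with REQUESTSTATUS.
        params["async"] = async_id
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def request_status_url(base_url, async_id):
    """Build a REQUESTSTATUS URL for a previously submitted async request."""
    params = {"action": "REQUESTSTATUS", "requestid": async_id}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # All values below are assumed examples.
    base = "http://localhost:8983/solr"
    print(backup_url(base, "my_collection", "nightly",
                     "/mnt/solr_backups", async_id="backup-1"))
    print(request_status_url(base, "backup-1"))
```

The request can be sent to any node in the cluster; what matters for success is that the nodes Solr selects to write each shard can all reach the location.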




Re: Backups with only 1 machine having access to remote storage?

Posted by Koen De Groote <ko...@limecraft.com>.
Hello Aroop,

I am doing this via the commands described here:
https://lucene.apache.org/solr/guide/7_6/making-and-restoring-backups.html
The setup uses SolrCloud, and the backup is written to an NFS mount.

I now see the text: "SolrCloud Backup/Restore requires a shared file system
mounted at the same path on all nodes, or HDFS."

Honestly, I must have missed it. I do not recall it being there before.

I guess that answers my question.

Thank you for getting back to me about this anyway.

Kind regards,
Koen De Groote




On Fri, Feb 21, 2020 at 12:43 AM Aroop Ganguly
<ar...@icloud.com.invalid> wrote:

> Hi Koen
>
> Which backup mechanism are you using?
> The HDFS backup setup is a lot more sophisticated, and the backup
> repository settings made in solr.xml manage a lot of these things.
> The node from which you issue the command has no bearing on the target
> collection's data that you are trying to back up.
> The backup will reach the designated destination with all the data from
> your collection.
>
> That's why knowing your setup and backup settings would help in
> advising you better.
>
> Thanks
> Aroop

Re: Backups with only 1 machine having access to remote storage?

Posted by Aroop Ganguly <ar...@icloud.com.INVALID>.
Hi Koen

Which backup mechanism are you using?
The HDFS backup setup is a lot more sophisticated, and the backup repository settings made in solr.xml manage a lot of these things.
The node from which you issue the command has no bearing on the target collection's data that you are trying to back up.
The backup will reach the designated destination with all the data from your collection.

That's why knowing your setup and backup settings would help in advising you better.

Thanks
Aroop
