Posted to solr-user@lucene.apache.org by Wael Kader <wa...@softech-lb.com> on 2018/01/18 09:21:16 UTC

SOLR Data Backup

Hello,

What's the best way to back up the SOLR data?
I have a single-node Solr server and I want to always keep a copy of the
data I have.

Is replication an option for what I want?

I would like to get some tutorials and papers, if possible, on the method
that should be used, whether it's backup, replication, or anything else.

-- 
Regards,
Wael

Re: SOLR Data Backup

Posted by Emir Arnautović <em...@sematext.com>.
Hi Wael,
I am not sure about moving data in HDFS, but you should be able to set up a slave without reindexing. Did you start the first node in standalone mode?
You need to check that the replication handler is enabled (it should be by default) and set up the slave to pull data from the first node. Note that the slave should be placed on a separate host.
If you are using HDFS for storing your index and you can tolerate some downtime, isn’t that enough to satisfy your FT requirement? In a master-slave setup you still have a SPOF, the master node: if it goes down, updates will stop.
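
As a rough illustration of the check and the pull described above, here is a minimal Python sketch that only builds the request URLs for a core's /replication handler. The host names and the core name are hypothetical placeholders, and the sketch assumes a standalone (non-SolrCloud) setup; `details` and `fetchindex` are replication handler commands, but please verify the parameters against the reference guide for your Solr version.

```python
from urllib.parse import urlencode

def replication_url(base_url, core, command, **params):
    """Build a URL for a core's /replication handler."""
    query = urlencode({"command": command, "wt": "json", **params})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"

# Hypothetical hosts and core name; replace with your own.
MASTER = "http://solr-master:8983/solr"
SLAVE = "http://solr-slave:8983/solr"
CORE = "mycore"

# Ask the first node whether its replication handler answers.
details = replication_url(MASTER, CORE, "details")

# Ask the slave to pull the index from the first node once.
fetch = replication_url(
    SLAVE, CORE, "fetchindex",
    masterUrl=f"{MASTER}/{CORE}/replication",
)

print(details)
print(fetch)
```

Opening the first URL in a browser (or with curl) should return the handler's status if it is enabled; the second one triggers a one-time fetch on the slave.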

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/





Re: SOLR Data Backup

Posted by Wael Kader <wa...@softech-lb.com>.
Hi,

The data is always changing for me, so I think I can try the replication
option.
I am using Cloudera and the data is saved in HDFS. Is it possible for me to
move the data while the index is running without any problems?

I would also like to know if it's possible to set up slave/master replication
without rebuilding the index.

Thanks,
Wael




-- 
Regards,
Wael

Re: SOLR Data Backup

Posted by Emir Arnautović <em...@sematext.com>.
Hi Wael,
In general, it is not recommended to use Solr as primary storage, so it is better to store your data somewhere else. That will allow you to reindex if needed, and it also allows you to not store some fields in the index, which makes it more efficient.
When it comes to your original question, it seems that the backup/restore feature should be enough for you. If it is a single node and a static index, you can even do it manually: simply back up the index folder and, when you need to restore it somewhere else, create a collection/core with the same config, replace its index folder with the one you backed up, and reload the collection/core.
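
The manual approach could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the index must not be changing while it is copied, the function names and folder layout are made up for the example, and the demo at the end runs against a throwaway directory rather than a real core.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def backup_index_dir(index_dir, backup_root):
    """Copy a core's index folder into a new timestamped folder; return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_root) / f"index-{stamp}"
    shutil.copytree(index_dir, target)
    return target

def restore_index_dir(backup_dir, index_dir):
    """Replace the index folder with a backed-up copy (reload the core afterwards)."""
    index_dir = Path(index_dir)
    if index_dir.exists():
        shutil.rmtree(index_dir)
    shutil.copytree(backup_dir, index_dir)

# Demo on a temporary directory standing in for <core>/data/index.
with tempfile.TemporaryDirectory() as tmp:
    index = Path(tmp) / "data" / "index"
    index.mkdir(parents=True)
    (index / "segments_1").write_text("demo")

    saved = backup_index_dir(index, Path(tmp) / "backups")
    (index / "segments_1").write_text("changed")  # simulate a damaged index
    restore_index_dir(saved, index)
    restored_text = (index / "segments_1").read_text()

print(restored_text)
```

After a real restore, the core or collection still has to be reloaded so that Solr opens the replaced index.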

Regards,
Emir





Re: SOLR Data Backup

Posted by Charlie Hull <ch...@flax.co.uk>.
On 18/01/2018 10:06, Wael Kader wrote:
> Hi,
> 
> It's not possible for me to re-index; the data in some of my indexes is only
> saved in SOLR.
> I need this solution to make sure that in case the live index fails, I can
> move to the backup or replicated index.

OK, so now it's down to you to decide whether you want the backup index 
immediately available for transparent failover, in which case Solr Cloud 
is probably the way to go, or use what Emir recommends to regularly 
create an offline backup you can restore from (bearing in mind this may 
not be *totally* up to date in case of a failure and restoration may 
take a while).
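
Since restoration may take a while, a script would typically poll until it finishes. The Python sketch below shows one way to do that. It assumes the replication handler's `restorestatus` command returns JSON with a `restorestatus.status` field whose value is "In Progress", "success" or "failed"; check this against your Solr version. The `fetch_status` callable is a hypothetical stand-in for the actual HTTP request, so the demo uses canned responses instead of a live Solr.

```python
import time

def wait_for_restore(fetch_status, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the restore reports success (True) or failure (False).

    fetch_status is any callable returning the parsed JSON of
    .../replication?command=restorestatus&wt=json (assumed response shape).
    """
    for _ in range(max_polls):
        status = fetch_status().get("restorestatus", {}).get("status", "")
        if status.lower() == "success":
            return True
        if status.lower() == "failed":
            return False
        sleep(poll_seconds)
    raise TimeoutError("restore did not finish in time")

# Demo with canned responses in place of a live Solr node.
_responses = iter([
    {"restorestatus": {"status": "In Progress"}},
    {"restorestatus": {"status": "success"}},
])
restored = wait_for_restore(lambda: next(_responses), sleep=lambda seconds: None)
print(restored)
```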

Charlie


-- 
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk

Re: SOLR Data Backup

Posted by Wael Kader <wa...@softech-lb.com>.
Hi,

It's not possible for me to re-index; the data in some of my indexes is only
saved in SOLR.
I need this solution to make sure that in case the live index fails, I can
move to the backup or replicated index.

Thanks,
Wael




-- 
Regards,
Wael

Re: SOLR Data Backup

Posted by Charlie Hull <ch...@flax.co.uk>.
Hi Wael,

Have you considered backing up the source data instead? You can always 
re-index to re-create the Solr data.

Replication will certainly allow you to maintain a copy of the Solr 
data, either so you can handle more search traffic by load balancing 
between the two, or to provide a failover capability in the case of a 
server failure. But this isn't a backup in the traditional sense. You 
shouldn't consider Solr as your 'source of truth' unless for some reason 
it is impossible to re-index.

Perhaps if you could let us know why you think you need a backup we can 
suggest the best solution.

Cheers

Charlie


Re: SOLR Data Backup

Posted by Rick Leir <rl...@leirtech.com>.

>
>BTW, why do we not recommend having Solr as a source of truth?
>
One reason is that you might want to tune the analysis chain and then reindex.

Or your data gets progressively larger, and you want to be able to recover from an OOM during indexing. 
Rick

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: SOLR Data Backup

Posted by S G <sg...@gmail.com>.
Another option is to have CDCR enabled for Solr and replicate your data to
another Solr cluster continuously.

BTW, why do we not recommend having Solr as a source of truth?


Re: SOLR Data Backup

Posted by Florian Gleixner <fl...@redflo.de>.
The reference manual will help you:


https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html#standalone-mode-backups


Re: SOLR Data Backup

Posted by Emir Arnautović <em...@sematext.com>.
Hi Wael,
If you want HA and FT, you need at least two Solr nodes and three ZooKeeper nodes (if you plan on using SolrCloud).
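
The reason for three ZooKeeper nodes is that an ensemble only stays available while a majority of its nodes are up. A small Python sketch of that arithmetic (the function names are just for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest majority of an ensemble with the given number of nodes."""
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size):
    """How many nodes can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)

for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

One or two nodes tolerate no failure at all, while three nodes tolerate the loss of one, which is why three is the usual minimum.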

If you just want to be sure you don’t have to reindex your data in case something goes wrong, you can use the Solr backup feature: https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html
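
For a standalone node, the backup is triggered through the core's replication handler. The Python sketch below only builds such a request URL; the core URL, backup name and location are hypothetical values, and the `name` and `location` parameters should be checked against the guide linked above for your version.

```python
from urllib.parse import urlencode

def backup_url(core_url, name, location=None):
    """Build a replication handler URL that asks a core to snapshot its index."""
    params = {"command": "backup", "name": name, "wt": "json"}
    if location is not None:
        # Directory on the Solr host where the snapshot should be written.
        params["location"] = location
    return f"{core_url.rstrip('/')}/replication?{urlencode(params)}"

# Hypothetical core URL, backup name and target directory.
url = backup_url("http://localhost:8983/solr/mycore", "nightly", "/var/backups/solr")
print(url)
```

Requesting that URL (for example from a cron job) starts the backup; the same handler's `details` command can then be used to see whether it completed.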

HTH,
Emir


