Posted to solr-user@lucene.apache.org by Doss <it...@gmail.com> on 2007/04/19 15:39:37 UTC

Snapshooting or replicating recently indexed data

Hi,

It seems the snapshooter takes an exact copy of the indexed data, that is, all the contents of the index directory. How can we take only the recently added ones?
...
cp -lr ${data_dir}/index ${temp}
mv ${temp} ${name} ...

Thanks,
Doss.

Re: Snapshooting or replicating recently indexed data

Posted by Bill Au <bi...@gmail.com>.
Here's the Solr Wiki on collection distribution:

http://wiki.apache.org/solr/CollectionDistribution

It describes the "incremental" nature of the distribution:

A collection is a directory of many files. Collections are distributed
to the slaves as snapshots of these files. Each snapshot is made up of
hard links to the files so copying of the actual files is not
necessary when snapshots are created. Lucene only significantly
rewrites files following an optimization command. Generally, a file,
once written, will change very little if at all. This makes the
underlying transport of rsync very useful. Files that have already
been transferred and have not changed do not need to be re-transferred
with the new edition of a collection.
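
To make the hard-link point concrete, here is a minimal sketch, not the actual snapshooter script. It assumes GNU cp (for -l) and GNU stat (for -c); the temporary directory, the file name _0.cfs, and the snapshot name snapshot.1 are made up for illustration. It links an index directory the same way as the "cp -lr" line quoted in this thread and then checks that the snapshot file and the live file share one inode, so no data was copied.

```shell
#!/bin/sh
# Sketch only: shows that a "cp -lr" snapshot shares file data with the live index.
# Assumes GNU cp (-l) and GNU stat (-c). All paths and names are hypothetical.
set -eu

work=$(mktemp -d)
data_dir="$work/data"
mkdir -p "$data_dir/index"
printf 'segment data\n' > "$data_dir/index/_0.cfs"   # stand-in for an index file

# Same idea as the snippet quoted in the thread: link into a temp name, then rename.
cp -lr "$data_dir/index" "$work/snapshot.tmp"
mv "$work/snapshot.tmp" "$work/snapshot.1"

# A hard link is a second directory entry for the same inode.
live_inode=$(stat -c %i "$data_dir/index/_0.cfs")
snap_inode=$(stat -c %i "$work/snapshot.1/_0.cfs")
link_count=$(stat -c %h "$data_dir/index/_0.cfs")
echo "live inode: $live_inode, snapshot inode: $snap_inode, links: $link_count"

rm -rf "$work"
```

If both inode numbers match and the link count is 2, the snapshot is only a second set of names for the same files, which is why creating it is cheap.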

Bill

On 4/21/07, Kevin Lewandowski <ke...@gmail.com> wrote:
> Snapshooter does create incremental snapshots of the index. It doesn't
> look that way when you list a snapshot's contents, because the existing
> files show up as hard links. But it is incremental.

Re: Snapshooting or replicating recently indexed data

Posted by Kevin Lewandowski <ke...@gmail.com>.
Snapshooter does create incremental snapshots of the index. It doesn't
look that way when you list a snapshot's contents, because the existing
files show up as hard links. But it is incremental.

On 4/20/07, Doss <it...@gmail.com> wrote:
> Hi Yonik,
>
> Thanks for your quick response. My question is this: can we take an
> incremental backup, or replicate incrementally, in Solr?
>
> Regards,
> Doss.
>
>
> M. MOHANDOSS Software Engineer Ext: 507 (A BharatMatrimony Enterprise)
> ----- Original Message -----
> From: "Yonik Seeley" <yo...@apache.org>
> To: <so...@lucene.apache.org>
> Sent: Thursday, April 19, 2007 7:42 PM
> Subject: Re: Snapshooting or replicating recently indexed data
>
>
> > On 4/19/07, Doss <it...@gmail.com> wrote:
> >> It seems the snapshooter takes an exact copy of the indexed data, that
> >> is, all the contents of the index directory. How can we take only the
> >> recently added ones?
> >> ...
> >> cp -lr ${data_dir}/index ${temp}
> >> mv ${temp} ${name} ...
> >
> >
> > I don't quite understand your question, but since hard links are used,
> > it's more like pointing to the index files instead of copying them.
> > Rsync is used as a transport to only move the files that were changed
> > from the master to slaves.
> >
> > -Yonik
>
>

Re: Snapshooting or replicating recently indexed data

Posted by Doss <it...@gmail.com>.
Hi Yonik,

Thanks for your quick response. My question is this: can we take an
incremental backup, or replicate incrementally, in Solr?

Regards,
Doss.


M. MOHANDOSS Software Engineer Ext: 507 (A BharatMatrimony Enterprise)
----- Original Message ----- 
From: "Yonik Seeley" <yo...@apache.org>
To: <so...@lucene.apache.org>
Sent: Thursday, April 19, 2007 7:42 PM
Subject: Re: Snapshooting or replicating recently indexed data


> On 4/19/07, Doss <it...@gmail.com> wrote:
>> It seems the snapshooter takes an exact copy of the indexed data, that
>> is, all the contents of the index directory. How can we take only the
>> recently added ones?
>> ...
>> cp -lr ${data_dir}/index ${temp}
>> mv ${temp} ${name} ...
>
>
> I don't quite understand your question, but since hard links are used,
> it's more like pointing to the index files instead of copying them.
> Rsync is used as a transport to only move the files that were changed
> from the master to slaves.
>
> -Yonik 


Re: Snapshooting or replicating recently indexed data

Posted by Yonik Seeley <yo...@apache.org>.
On 4/19/07, Doss <it...@gmail.com> wrote:
> It seems the snapshooter takes an exact copy of the indexed data, that is, all the contents of the index directory. How can we take only the recently added ones?
> ...
> cp -lr ${data_dir}/index ${temp}
> mv ${temp} ${name} ...


I don't quite understand your question, but since hard links are used,
it's more like pointing to the index files instead of copying them.
Rsync is used as a transport to only move the files that were changed
from the master to slaves.
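
A rough sketch of why that makes the transfer incremental, under loud assumptions: it uses GNU cp for -l, the directory and file names are invented, and it does not run rsync or any of the real distribution scripts. It takes one snapshot, adds a new file to the index, takes a second snapshot, and lists the files that exist only in the second one. Those are the files a transport would need to send. The comparison here is by file name only, which is a simplification; by default rsync decides what to skip from each file's size and modification time.

```shell
#!/bin/sh
# Sketch only: which files are new between two hard-linked snapshots?
# Assumes GNU cp (-l). Names such as snapshot.1 and _1.cfs are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/index"

printf 'old segment\n' > "$work/index/_0.cfs"
cp -lr "$work/index" "$work/snapshot.1"      # first snapshot

printf 'new segment\n' > "$work/index/_1.cfs"  # documents indexed later
cp -lr "$work/index" "$work/snapshot.2"      # second snapshot

# Collect names present in snapshot.2 but absent from snapshot.1.
new_files=""
for f in "$work"/snapshot.2/*; do
  name=$(basename "$f")
  if [ ! -e "$work/snapshot.1/$name" ]; then
    new_files="$new_files$name "
  fi
done
echo "only in the newer snapshot: $new_files"

rm -rf "$work"
```

Each snapshot directory still lists every index file, which is why it looks like a full copy, but only the file written after the first snapshot is actually new.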

-Yonik