Posted to solr-user@lucene.apache.org by jason rutherglen <ja...@yahoo.com> on 2006/03/28 03:57:17 UTC

Rsync

I was thinking, would it not be possible to avoid using rsync and record a list of all new segment files added (from within Lucene), and simply use HTTP to sync down the newest ones?  Perhaps only using rsync after an optimize?  If I understand Lucene correctly, only new files are created?
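
A minimal sketch of the idea in Python, assuming index files are written once and never modified afterwards (an assumption the replies in this thread qualify). The function name and the file names are hypothetical, chosen only for illustration; the actual transfer over HTTP is left out.

```python
def new_files(master_files, slave_files):
    """Return names present on the master but missing on the slave, sorted.

    Assumes a file with a given name never changes after it is written,
    so comparing names is enough. This is an assumption of the sketch.
    """
    return sorted(set(master_files) - set(slave_files))


# Illustrative listings; the file names are made up for the example.
master = ["segments", "_1.cfs", "_2.cfs", "_3.cfs"]
slave = ["segments", "_1.cfs"]
print(new_files(master, slave))
```

A name-only comparison like this would miss any file that is rewritten under the same name, so such files would have to be fetched unconditionally.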


Re: Rsync

Posted by Yonik Seeley <ys...@gmail.com>.
On 3/29/06, jason rutherglen <ja...@yahoo.com> wrote:
> Perhaps a future project to increase the speed of the syncing to sub-minute times.  Sounds like two files will change, in addition to segment files being added.  Is this correct?  Or maybe other pieces such as cache reloading would make this more difficult.

rsync will only copy over the changed index files, not the whole index
each time.
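
To illustrate what "only the changed files" means, here is a rough Python sketch that approximates rsync's default quick check (a file is skipped when size and modification time match). It is not rsync itself, the function name is hypothetical, and it only looks at a flat directory.

```python
import os


def changed_files(src_dir, dst_dir):
    """List files in src_dir that are missing from dst_dir or differ in size or mtime."""
    changed = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue  # sketch only handles a flat directory of files
        dst = os.path.join(dst_dir, name)
        if not os.path.exists(dst):
            changed.append(name)  # new file, must be copied
            continue
        s, d = os.stat(src), os.stat(dst)
        # Compare size and whole-second modification time.
        if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
            changed.append(name)
    return changed
```

Only the names returned would need to be transferred; unchanged segment files are left alone.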

-Yonik

Re: Rsync

Posted by jason rutherglen <ja...@yahoo.com>.
Perhaps a future project to increase the speed of the syncing to sub-minute times.  Sounds like two files will change, in addition to segment files being added.  Is this correct?  Or maybe other pieces such as cache reloading would make this more difficult.  

----- Original Message ----
From: Bill Au <bi...@gmail.com>
To: solr-user@lucene.apache.org; jason rutherglen <ja...@yahoo.com>
Sent: Wednesday, March 29, 2006 6:07:35 AM
Subject: Re: Rsync

The segments file will change when new segments are created.  But what I
really meant before was that the deletable file also changes when documents
are deleted from the index.

Bill

On 3/28/06, Bill Au <bi...@gmail.com> wrote:
>
> I think the segments file will also change if documents are deleted from
> the index.
>
> Other ways to distribute the index will work as long as:
>
> 1) it makes a copy of the index that is in a consistent state
>
> 2) it keeps track of files that have changed (normally only a small
> number) and transfers them to the slave
>
> Lucene can certainly record a list of all new segment files added.  I
> think the tricky part is to ensure that a consistent copy of the index
> is being distributed.
>
> Bill
>
>
> On 3/27/06, jason rutherglen <ja...@yahoo.com> wrote:
> >
> > I was thinking, would it not be possible to avoid using rsync and record
> > a list of all new segment files added (from within Lucene), and simply use
> > HTTP to sync down the newest ones?  Perhaps only using rsync after an
> > optimize?  If I understand Lucene correctly, only new files are
> > created?
> >
> >
> >
>




Re: Rsync

Posted by Bill Au <bi...@gmail.com>.
The segments file will change when new segments are created.  But what I
really meant before was that the deletable file also changes when documents
are deleted from the index.

Bill

On 3/28/06, Bill Au <bi...@gmail.com> wrote:
>
> I think the segments file will also change if documents are deleted from
> the index.
>
> Other ways to distribute the index will work as long as:
>
> 1) it makes a copy of the index that is in a consistent state
>
> 2) it keeps track of files that have changed (normally only a small
> number) and transfers them to the slave
>
> Lucene can certainly record a list of all new segment files added.  I
> think the tricky part is to ensure that a consistent copy of the index
> is being distributed.
>
> Bill
>
>
> On 3/27/06, jason rutherglen <ja...@yahoo.com> wrote:
> >
> > I was thinking, would it not be possible to avoid using rsync and record
> > a list of all new segment files added (from within Lucene), and simply use
> > HTTP to sync down the newest ones?  Perhaps only using rsync after an
> > optimize?  If I understand Lucene correctly, only new files are
> > created?
> >
> >
> >
>

Re: Rsync

Posted by Bill Au <bi...@gmail.com>.
I think the segments file will also change if documents are deleted from the
index.

Other ways to distribute the index will work as long as:

1) it makes a copy of the index that is in a consistent state

2) it keeps track of files that have changed (normally only a small number)
and transfers them to the slave

Lucene can certainly record a list of all new segment files added.  I think
the tricky part is to ensure that a consistent copy of the index is being
distributed.
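
One possible way to meet condition 1, sketched in Python: take a hard-link snapshot of the index directory and distribute the snapshot rather than the live index. This is only an illustration under stated assumptions: the function name is hypothetical, the snapshot is assumed to be taken while no writer is in the middle of a commit, and files are assumed to be replaced rather than modified in place (a hard link shares its data with the original, so an in-place change would show through).

```python
import os


def take_snapshot(index_dir, snapshot_dir):
    """Hard-link every file in index_dir into a new snapshot_dir.

    Returns the sorted list of file names captured. Raises if
    snapshot_dir already exists, so an old snapshot is never overwritten.
    Both directories must be on the same filesystem for os.link to work.
    """
    os.makedirs(snapshot_dir)
    names = sorted(
        n for n in os.listdir(index_dir)
        if os.path.isfile(os.path.join(index_dir, n))
    )
    for name in names:
        os.link(os.path.join(index_dir, name),
                os.path.join(snapshot_dir, name))
    return names
```

Because hard links copy no data, a snapshot like this is cheap, and comparing the file lists of two successive snapshots gives the set of files to transfer for condition 2.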

Bill

On 3/27/06, jason rutherglen <ja...@yahoo.com> wrote:
>
> I was thinking, would it not be possible to avoid using rsync and record a
> list of all new segment files added (from within Lucene), and simply use
> HTTP to sync down the newest ones?  Perhaps only using rsync after an
> optimize?  If I understand Lucene correctly, only new files are
> created?
>
>
>