Posted to solr-user@lucene.apache.org by Dong Wang <wa...@gmail.com> on 2007/09/05 09:56:14 UTC

The mechanism of data replication in Solr?

Hello, everybody :-)
I'm interested in the mechanism of data replication in Solr. The
"Introduction to the Solr Enterprise Search Server" lists replication as
one of Solr's features, but I can't find anything about replication
on the web site or in the documentation: how the index is split,
how the chunks of the index are distributed, how replicas are placed,
whether replication is eager or lazy, etc. I think these issues are
different from the corresponding ones in HDFS.
Can anybody help me? Thank you in advance.

Best Wishes.

Re: The mechanism of data replication in Solr?

Posted by Chris Hostetter <ho...@fucit.org>.

: snapshot. This technique has these advantages: Can keep multiple
: snapshots on each host without the need to keep multiple copies of
: index files that have not changed. File copying from master to slave

: Why do hard links make file copying between master and slave fast?
: Thanks. Best Regards.

bullets 2 and 3 build off of bullet 1 ... the Lucene file format is 
designed such that files are only ever added, appended to, or deleted -- 
there is never any rewriting of existing bytes in a file.  so having 
hard links to the original files in the snapshot directories on both 
the master and the slave means that the rsync operation for a new snapshot 
only needs to send the new data, not diffs or the full contents of 
existing files.
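
A minimal Python sketch of the hard-link idea described above. It is not Solr's actual snapshot script: the `take_snapshot` and `write_file` helpers, the directory names, and the segment file names are all made up for illustration. It shows that a snapshot built from hard links shares inodes with the earlier snapshot for unchanged files (so they take no extra disk space), and that the newer snapshot differs from the older one only by the files added since.

```python
import os
import tempfile


def take_snapshot(index_dir, snapshot_dir):
    """Create snapshot_dir whose entries are hard links to the files in index_dir."""
    os.makedirs(snapshot_dir)
    for name in sorted(os.listdir(index_dir)):
        os.link(os.path.join(index_dir, name), os.path.join(snapshot_dir, name))


def write_file(path, data):
    """Write bytes to a new file (files are created once and never rewritten)."""
    with open(path, "wb") as f:
        f.write(data)


root = tempfile.mkdtemp()
index_dir = os.path.join(root, "index")
os.makedirs(index_dir)

# Hypothetical segment file names, for illustration only.
write_file(os.path.join(index_dir, "_1.fdt"), b"stored fields")
write_file(os.path.join(index_dir, "_1.tis"), b"term dictionary")

snap1 = os.path.join(root, "snapshot.1")
take_snapshot(index_dir, snap1)

# A later commit adds a new file; the existing files are left untouched.
write_file(os.path.join(index_dir, "_2.fdt"), b"more stored fields")

snap2 = os.path.join(root, "snapshot.2")
take_snapshot(index_dir, snap2)

# The same file in both snapshots is one inode, i.e. one copy on disk.
same_inode = (os.stat(os.path.join(snap1, "_1.fdt")).st_ino
              == os.stat(os.path.join(snap2, "_1.fdt")).st_ino)

# Only these names would have to travel to a host that already has snapshot.1.
new_files = sorted(set(os.listdir(snap2)) - set(os.listdir(snap1)))

print("unchanged file shares one inode:", same_inode)
print("files only in the newer snapshot:", new_files)
```

On a slave that already holds the previous snapshot, the unchanged files are already present, so only the names in `new_files` would need to be transferred.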



-Hoss

Re: The mechanism of data replication in Solr?

Posted by Dong Wang <wa...@gmail.com>.

Thank you, Thorsten Scherler and Bill Au. It was careless of me to
post this question; thanks for your patience.
OK, here comes my new question. Solr's wiki says:
"All the files in the index directory are hard links to the latest
snapshot. This technique has these advantages: Can keep multiple
snapshots on each host without the need to keep multiple copies of
index files that have not changed. File copying from master to slave
is very fast ..."
Why do hard links make file copying between master and slave fast?
Thanks. Best Regards.

--
Wang

Re: The mechanism of data replication in Solr?

Posted by Bill Au <bi...@gmail.com>.

The front page of the Solr wiki has a small section on replication:

http://wiki.apache.org/solr/

Solr's built-in replication does not split the index.  It replicates the
entire index by copying only the files that have changed.
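
As a rough illustration of "copying only the files that have changed", here is a small Python sketch. It is a simplification under stated assumptions, not what Solr or rsync actually runs: the `plan_sync` function, the file names, and the sizes are hypothetical, and files are compared by name and size only.

```python
def plan_sync(master_files, slave_files):
    """Return (files to copy to the slave, files to delete on the slave).

    Both arguments map file name -> size in bytes. A file is copied when the
    slave lacks it or holds a different size; it is deleted on the slave when
    the master no longer has it.
    """
    to_copy = sorted(name for name, size in master_files.items()
                     if slave_files.get(name) != size)
    to_delete = sorted(name for name in slave_files
                       if name not in master_files)
    return to_copy, to_delete


# Hypothetical directory listings, for illustration only.
master = {"_1.fdt": 1200, "_1.tis": 300, "_2.fdt": 450, "segments": 40}
slave = {"_1.fdt": 1200, "_1.tis": 300, "_0.fdt": 900, "segments": 32}

to_copy, to_delete = plan_sync(master, slave)
print(to_copy)    # ['_2.fdt', 'segments']
print(to_delete)  # ['_0.fdt']
```

The whole index ends up on the slave, but `_1.fdt` and `_1.tis` are never sent again because the slave already has identical copies.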

Bill

Re: The mechanism of data replication in Solr?

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.

http://wiki.apache.org/solr/CollectionDistribution

HTH
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions