Posted to solr-user@lucene.apache.org by Marc Sturlese <ma...@gmail.com> on 2009/10/23 18:41:31 UTC

keep index in production and snapshots in separate physical disks

Is there any way to make snapinstaller install the index in
snapshot20091023124543 (for example) from another disk? I am asking this
because I would rather not optimize the index on the master (if I do,
it takes a long time to send via rsync when it is big). This way I would
only have to send the new segments.
On the slave I would have 2 physical disks. Snappuller would send the
snapshot to one disk (here the index would not be optimized). Snapinstaller
would install the snapshot on the other disk, optimize it and open the
new IndexReader. The optimization should be done on the disk which contains
the "not in production" index, so as not to affect search request speed.
Any idea what I should hack to reach this goal, in case it is possible?

Re: keep index in production and snapshots in separate physical disks

Posted by Chris Hostetter <ho...@fucit.org>.

: Is there any way to make snapinstaller install the index in
: snapshot20091023124543 (for example) from another disk? I am asking this

you're talking about the old script based replication correct? ... i don't 
think that's possible since it relied on hardlinks and atomic move 
operations.  i don't think those work across physical disks.
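
To make the hardlink point concrete, here is a minimal Python sketch (not part of the Solr scripts; the function names and the file name are hypothetical). It checks whether two paths sit on the same device and attempts a hard link, which the operating system normally refuses when source and destination are on different filesystems.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device id (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def try_hardlink(src, dst):
    """Attempt a hard link from src to dst.

    Returns True on success and False if the OS refuses, for example
    with errno EXDEV when src and dst are on different filesystems,
    or because dst already exists.
    """
    try:
        os.link(src, dst)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Demo inside one temporary directory, so both paths share a filesystem.
    with tempfile.TemporaryDirectory() as index_dir:
        segment = os.path.join(index_dir, "segments_1")  # hypothetical file name
        open(segment, "w").close()
        print(same_filesystem(index_dir, segment))
        print(try_hardlink(segment, segment + ".link"))
```

Running the same check with the snapshot directory and the live index directory as arguments would show whether a hardlink-based install is possible for a given disk layout.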

: because I would like not to optimize the index in the master (if I do that
: it takes a long time to send it via rsync if it is so big). This way I would
: just have to send the new segments.

in Solr 1.4 the disadvantages of having a non-optimized index are fading 
away ... before you worry too much about this you might want to run some 
tests and verify that you really need to optimize to meet your performance 
targets.

: On the slave I would have 2 physical disks. Snappuller would send the
: snapshot to one disk (here the index would not be optimized). Snapinstaller
: would install the snapshot on the other disk, optimize it and open the
: new IndexReader. The optimization should be done on the disk which contains
: the "not in production" index, so as not to affect search request speed.
: Any idea what I should hack to reach this goal, in case it is possible?

I'm not really familiar with the new Java-based replication code, but i
suspect this could be set up fairly easily by running two solr instances on
your slave boxes ... one serving as a repeater (ie: both a slave and a
master) to the other.  on the repeater port you would run the optimize,
and then the leaf level slave (serving queries to end users) would
replicate that optimized index over the loopback address (no network
overhead, should be about as fast as a file copy from a different disk)
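
As a rough sketch of how that two-instance flow could be driven by hand, the Python below builds the two HTTP requests involved: an optimize on the repeater and an index fetch on the leaf slave. The ports (8984 and 8983), the `/solr` context path and the helper names are assumptions for illustration only. The sketch also assumes the update handler accepts `optimize=true` as a URL parameter and that a `/replication` handler is configured on the leaf and answers `command=fetchindex`. In a normal setup the slave's own polling would replace the manual fetch.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical addresses: both instances run on the same slave box.
REPEATER = "http://127.0.0.1:8984/solr"  # pulls from the master, runs the optimize
LEAF = "http://127.0.0.1:8983/solr"      # serves end-user queries


def optimize_url(base):
    """URL asking the instance at `base` to optimize its index."""
    return base + "/update?" + urlencode({"optimize": "true"})


def fetchindex_url(base):
    """URL asking the slave at `base` to pull the index from its configured master."""
    return base + "/replication?" + urlencode({"command": "fetchindex"})


def call(url, timeout=600):
    """Send a GET request and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


# Intended order (not executed here, since it needs running instances):
#   call(optimize_url(REPEATER))   # optimize away from the query-serving index
#   call(fetchindex_url(LEAF))     # leaf copies the optimized index over loopback
```

If the two instances keep their data directories on different disks, the optimize I/O stays on the repeater's disk and the leaf's disk only sees the copy.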

...this is all just theory mind you, i've never tried this.


-Hoss