Posted to solr-user@lucene.apache.org by Britske <gb...@gmail.com> on 2008/10/15 23:14:37 UTC

solr on raid 0 --> no performance gain while indexing?

Hi, 

I understand that this question may not be 100% on topic for this list
(perhaps it's more Lucene than Solr), but perhaps someone here has seen
similar things...

I'm experimenting on Amazon EC2 with building a Solr / Lucene index on a
striped (RAID 0) partition.

While searching shows a good benefit over using a single hard disk, I see no
improvement in indexing over a single disk.
Now I'm not at all a Linux guru, but basic random write / read I/O testing
with bonnie++ leads me to conclude that the RAID is properly set up and is
performing well. I'm running Ubuntu 8.04 / mdadm as software RAID / XFS as
file system, btw.

The data I'm creating is very index-heavy, i.e. over 1000 indices.
Would this be a reason for not seeing better indexing performance than on a
single disk? I'm guessing here: perhaps creating / shifting / altering the
indices after each insert creates such a load across the physical disks that
the normal write scenario of software RAID 0 (writing sequential chunks in
round-robin fashion to all the disks in the array) no longer holds?
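
To make the round-robin layout mentioned above concrete, here is a minimal sketch of how a RAID 0 array might map a logical byte offset to a disk. It only illustrates the general striping idea and is not mdadm's actual code; the function names and the 64 KiB chunk size are assumptions chosen for the example.

```python
def raid0_locate(offset, n_disks, chunk_size=64 * 1024):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Hypothetical helper: chunk_size is an assumed value, not a measured one.
    """
    chunk = offset // chunk_size      # index of the logical chunk
    disk = chunk % n_disks            # chunks are dealt out round-robin
    stripe = chunk // n_disks         # how many full rows precede this chunk
    return disk, stripe * chunk_size + offset % chunk_size


def disks_touched(offset, length, n_disks, chunk_size=64 * 1024):
    """Return the sorted disk indices that one contiguous write lands on."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return sorted({c % n_disks for c in range(first, last + 1)})


if __name__ == "__main__":
    # A large sequential write is spread over every disk in the array ...
    print(disks_touched(0, 4 * 64 * 1024, n_disks=4))
    # ... while a small write fits in one chunk and hits a single disk.
    print(disks_touched(100, 4096, n_disks=4))
```

Under these assumptions, a write smaller than one chunk goes to a single disk, so a workload made of many small scattered writes may see little of the parallelism that large sequential writes get from striping.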

Does this seem logical or does someone know another reason?

Thanks,
Britske
-- 
View this message in context: http://www.nabble.com/solr-on-raid-0---%3E-no-performance-gain-while-indexing--tp20002623p20002623.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr on raid 0 --> no performance gain while indexing?

Posted by Britske <gb...@gmail.com>.
As a 'workaround':
instead of striping the available disks, would treating them as N silos
and merging the indices afterwards be an option?



-- 
View this message in context: http://www.nabble.com/solr-on-raid-0---%3E-no-performance-gain-while-indexing--tp20002623p20002667.html