Posted to solr-user@lucene.apache.org by marotosg <ma...@gmail.com> on 2013/05/08 16:12:46 UTC

Indexing 4 different cores same machine

Hi,

I have 4 different cores on the same machine.
Person core    -> 3 million docs   -> 20 GB size
Company core   -> 1 million docs   -> 2 GB size
Documents core -> 5 million docs   -> 5 GB size
Emails core    -> 50,000 thousand  -> 200 MB size

While I am indexing data, performance on the server is almost the same
whether I am indexing only one core or all cores at the same time.

I thought having different cores would let indexing run on different
threads in parallel, gaining some performance.
Am I right? My server never reaches 100% CPU use. It is always at about
50% or even less.
I had a look at I/O and it is not a problem.

Any ideas?

Thanks
Sergio





--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-4-different-cores-same-machine-tp4061576.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Indexing 4 different cores same machine

Posted by Shawn Heisey <so...@elyograg.org>.
On 5/8/2013 8:12 AM, marotosg wrote:
> Hi,
> 
> I have 4 different cores on the same machine.
> Person core    -> 3 million docs   -> 20 GB size
> Company core   -> 1 million docs   -> 2 GB size
> Documents core -> 5 million docs   -> 5 GB size
> Emails core    -> 50,000 thousand  -> 200 MB size
>
> While I am indexing data, performance on the server is almost the same
> whether I am indexing only one core or all cores at the same time.
>
> I thought having different cores would let indexing run on different
> threads in parallel, gaining some performance.
> Am I right? My server never reaches 100% CPU use. It is always at about
> 50% or even less.
> I had a look at I/O and it is not a problem.

You say that I/O performance appears to be good, but I/O is still likely
the bottleneck here.  When you are indexing them sequentially, each one
has access to full I/O resources, so each one goes at top speed.  If you
do them all at the same time, then they are competing for I/O resources,
so one can do its thing and the others have to wait until the I/O
scheduler can work on their requests.

In most cases, Solr is I/O bound, and the fact that it takes the same
amount of time either way is additional support for the idea that you
are limited by I/O resources, not CPU resources.  Your I/O system is
keeping up, which is good.  If it weren't keeping up, parallel indexing
would actually take even longer.
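
As a rough illustration of the reasoning above, here is a toy model in
Python. It treats the index sizes from the original post as a stand-in
for the amount of I/O work, and it assumes a single shared disk with a
fixed throughput that is split evenly among the jobs running at any
moment. The throughput figure is made up for the example, not measured,
and real indexing I/O (segment merges, caching) is more complicated
than this.

```python
def sequential_time(work, throughput):
    """Total time when jobs run one after another at full throughput."""
    return sum(w / throughput for w in work)


def parallel_time(work, throughput):
    """Total time when all jobs start together and share throughput equally."""
    remaining = sorted(work)
    elapsed = 0.0
    done = 0.0  # work already completed by every job that is still running
    while remaining:
        active = len(remaining)
        smallest = remaining.pop(0)
        # Each active job gets throughput / active until the smallest finishes.
        elapsed += (smallest - done) * active / throughput
        done = smallest
    return elapsed


# Index sizes from the original post, in GB (a rough proxy for I/O work).
core_sizes_gb = [20.0, 2.0, 5.0, 0.2]
# Hypothetical sustained disk throughput in GB/s (an assumption).
disk_gb_per_s = 0.1

seq = sequential_time(core_sizes_gb, disk_gb_per_s)
par = parallel_time(core_sizes_gb, disk_gb_per_s)
print(f"sequential: {seq:.0f} s, parallel: {par:.0f} s")
```

In this model the two totals come out the same (up to floating-point
rounding), because the disk is busy the whole time either way. Running
the cores in parallel only changes which core finishes first.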

Thanks,
Shawn


Re: Indexing 4 different cores same machine

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

Right, the bottleneck could be something else - memory or network, for
instance.  What are you using to index?  Make sure you're hitting Solr
with multiple threads if your CPU is multi-core.  Use SPM for Solr or
anything else and share some Solr monitoring graphs if you think they
can help.  And/or share some of your indexing code.
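
For reference, a minimal sketch of what hitting Solr with multiple
threads could look like from a Python client, using only the standard
library. The URL, the core name "person", the batch size and the worker
count are all assumptions for illustration. It posts batches of
documents as JSON to the core's update handler. Committing is left out.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed values for illustration; adjust to the actual installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/person/update"  # hypothetical
BATCH_SIZE = 1000
WORKERS = 4


def batches(docs, size):
    """Yield consecutive lists of at most `size` documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_batch(batch, url=SOLR_UPDATE_URL):
    """POST one batch of documents (a JSON array) to the update handler."""
    request = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_parallel(docs, send=post_batch, workers=WORKERS, batch_size=BATCH_SIZE):
    """Send batches from `workers` threads; return the number of batches sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() queues every batch up front, so very large inputs should be
        # fed in chunks to keep memory use bounded.
        results = list(pool.map(send, batches(docs, batch_size)))
    return len(results)
```

Calling index_parallel(docs) with an iterable of dicts would send the
batches concurrently. The send parameter is there so the HTTP call can
be swapped out, for example to time the client side without Solr.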

Otis
--
Solr & ElasticSearch Support
http://sematext.com/




