Posted to solr-user@lucene.apache.org by Jihyun Suh <jh...@gmail.com> on 2012/06/06 03:25:45 UTC

Solr, I have performance problem for indexing.

I have 128 MySQL 5.x tables, and each table has 3,5000 rows.
When I start a dataimport (indexing) in Solr, it takes 5 minutes for one
table.
But when Solr indexes the 20th table, it takes around 10 minutes for one table.
And when it indexes the 40th table, it takes around 20 minutes for one
table.

Does Solr have a performance problem with too many documents?
Should I change some configuration?

Re: Solr, I have performance problem for indexing.

Posted by Lance Norskog <go...@gmail.com>.
Which version of Solr are you running?

On Tue, Jun 5, 2012 at 8:02 PM, Jack Krupansky <ja...@basetechnology.com> wrote:
> You wrote "3,5000", but is that 35 hundred (3,500) or 35 thousand (35,000)??
>
> Your numbers seem far worse than what many people typically see with Solr
> and DIH.
>
> Is the database running on the same machine?
>
> Check the Solr log file to see if some errors (or warnings) might be
> occurring frequently.
>
> Check the log for the first table from when it starts to when it ends. How
> often is it committing (according to the log)? Does there seem to be any odd
> activity during that period?
>
> -- Jack Krupansky



-- 
Lance Norskog
goksron@gmail.com

Re: Solr, I have performance problem for indexing.

Posted by Jack Krupansky <ja...@basetechnology.com>.
You wrote "3,5000", but is that 35 hundred (3,500) or 35 thousand (35,000)??

Your numbers seem far worse than what many people typically see with Solr 
and DIH.

Is the database running on the same machine?

Check the Solr log file to see if some errors (or warnings) might be 
occurring frequently.

Check the log for the first table from when it starts to when it ends. How 
often is it committing (according to the log)? Does there seem to be any odd 
activity during that period?

-- Jack Krupansky



Re: Solr, I have performance problem for indexing.

Posted by Erick Erickson <er...@gmail.com>.
You haven't really told us much about what you're doing here. As Lee
hints, we don't know much about the details of *how* you are doing this.

But unless you're doing something odd, Solr shouldn't be the bottleneck
here. Often when a database import is slow, the problem is in the data-
acquisition bit. That is, your SQL query for some reason gets
slow. That said, with DIH it can be hard to know exactly.

You might want to consider using SolrJ instead of DIH. We've found that
as the import process gets more complex, using SolrJ is often easier. See:
http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/


Best
Erick
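
One way to check whether the SQL side is the slow part, as suggested above, is to time the row fetch on its own with no Solr involved. The following is only a minimal sketch under stated assumptions: it is written in Python, it uses the standard-library sqlite3 module as a stand-in for the MySQL connection, and the function name time_table_fetch and the table names are hypothetical.

```python
import sqlite3
import time


def time_table_fetch(conn, table, batch_size=1000):
    """Fetch every row of `table` and return (row_count, elapsed_seconds).

    Nothing is sent to Solr, so the elapsed time reflects only the
    database read.
    """
    # Table names cannot be bound as SQL parameters, so accept only
    # plain identifiers to keep the string concatenation below safe.
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % (table,))
    start = time.perf_counter()
    cursor = conn.execute("SELECT * FROM " + table)
    rows = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        rows += len(batch)
    return rows, time.perf_counter() - start


if __name__ == "__main__":
    # In-memory demo data standing in for the real tables (hypothetical names).
    conn = sqlite3.connect(":memory:")
    for name in ("table_001", "table_002"):
        conn.execute(
            "CREATE TABLE %s (id INTEGER PRIMARY KEY, body TEXT)" % name
        )
        conn.executemany(
            "INSERT INTO %s (body) VALUES (?)" % name,
            [("row %d" % i,) for i in range(35000)],
        )
    for name in ("table_001", "table_002"):
        count, seconds = time_table_fetch(conn, name)
        print("%s: %d rows in %.3f s" % (name, count, seconds))
```

If the fetch time alone grows from table to table, that would point at the database or the query; if it stays flat, the Solr log (commits, merges) is the more likely place to look.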

On Thu, Jun 7, 2012 at 5:26 AM, Lee Carroll
<le...@googlemail.com> wrote:
> what is your db schema? do you need to import all the schema? (128
> joined tables??)
> or are the tables all independent? (if so, dump them out and import
> them using csv)
>
> cheers lee c
>
> On 7 June 2012 02:32, Jihyun Suh <jh...@gmail.com> wrote:
>> Each table has 35,000 rows (35 thousand).
>> I will check the log for each step of indexing.
>>
>> I run Solr 3.5.

Re: Solr, I have performance problem for indexing.

Posted by Lee Carroll <le...@googlemail.com>.
what is your db schema? do you need to import all the schema? (128
joined tables??)
or are the tables all independent? (if so, dump them out and import
them using csv)

cheers lee c
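
If the tables are independent, the dump step could look roughly like the sketch below. It assumes Python, uses sqlite3 only as a stand-in for the MySQL connection, and the name dump_table_to_csv is hypothetical. Solr 3.x includes a CSV update handler (registered as /update/csv in the example configuration) that can load such a file, provided the header names match fields in the Solr schema.

```python
import csv
import os
import sqlite3
import tempfile


def dump_table_to_csv(conn, table, path):
    """Write all rows of `table` to `path` as CSV with a header row.

    Returns the number of data rows written.
    """
    # Table names cannot be bound as SQL parameters, so accept only
    # plain identifiers to keep the string concatenation below safe.
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % (table,))
    cursor = conn.execute("SELECT * FROM " + table)
    header = [column[0] for column in cursor.description]
    written = 0
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            written += 1
    return written


if __name__ == "__main__":
    # In-memory demo table standing in for one of the real tables.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_001 (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO table_001 (body) VALUES (?)",
        [("row %d" % i,) for i in range(35000)],
    )
    out_path = os.path.join(tempfile.mkdtemp(), "table_001.csv")
    print(dump_table_to_csv(conn, "table_001", out_path), "rows ->", out_path)
```

Writing one file per table keeps each load independent, so a slow or failing table can be retried without redoing the rest.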

On 7 June 2012 02:32, Jihyun Suh <jh...@gmail.com> wrote:
> Each table has 35,000 rows (35 thousand).
> I will check the log for each step of indexing.
>
> I run Solr 3.5.
>>

Re: Solr, I have performance problem for indexing.

Posted by Jihyun Suh <jh...@gmail.com>.
Each table has 35,000 rows (35 thousand).
I will check the log for each step of indexing.

I run Solr 3.5.

