Posted to solr-user@lucene.apache.org by Rohit <ro...@in-rev.com> on 2012/04/11 19:42:03 UTC

solr 3.5 taking long to index

We recently migrated from Solr 3.1 to Solr 3.5. We have one master and one
slave configured. The master has two cores:

1) Core1 - 44555972 documents
2) Core2 - 29419244 documents

We commit every 5000 documents, but lately commits have been taking very
long, 15 minutes or more in some cases. What could have caused this? I have
checked the logs, and the only warning I can see is:

"WARNING: Use of deprecated update request parameter update.processor
detected. Please use the new parameter update.chain instead, as support for
update.processor will be removed in a later version."

Memory details:

export JAVA_OPTS="$JAVA_OPTS -Xms6g -Xmx36g -XX:MaxPermSize=5g"

Solr Config:

<useCompoundFile>false</useCompoundFile>
<mergeFactor>10</mergeFactor>
<ramBufferSizeMB>32</ramBufferSizeMB>
<!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
<maxFieldLength>10000</maxFieldLength>
<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>

What could be causing this? Everything was running fine a few days ago.

Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg

Re: solr 3.5 taking long to index

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Thu, Apr 12, 2012 at 10:42 PM, Rohit <ro...@in-rev.com> wrote:
> The machine has a total RAM of around 46GB. My biggest concern is that Solr index time is gradually increasing and then the commit stops because of timeouts. Our commit rate is very high, but I am not able to find the root cause of the issue.

The difference you're seeing between 3.1 and 3.5 may be due to a bug
in the former where fsync was not being called:
https://issues.apache.org/jira/browse/LUCENE-3418

> We commit every 5000 documents

If you are doing bulk indexing, wait until the end to commit.
Upcoming Solr4 has near realtime (soft commit) support to make doing
frequent commits (for the purposes of visibility) less expensive.
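
A rough sketch of the "commit once at the end" approach for a bulk load
follows. It is only an illustration: the index_in_batches helper, the
post callable and the batch size are assumptions made for this example.
The <add>/<doc>/<field> and <commit/> messages are Solr's XML update
format. In a real run, post would send each message to the core's update
handler over HTTP; here it is passed in so the logic can be tried
without a server.

```python
from xml.sax.saxutils import escape, quoteattr


def add_message(docs):
    """Build one Solr XML <add> message from a list of {field: value} dicts."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def index_in_batches(docs, post, batch_size=5000):
    """Send docs in batches and issue a single commit after the last batch.

    `post` is a hypothetical callable that delivers one XML message to
    the update handler. Returns the number of messages sent.
    """
    sent = 0
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            post(add_message(batch))
            sent += 1
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        post(add_message(batch))
        sent += 1
    post("<commit/>")  # one commit for the whole load, not one per batch
    return sent + 1
```

Note that if autocommit is enabled in solrconfig.xml, Solr will still
commit on its own schedule regardless of what the client does.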

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10

RE: solr 3.5 taking long to index

Posted by Rohit <ro...@in-rev.com>.
Hi Shawn,

Thanks for the information. Let me give this a try. Since this is a live box, I will try it over the weekend and update you.

Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg


Re: solr 3.5 taking long to index

Posted by Lance Norskog <go...@gmail.com>.
You're doing more commits than you need. You may want to turn off
autocommit since you are running commit yourself. Every commit causes
segment activity, so if you want to minimize that, you don't need
autocommit.

About memory sizing: you should drop the memory assigned to Solr until
it slows down, then increase it a little. All of the rest should be
used by the OS for disk caching.

With this much RAM, investigate "Large Pages" support. This is an
operating system hack to make large programs run faster on machines
with a lot of RAM.

-- 
Lance Norskog
goksron@gmail.com

RE: solr 3.5 taking long to index

Posted by Rohit <ro...@in-rev.com>.
Hey Shawn,

Solr is working better, though we are not out of the woods yet. I freed up some memory on the system and also increased the mergeFactor to 20.

I have another question: we have had autocommit ON all this while in our solrconfig.xml, but since the upgrade we have noticed that keeping autocommit on increases the commit time. I cannot find a reason for this. Are they related in any way?


Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg


Re: solr 3.5 taking long to index

Posted by Shawn Heisey <so...@elyograg.org>.
On 4/12/2012 8:42 PM, Rohit wrote:
> The machine has a total RAM of around 46GB. My biggest concern is that Solr index time is gradually increasing and then the commit stops because of timeouts. Our commit rate is very high, but I am not able to find the root cause of the issue.

For good performance, Solr relies on the OS having enough free RAM to 
keep critical portions of the index in the disk cache.  Some numbers 
that I have collected from your information so far are listed below.  
Please let me know if I've got any of this wrong:

46GB total RAM
36GB RAM allocated to Solr
300GB total index size

This leaves only 10GB of RAM free to cache 300GB of index, assuming that 
this server is dedicated to Solr.  The critical portions of your index 
are very likely considerably larger than 10GB, which causes constant 
reading from the disk for queries and updates.  With a high commit rate 
and a relatively low mergeFactor of 10, your index will be doing a lot 
of merging during updates, and some of those merges are likely to be 
quite large, further complicating the I/O situation.
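
The arithmetic behind these numbers, as a small sketch. The figures are
the ones quoted in this thread; the helper name is made up for
illustration, and it assumes the JVM really grows to its full 36GB heap
and that nothing else on the box needs memory.

```python
def page_cache_budget(total_ram_gb, heap_gb, index_gb):
    """Return (RAM left for the OS disk cache in GB, fraction of the index it could hold)."""
    free_gb = total_ram_gb - heap_gb
    return free_gb, free_gb / index_gb


free_gb, fraction = page_cache_budget(total_ram_gb=46, heap_gb=36, index_gb=300)
print("%d GB left for disk cache, about %.1f%% of the index" % (free_gb, fraction * 100))
# prints: 10 GB left for disk cache, about 3.3% of the index
```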

Another thing that can lead to increasing index update times is cache 
warming, also greatly affected by high I/O levels.  If you visit the 
/solr/corename/admin/stats.jsp#cache URL, you can see the warmupTime for 
each cache in milliseconds.

Adding more memory to the server would probably help things.  You'll 
want to carefully check all the server and Solr statistics you can to 
make sure that memory is the root of the problem before you actually spend 
the money.  At the server level, look for things like a high iowait CPU 
percentage.  For Solr, you can turn the logging level up to INFO in the 
admin interface as well as turn on the infostream in solrconfig.xml for 
extensive debugging.
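
One way to put a number on iowait, sketched under the assumption of a
Linux server: the first line of /proc/stat holds cumulative CPU time
counters (user, nice, system, idle, iowait, ...), so the iowait share
over an interval is the change in the fifth counter divided by the
change in the total. The function names are illustrative only; tools
such as top, vmstat or iostat report the same figure.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples of the 'cpu' line."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [y - x for x, y in zip(a, b)]
    # Sum user..steal only; guest time is already counted inside user time.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (Linux only)."""
    with open(path) as f:
        return f.readline()
```

To use it, call read_cpu_line() twice a few seconds apart while an
update is running and pass both samples to iowait_percent().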

I hope this is helpful.  If not, I can try to come up with more specific 
things you can look at.

Thanks,
Shawn


RE: solr 3.5 taking long to index

Posted by Rohit <ro...@in-rev.com>.
The machine has a total RAM of around 46GB. My biggest concern is that Solr index time is gradually increasing and then the commit stops because of timeouts. Our commit rate is very high, but I am not able to find the root cause of the issue.

Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg

Re: solr 3.5 taking long to index

Posted by Shawn Heisey <so...@elyograg.org>.
On 4/12/2012 12:42 PM, Rohit wrote:
> Thanks for pointing these out, but I still have one concern: why is the
> virtual memory running at 300g+?

Solr 3.5 uses MMapDirectoryFactory by default to read the index.  This 
does an mmap on the files that make up your index, so their entire 
contents are simply accessible to the application as virtual memory 
(over 300GB in your case).  The OS automatically takes care of swapping 
disk pages in and out of real RAM as required.  This approach has less 
overhead and tends to make better use of the OS disk cache than other 
methods.  It does lead to confused questions and scary numbers in memory 
usage reporting, though.
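
To make the virtual-memory point concrete, here is a small illustration
of mmap itself using Python's standard mmap module. It is not Lucene's
MMapDirectory code, just the same OS facility: mapping a file adds its
full length to the process's address space, while pages are only read
from disk when they are touched. The temporary file and its size are
made up for the example.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Map the whole file read-only and return (mapped length, requested bytes).

    The mapping covers the entire file (that is what shows up as virtual
    memory), but only the pages backing the requested slice need to be
    brought into RAM by the OS.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return len(m), m[offset:offset + length]


# Build a 16 MiB stand-in for an index file, with a marker at the very end.
size = 16 * 1024 * 1024
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.truncate(size)
    tmp.seek(size - 5)
    tmp.write(b"solr!")
    path = tmp.name

mapped_len, tail = read_slice_via_mmap(path, size - 5, 5)
print(mapped_len, tail)
os.remove(path)
```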

You have mentioned that you are giving 36GB of RAM to Solr.  How much 
total RAM does the machine have?

Thanks,
Shawn


RE: solr 3.5 taking long to index

Posted by Rohit <ro...@in-rev.com>.
Thanks for pointing these out, but I still have one concern: why is the
virtual memory running at 300g+?

Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg


Re: solr 3.5 taking long to index

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
There were some changes in solrconfig.xml between solr3.1 and solr3.5.
Always read CHANGES.txt when switching to a new version.
It also helps to compare both versions of solrconfig.xml from the examples.
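
A minimal way to do that comparison, sketched with Python's standard
difflib (a plain diff on the two files does the same job). The file
labels and the two-line sample fragments are illustrative only and are
not taken from the real example configs.

```python
import difflib


def config_diff(old_text, new_text,
                old_label="old/solrconfig.xml", new_label="new/solrconfig.xml"):
    """Return a unified diff between two config file contents as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


# Made-up fragments standing in for the two files' contents.
old = "<mergeFactor>10</mergeFactor>\n<ramBufferSizeMB>32</ramBufferSizeMB>\n"
new = "<mergeFactor>20</mergeFactor>\n<ramBufferSizeMB>32</ramBufferSizeMB>\n"
for line in config_diff(old, new):
    print(line)
```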

Are you sure you need a MaxPermSize of 5g?
Use jvisualvm to see what you really need.
The same goes for all other JAVA_OPTS.



Re: solr 3.5 taking long to index

Posted by Lance Norskog <go...@gmail.com>.
It's telling you the problem. Compare your solrconfig.xml against the one
in 3.5/solr/example/solr/conf. You will see what has changed in the
suggested tools.


-- 
Lance Norskog
goksron@gmail.com