Posted to solr-user@lucene.apache.org by zh...@dcits.com on 2015/01/04 10:55:20 UTC

HELP SolrJ: performance issue when adding index to SolrServer

hi all,

	thanks very much for any tip!!

	SolrCloud 4.10.2:
		3 solr instances in tomcat 7,
		3 zookeepers to manage solr config set,
		solr index data is stored on HDFS

	I have a web app that needs to update the Solr index whenever a user adds a new
post or comments on a post.

	I use SolrJ to add documents to the SolrServer. Below is my sample code
with the time each step takes.
	How can I reduce the running time of the code below? Any tips to improve the
performance? Any solution for this case? 200 ms is acceptable for us.


	// the code below takes about 500 ms
	CloudSolrServer server = new CloudSolrServer(zkHost);
	server.setDefaultCollection(defaultCollection);

	// the code below takes about 500 ms
	server.connect();

	// the code below takes about 1500 ms
	server.add(doc);

	// I use autoSoftCommit=1000ms, since commit in code will take about 30 minutes
	//server.commit();

thanks

Jan




Re: HELP SolrJ: performance issue when adding index to SolrServer

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/4/2015 2:55 AM, zhangjianad@dcits.com wrote:
> 	SolrCloud 4.10.2:
> 		3 solr instances in tomcat 7,
> 		3 zookeepers to manage solr config set,
> 		solr index data is stored on HDFS
> 
> 	I have a web app that needs to update the Solr index whenever a user adds a new
> post or comments on a post.
> 
> 	I use SolrJ to add documents to the SolrServer. Below is my sample code
> with the time each step takes.
> 	How can I reduce the running time of the code below? Any tips to improve the
> performance? Any solution for this case? 200 ms is acceptable for us.
> 
> 
> 	// the code below takes about 500 ms
> 	CloudSolrServer server = new CloudSolrServer(zkHost);
> 	server.setDefaultCollection(defaultCollection);
> 
> 	// the code below takes about 500 ms
> 	server.connect();

Since creating the object and connecting to ZK should only be required
*once* during the lifetime of the application ... it doesn't really
matter how long this takes.  One second of time during application
initialization is *nothing*.  SolrServer objects are completely
thread-safe ... you do not need to create a new one every time your
application loops.
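
A minimal sketch of that create-once, reuse-everywhere pattern, written in
plain Java so that it runs without SolrJ on the classpath. SharedClient and
creationCount are hypothetical names used only for illustration; they are not
part of SolrJ. The assumption is that the factory passed in does the expensive
work exactly once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Holds one lazily created instance of an expensive, thread-safe client
 * and hands the same instance to every caller.
 * Hypothetical helper for illustration; not part of SolrJ.
 */
final class SharedClient<T> {
    private final Supplier<T> factory;
    private final AtomicInteger creations = new AtomicInteger();
    private volatile T instance;

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, creating it on the first call only. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    // The slow construct-and-connect step runs here, once.
                    local = factory.get();
                    creations.incrementAndGet();
                    instance = local;
                }
            }
        }
        return local;
    }

    /** Number of times the factory has been invoked (for verification). */
    int creationCount() {
        return creations.get();
    }
}
```

In an application like the one described, the factory would presumably wrap the
three quoted calls (construct the CloudSolrServer, setDefaultCollection,
connect), and each request would only call get() and then add(doc) on the
shared object, so the roughly one second of setup is paid once at startup.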

> 	// the code below takes about 1500 ms
> 	server.add(doc);

Unless that Solr document is *enormous*, there's something really wrong
if adding a single document takes a second and a half.

This is where we get into very specific questions about your Solr
installation, trying to track down where the performance problem is.  On
a single Solr server, how many total documents are in all the replicas
that live on that machine, how much disk space is taken by all the "index"
directories, how large is the Java heap, and how much total RAM does the
machine have?  Is there software other than Solr on the machine?  Have
you tuned your garbage collection on the JVM at all?

> 	// I use autoSoftCommit=1000ms, since commit in code will take about 30 minutes
> 	//server.commit();

Do you actually *need* a document to be searchable within one second
after it is indexed?  Once the situation is examined deeply, this is
rarely a genuine requirement.  It is usually something that
sales/marketing lists as a requirement ... but in almost all real-world
situations, nobody ever notices if it takes a minute or two for new
stuff to become visible.

Unless the index is really tiny or you have taken special steps to
ensure it happens faster, a commit operation that opens a new searcher
will normally take a few seconds ... so you won't get that one second
visibility anyway.

Just like I mentioned for an "add" taking 1.5 seconds, if a commit takes
30 minutes, there's something wrong.

Normally these kinds of performance issues happen when there is not
enough memory on the machine.  They can be caused by other things;
insufficient memory is just the most common cause.

Thanks,
Shawn