Posted to solr-user@lucene.apache.org by Tod <li...@gmail.com> on 2011/08/18 16:06:17 UTC
Solr read timeout
I'm using Perl to indirectly call the Solr ExtractingRequestHandler to
stream remote documents into a Solr index instance. I commit after every
100 URLs I process. I've got about 30K documents to index. I'm using a
stock, out-of-the-box Solr 1.4.1 with the necessary schema changes for
the fields I'm indexing.
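A minimal sketch of the loop described above, written in Python rather
than the original Perl. The host, port, batch handling, and all function
names are assumptions for illustration, not the actual script. It assumes
the handler is mapped at /update/extract and that remote streaming is
enabled so Solr can fetch each document through stream.url.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed Tomcat host/port and webapp path; not taken from the post.
SOLR = "http://localhost:8080/solr"


def extract_url(doc_url, solr=SOLR):
    """Build an extract request that asks Solr to fetch doc_url itself."""
    # literal.id sets the document id; using the URL as the id is an assumption.
    params = {"literal.id": doc_url, "stream.url": doc_url}
    return solr + "/update/extract?" + urlencode(params)


def commit_url(solr=SOLR):
    """Build a request that commits pending documents."""
    return solr + "/update?commit=true"


def http_get(url, timeout=300):
    """Fetch url, giving up after `timeout` seconds (hypothetical default)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def index_all(doc_urls, fetch=http_get, batch_size=100):
    """Index each URL, committing every batch_size successes.

    Returns a list of (url, error message) pairs for requests that failed,
    so a timeout on one document does not stop the whole run.
    """
    failed = []
    pending = 0
    for doc_url in doc_urls:
        try:
            fetch(extract_url(doc_url))
            pending += 1
        except OSError as exc:  # covers URLError, HTTPError, socket timeouts
            failed.append((doc_url, str(exc)))
        if pending == batch_size:
            fetch(commit_url())
            pending = 0
    if pending:
        fetch(commit_url())  # commit the final partial batch
    return failed
```

Collecting the failed URLs instead of aborting makes it easier to see
whether the timeouts cluster around particular documents or remote hosts.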
I seem to be running into performance problems about 40 documents in. I
start getting "Failed: 500 read timeout" errors that last about 4
minutes each, slowing processing to a crawl. I've tried a later version
of Tika (0.8), but that didn't seem to help, and I'm not sure Tika is
the problem in the first place.
Given that I'm using a pretty much unaltered version of Solr, could the
stock setup itself be my problem? I'm running everything under a typical
Tomcat install on a Linux VM. I understand there are performance tweaks
I can make to the Solr config, but I would like to focus them on
resolving this problem first rather than blanket-tweaking the entire
config.
Is there anything in particular I should look at? Can I provide any
more information?
Thanks - Tod