Posted to solr-dev@lucene.apache.org by Walter Ferrara <wa...@gmail.com> on 2007/10/12 12:34:36 UTC

CommonsHttpSolrServer and multithread

What is the best way to use the solrj CommonsHttpSolrServer to execute
queries, as in
CommonsHttpSolrServer server = new CommonsHttpSolrServer(url);
server.query(query);
?
I could build one CommonsHttpSolrServer for each query, or I could build
just one, put it in a singleton and reuse it.

The point is, I get exceptions both ways.

When I run standard query tests with a single thread, everything works
fine, although I feel far safer recreating CommonsHttpSolrServer for
each query.
When I run a stress test with 6 or more concurrent threads, I get
exceptions. (Not every query fails, just very few of them.)

If I create one server per query, I may end up with a BindException on
Windows (which, AFAIK, could be a Windows-related problem; see for
example:
http://www.mailinglistarchive.com/httpclient-user@jakarta.apache.org/msg00575.html
), and with a java.net.SocketException: Too many open files on Linux,
which may indicate the same issue (no free local ports?).

But if I reuse the same server object (maybe some piece of code inside
the XML parser is not thread-safe, but should CommonsHttpSolrServer be
thread-safe?), I end up with exceptions like:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[-1,-1]
Message: Element type "int" must be followed by either attribute specifications, ">" or "/>".
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(Unknown Source)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:172)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:196)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:84)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:239)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:80)
[...]

or

com.sun.org.apache.xerces.internal.xni.XNIException: Scanner State 7 not Recognized
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$TrailingMiscDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.setInputSource(Unknown Source)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.getXMLStreamReaderImpl(Unknown Source)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(Unknown Source)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:67)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:239)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:80)
    at org.apache.solr.client.solrj.impl.BaseSolrServer.query(BaseSolrServer.java:99)
[...]
(Does this have something in common with SOLR-360?)
I get no such exception if I recreate a CommonsHttpSolrServer object
for every query, only the BindException.
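
[Editorial sketch] Both traces end inside the StAX factory and reader, so
one hypothesis (an assumption, not something confirmed in this thread) is
that parser state is being shared between threads. The JDK does not promise
that an XMLInputFactory is safe for concurrent use, so a cautious pattern
is one factory per thread and one reader per response. The class and method
names below are hypothetical and are not part of SolrJ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of SolrJ: each thread gets its own StAX factory.
class PerThreadStaxParser {
    private static final ThreadLocal<XMLInputFactory> FACTORY =
            ThreadLocal.withInitial(XMLInputFactory::newInstance);

    // Counts <int> elements in a response body with a reader private to this call.
    static int countIntElements(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                FACTORY.get().createXMLStreamReader(new StringReader(xml));
        try {
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "int".equals(reader.getLocalName())) {
                    count++;
                }
            }
            return count;
        } finally {
            reader.close();
        }
    }

    // Parses the same body in `tasks` jobs spread over `threads` worker threads.
    static List<Integer> parseConcurrently(String xml, int threads, int tasks)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Integer> task = () -> countIntElements(xml);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // rethrows any parse failure from a worker
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Running parseConcurrently with 6 threads mirrors the stress test described
above; if the shared-state hypothesis is right, each task should see the
same element count with no parse errors.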

My test environment is Windows (2003), while Solr resides in a Jetty
on a remote machine, but Solr does not seem responsible for it; the
BindException looks to me more like an httpclient-related thing. I've
tried adjusting max-connections-per-host and similar parameters (via
httpclient defaults), but that does not seem to resolve it.

Does httpclient work as a singleton, like a connection pool? That is,
when I open a new CommonsHttpSolrServer, even if it creates a new
MultiThreadedHttpConnectionManager every time, does it just reuse the
same connection pool?

The SolrJ I'm using is dated 2007-09-24 (downloaded from hudson); the
httpclient libs used are 3.1 (the very ones that came with that solrj).

Walter
--


RE: CommonsHttpSolrServer and multithread

Posted by Will Johnson <wi...@gmail.com>.
You can also get a hold of the underlying MultiThreadedHttpConnectionManager
if you want to tweak the configuration further:


public class CommonsHttpSolrServer { 
  .....
  public MultiThreadedHttpConnectionManager getConnectionManager()
}

- will

-----Original Message-----
From: Ryan McKinley [mailto:ryantxu@gmail.com] 
Sent: Thursday, October 18, 2007 12:08 PM
To: solr-dev@lucene.apache.org
Subject: Re: CommonsHttpSolrServer and multithread

> 
> but Is CommonsHttpSolrServer thread-safe?
> 

It better be!  To the best of my knowledge, it is.  If you have any 
troubles with it, we need to fix them.

the underlying connections are thread safe:
http://jakarta.apache.org/httpcomponents/httpclient-3.x/threading.html

we use MultiThreadedHttpConnectionManager

ryan


Re: CommonsHttpSolrServer and multithread

Posted by Ryan McKinley <ry...@gmail.com>.
> 
> but Is CommonsHttpSolrServer thread-safe?
> 

It better be!  To the best of my knowledge, it is.  If you have any 
troubles with it, we need to fix them.

the underlying connections are thread safe:
http://jakarta.apache.org/httpcomponents/httpclient-3.x/threading.html

we use MultiThreadedHttpConnectionManager

ryan

Re: CommonsHttpSolrServer and multithread

Posted by Walter Ferrara <wa...@gmail.com>.
I've been playing with the latest solrj.

If I make one object per query, I still get a "Too many open files"
exception from java.net.SocketException (in a Linux environment);
however, by reusing the same CommonsHttpSolrServer object across
multiple threads, I haven't gotten any exception so far, and everything
seems to work (I haven't tested the consistency of results, however).
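
[Editorial sketch] The reuse approach described above can be illustrated
with the initialization-on-demand holder idiom. PooledClient below is a
hypothetical stand-in, not the SolrJ class: it only assumes, as the first
message in this thread suggests, that every constructed client sets up its
own connection manager. The URL is also an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a client that owns one connection pool per instance.
class PooledClient {
    static final AtomicInteger POOLS_CREATED = new AtomicInteger();
    private final String baseUrl;

    PooledClient(String baseUrl) {
        this.baseUrl = baseUrl;
        POOLS_CREATED.incrementAndGet(); // one pool per constructed client
    }

    // Placeholder for a real request; it only builds the URL it would fetch.
    String query(String q) {
        return baseUrl + "/select?q=" + q;
    }
}

class SharedClient {
    // Holder idiom: the JVM initializes INSTANCE once, in a thread-safe way.
    private static final class Holder {
        static final PooledClient INSTANCE =
                new PooledClient("http://localhost:8983/solr"); // assumed URL
    }

    static PooledClient get() {
        return Holder.INSTANCE;
    }

    // Runs `tasks` queries on `threads` threads, all through the shared client.
    static List<String> runQueries(int threads, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final String q = "id:" + i;
                Callable<String> task = () -> get().query(q);
                futures.add(pool.submit(task));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape, however many queries run, only one client (and so one
pool) is ever constructed, whereas a client per query would construct one
pool per query.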

But is CommonsHttpSolrServer thread-safe?

Thanks,
Walter

Ryan McKinley wrote:
>
>> I could build one CommonsHttpSolrServer for each query, or I could build
>> just one, put it in a singleton and reuse it.
>>
>
> either way.  Solrj uses MultiThreadedHttpConnectionManager.
>
>>
>> The SolrJ I'm using is dated 2007-09-24 (downloaded from hudson); the
>> httpclient libs used are 3.1 (the very ones that came with that solrj).
>>
>
> Try with a nightly after Oct 5 and see what happens:
> http://svn.apache.org/viewvc?view=rev&revision=582349
>
> If that does not do it, we need to fix something.
>
> ryan
>

Re: CommonsHttpSolrServer and multithread

Posted by Ryan McKinley <ry...@gmail.com>.
> I could build one CommonsHttpSolrServer for each query, or I could build
> just one, put it in a singleton and reuse it.
> 

either way.  Solrj uses MultiThreadedHttpConnectionManager.

> 
> The SolrJ I'm using is dated 2007-09-24 (downloaded from hudson); the
> httpclient libs used are 3.1 (the very ones that came with that solrj).
> 

Try with a nightly after Oct 5 and see what happens:
http://svn.apache.org/viewvc?view=rev&revision=582349

If that does not do it, we need to fix something.

ryan