Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2012/03/27 01:33:44 UTC

StreamingUpdateSolrServer - exceptions not propagated

I've been building a new version of my app that keeps our Solr indexes 
up to date.  I had hoped to use StreamingUpdateSolrServer instead of 
CommonsHttpSolrServer for performance reasons, but I have run into a 
showstopper problem that has made me revert to CHSS.

I have been relying on exception handling to detect when there is any 
kind of problem with any request sent to Solr.  Looking at the code for 
SUSS, it seems that any exceptions thrown by lower level code are simply 
logged, then forgotten as if they had never happened.

So far I have not been able to decipher how things actually work, so I 
can't tell if it would be possible to propagate the exception back up 
into my code.

Questions for the experts: Would such propagation be possible without 
compromising performance?  Is this a bug?  Can I somehow detect the 
failure and throw an exception of my own?

For reference, here is the exception that gets logged, but not actually 
thrown:

java.net.ConnectException: Connection refused
         at java.net.PlainSocketImpl.socketConnect(Native Method)
         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
         at java.net.Socket.connect(Socket.java:579)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:601)
         at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
         at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
         at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
         at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
         at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
         at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
         at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
         at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
         at org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:154)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
         at java.lang.Thread.run(Thread.java:722)

Thanks,
Shawn


Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Mark Miller <ma...@gmail.com>.
Like I said, you have to extend the class and override the error method. 

On Mar 27, 2012, at 2:29 AM, Shawn Heisey <so...@elyograg.org> wrote:

> On 3/26/2012 10:25 PM, Shawn Heisey wrote:
>> The problem is that I currently have no way (that I know of so far) to detect that a problem happened.  As far as my code is concerned, everything worked, so it updates my position tracking and those documents will never be inserted.  I have not yet delved into the response object to see whether it can tell me anything.  My code currently assumes that if no exception was thrown, it was successful.  This works with CHSS.  I will write some test code that tries out various error situations and see what the response contains.
> 
> I've written some test code.  When doing an add with SUSS against a server that's down, no exception is thrown.  It does throw one for query and deleteByQuery.  When doing the add test with CHSS, an exception is thrown.  I guess I'll just have to use CHSS until this gets fixed, assuming it ever does.  Would it be at all helpful to file an issue in jira, or has one already been filed?  With a quick search, I could not find one.
> 
> Thanks,
> Shawn
> 

Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/26/2012 10:25 PM, Shawn Heisey wrote:
> The problem is that I currently have no way (that I know of so far) to 
> detect that a problem happened.  As far as my code is concerned, 
> everything worked, so it updates my position tracking and those 
> documents will never be inserted.  I have not yet delved into the 
> response object to see whether it can tell me anything.  My code 
> currently assumes that if no exception was thrown, it was successful.  
> This works with CHSS.  I will write some test code that tries out 
> various error situations and see what the response contains.

I've written some test code.  When doing an add with SUSS against a 
server that's down, no exception is thrown.  It does throw one for query 
and deleteByQuery.  When doing the add test with CHSS, an exception is 
thrown.  I guess I'll just have to use CHSS until this gets fixed, 
assuming it ever does.  Would it be at all helpful to file an issue in 
jira, or has one already been filed?  With a quick search, I could not 
find one.

Thanks,
Shawn


Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/26/2012 6:43 PM, Mark Miller wrote:
> It doesn't get thrown because that logic needs to continue - you don't necessarily want one bad document to stop all the following documents from being added. So the exception is sent to that method with the idea that you can override and do what you would like. I've written sample code around stopping and throwing an exception, but I guess it's not totally trivial. Other ideas for reporting errors have been thrown around in the past, but no work on it has gotten any traction.
>
>
> - Mark Miller
> lucidimagination.com
>
> On Mar 26, 2012, at 7:33 PM, Shawn Heisey wrote:
>
>> I've been building a new version of my app that keeps our Solr indexes up to date.  I had hoped to use StreamingUpdateSolrServer instead of CommonsHttpSolrServer for performance reasons, but I have run into a showstopper problem that has made me revert to CHSS.
>>
>> I have been relying on exception handling to detect when there is any kind of problem with any request sent to Solr.  Looking at the code for SUSS, it seems that any exceptions thrown by lower level code are simply logged, then forgotten as if they had never happened.

The problem is that I currently have no way (that I know of so far) to 
detect that a problem happened.  As far as my code is concerned, 
everything worked, so it updates my position tracking and those 
documents will never be inserted.  I have not yet delved into the 
response object to see whether it can tell me anything.  My code 
currently assumes that if no exception was thrown, it was successful.  
This works with CHSS.  I will write some test code that tries out 
various error situations and see what the response contains.

Thanks,
Shawn


Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Erick Erickson <er...@gmail.com>.
https://issues.apache.org/jira/browse/SOLR-445

This JIRA reflects the slightly different case of wanting better
reporting of *which* document failed in a multi-document packet; it
doesn't specifically address SUSS. But it might serve to give you some
ideas if you tackle this.

On Tue, Mar 27, 2012 at 11:14 AM, Mark Miller <ma...@gmail.com> wrote:
>
> On Mar 27, 2012, at 10:51 AM, Shawn Heisey wrote:
>
>> On 3/26/2012 6:43 PM, Mark Miller wrote:
>>> It doesn't get thrown because that logic needs to continue - you don't necessarily want one bad document to stop all the following documents from being added. So the exception is sent to that method with the idea that you can override and do what you would like. I've written sample code around stopping and throwing an exception, but I guess it's not totally trivial. Other ideas for reporting errors have been thrown around in the past, but no work on it has gotten any traction.
>>
>> It looks like StreamingUpdateSolrServer is not meant for situations where strict error checking is required.  I think the documentation should reflect that.  Would you be opposed to a javadoc update at the class level (plus a wiki addition) like the following? "Because document inserts are handled as background tasks, exceptions and errors that occur during those operations will not be available to the calling program, but they will be logged.  For example, if the Solr server is down, your program must determine this on its own.  If you need strict error handling, use CommonsHttpSolrServer."  If my wording is bad, feel free to make suggestions.
>>
>> If I'm wrong and you do have an example of an error handling override that would do what I need, I would love to see it.  From what I can tell, add requests are pushed down and handled by Runner threads, completely disconnected from the request.  The response to add calls always seems to be a NOTE element saying "the request is processed in a background stream", even if successful.
>>
>> Thanks,
>> Shawn
>>
>
>
> I'm not saying what it's meant for, I'm just saying what it is. Currently, the only thing you can do to check for errors is override that method. I understand it's still somewhat limiting - it depends on your use case how well it can work. For example, I've known people that just want to stop the update process if a doc fails, and throw an exception. You can write code to do that by extending the class and overriding handleError. You can also collect the exceptions, count the failures, read and parse any error messages, etc. It doesn't help you with an ID or anything though - unless you get unlucky/lucky and can parse it out of error messages (if it's even in them). It might be more useful if you could set the name of an id field for it to look for and perhaps also dump to that method.
>
> There have been previous conversations about improving error reporting for this SolrServer, but no work has ever really gotten off the ground. There may be existing JIRA issues around this topic - certainly there are previous email threads.
>
> All in all though, please, make all the suggestions and JIRA issues you want. Javadoc improvements can be submitted as patches through JIRA as well. Also, the Wiki is open to anyone to update.
>
> - Mark Miller
> lucidimagination.com

Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Mike Sokolov <so...@ifactory.com>.
On 3/27/2012 11:14 AM, Mark Miller wrote:
> On Mar 27, 2012, at 10:51 AM, Shawn Heisey wrote:
>
>> On 3/26/2012 6:43 PM, Mark Miller wrote:
>>> It doesn't get thrown because that logic needs to continue - you don't necessarily want one bad document to stop all the following documents from being added. So the exception is sent to that method with the idea that you can override and do what you would like. I've written sample code around stopping and throwing an exception, but I guess it's not totally trivial. Other ideas for reporting errors have been thrown around in the past, but no work on it has gotten any traction.
>> It looks like StreamingUpdateSolrServer is not meant for situations where strict error checking is required.  I think the documentation should reflect that.  Would you be opposed to a javadoc update at the class level (plus a wiki addition) like the following? "Because document inserts are handled as background tasks, exceptions and errors that occur during those operations will not be available to the calling program, but they will be logged.  For example, if the Solr server is down, your program must determine this on its own.  If you need strict error handling, use CommonsHttpSolrServer."  If my wording is bad, feel free to make suggestions.
>>
It might make sense to accumulate the errors in a fixed-size queue and
report them either when the queue fills up or when the client commits
(assuming the commit will wait for all outstanding inserts to complete
or fail).  This is what we do client-side when performing multi-threaded
inserts.  Sounds great in theory, I think, but then I haven't delved
into SUSS at all ... just a suggestion, take it or leave it.  Actually I
wonder whether SUSS is necessary if you do the threading client-side?
You might get a similar perf gain; I know we see a substantial speedup
that way, because then your updates spawn multiple threads in the
server anyway, don't they?
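
A rough sketch of that idea, for illustration only.  Everything here is
hypothetical: ThreadedInserter, its sender callback (standing in for
whatever synchronous add call the client makes) and the choice of
IllegalStateException are assumptions of the sketch, not SolrJ API and
not the code we actually run.  Errors go into a bounded queue; add()
refuses new work once the queue is full, and commit() waits for
outstanding inserts before reporting whatever was collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical client-side inserter; "sender" stands in for a synchronous add call. */
class ThreadedInserter {
    private final ExecutorService pool;
    private final BlockingQueue<Exception> errors;
    private final List<Future<?>> pending = new ArrayList<>();
    private final Consumer<String> sender;

    ThreadedInserter(int threads, int maxErrors, Consumer<String> sender) {
        // Daemon threads so an idle pool never keeps the JVM alive.
        this.pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        this.errors = new ArrayBlockingQueue<>(maxErrors);
        this.sender = sender;
    }

    /** Queues one insert; reports collected errors once the error queue is full. */
    public synchronized void add(String doc) {
        if (errors.remainingCapacity() == 0) {
            throw failure();
        }
        pending.add(pool.submit(() -> {
            try {
                sender.accept(doc);
            } catch (RuntimeException e) {
                errors.offer(e); // dropped if the queue filled up in the meantime
            }
        }));
    }

    /** Waits for all outstanding inserts, then reports any collected errors. */
    public synchronized void commit() {
        for (Future<?> f : pending) {
            try {
                f.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted during commit", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        pending.clear();
        if (!errors.isEmpty()) {
            throw failure();
        }
    }

    /** Drains the queue into one exception carrying each error as suppressed. */
    private IllegalStateException failure() {
        List<Exception> drained = new ArrayList<>();
        errors.drainTo(drained);
        IllegalStateException ex =
                new IllegalStateException(drained.size() + " insert(s) failed");
        for (Exception e : drained) {
            ex.addSuppressed(e);
        }
        return ex;
    }
}
```

The caller would catch the exception from add() or commit() and look at
getSuppressed() to decide what to retry.  Note that once the queue is
full, further errors are dropped, so the count is a lower bound.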

- Mike

Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Mark Miller <ma...@gmail.com>.
On Mar 27, 2012, at 10:51 AM, Shawn Heisey wrote:

> On 3/26/2012 6:43 PM, Mark Miller wrote:
>> It doesn't get thrown because that logic needs to continue - you don't necessarily want one bad document to stop all the following documents from being added. So the exception is sent to that method with the idea that you can override and do what you would like. I've written sample code around stopping and throwing an exception, but I guess it's not totally trivial. Other ideas for reporting errors have been thrown around in the past, but no work on it has gotten any traction.
> 
> It looks like StreamingUpdateSolrServer is not meant for situations where strict error checking is required.  I think the documentation should reflect that.  Would you be opposed to a javadoc update at the class level (plus a wiki addition) like the following? "Because document inserts are handled as background tasks, exceptions and errors that occur during those operations will not be available to the calling program, but they will be logged.  For example, if the Solr server is down, your program must determine this on its own.  If you need strict error handling, use CommonsHttpSolrServer."  If my wording is bad, feel free to make suggestions.
> 
> If I'm wrong and you do have an example of an error handling override that would do what I need, I would love to see it.  From what I can tell, add requests are pushed down and handled by Runner threads, completely disconnected from the request.  The response to add calls always seems to be a NOTE element saying "the request is processed in a background stream", even if successful.
> 
> Thanks,
> Shawn
> 


I'm not saying what it's meant for, I'm just saying what it is. Currently, the only thing you can do to check for errors is override that method. I understand it's still somewhat limiting - it depends on your use case how well it can work. For example, I've known people that just want to stop the update process if a doc fails, and throw an exception. You can write code to do that by extending the class and overriding handleError. You can also collect the exceptions, count the failures, read and parse any error messages, etc. It doesn't help you with an ID or anything though - unless you get unlucky/lucky and can parse it out of error messages (if it's even in them). It might be more useful if you could set the name of an id field for it to look for and perhaps also dump to that method.
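
To make the collect-and-count idea concrete, here is a small sketch of a helper that an overridden handleError could delegate to. ErrorCollector is a hypothetical class, not part of SolrJ, and the "id=..." pattern is an invented message format used only to show the best-effort parsing; real error messages may not contain an id at all.

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: gathers whatever is passed to an error hook. */
class ErrorCollector {
    // Invented format for illustration; real messages may carry no id.
    private static final Pattern ID_IN_MESSAGE = Pattern.compile("id=([A-Za-z0-9_-]+)");

    // Thread-safe, because the hook is called from background threads.
    private final Queue<Throwable> errors = new ConcurrentLinkedQueue<>();

    /** Call this from an overridden handleError(Throwable). */
    public void handleError(Throwable ex) {
        errors.add(ex);
    }

    public int failureCount() {
        return errors.size();
    }

    /** Snapshot of everything collected so far. */
    public List<Throwable> errors() {
        return new ArrayList<>(errors);
    }

    /** Best effort: ids that happen to appear in the error messages. */
    public List<String> parsedIds() {
        List<String> ids = new ArrayList<>();
        for (Throwable t : errors) {
            String msg = t.getMessage();
            if (msg == null) {
                continue;
            }
            Matcher m = ID_IN_MESSAGE.matcher(msg);
            if (m.find()) {
                ids.add(m.group(1));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        ErrorCollector collector = new ErrorCollector();
        collector.handleError(new IllegalArgumentException("rejected id=doc-7: bad date"));
        collector.handleError(new ConnectException("Connection refused"));
        System.out.println(collector.failureCount() + " failures, ids: " + collector.parsedIds());
    }
}
```

A connection failure like the one in the original stack trace yields a count but no id, which is the limitation described above.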

There have been previous conversations about improving error reporting for this SolrServer, but no work has ever really gotten off the ground. There may be existing JIRA issues around this topic - certainly there are previous email threads.

All in all though, please, make all the suggestions and JIRA issues you want. Javadoc improvements can be submitted as patches through JIRA as well. Also, the Wiki is open to anyone to update.

- Mark Miller
lucidimagination.com


Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/26/2012 6:43 PM, Mark Miller wrote:
> It doesn't get thrown because that logic needs to continue - you don't necessarily want one bad document to stop all the following documents from being added. So the exception is sent to that method with the idea that you can override and do what you would like. I've written sample code around stopping and throwing an exception, but I guess it's not totally trivial. Other ideas for reporting errors have been thrown around in the past, but no work on it has gotten any traction.

It looks like StreamingUpdateSolrServer is not meant for situations 
where strict error checking is required.  I think the documentation 
should reflect that.  Would you be opposed to a javadoc update at the 
class level (plus a wiki addition) like the following? "Because document 
inserts are handled as background tasks, exceptions and errors that 
occur during those operations will not be available to the calling 
program, but they will be logged.  For example, if the Solr server is 
down, your program must determine this on its own.  If you need strict 
error handling, use CommonsHttpSolrServer."  If my wording is bad, feel 
free to make suggestions.

If I'm wrong and you do have an example of an error handling override 
that would do what I need, I would love to see it.  From what I can 
tell, add requests are pushed down and handled by Runner threads, 
completely disconnected from the request.  The response to add calls 
always seems to be a NOTE element saying "the request is processed in a 
background stream", even if successful.
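
To illustrate what I mean by "disconnected", here is a generic
java.util.concurrent sketch.  It is not SUSS code: send() is a
hypothetical stand-in that always fails the way my downed server does,
and the class and method names are made up.  A task handed to a worker
thread can fail without the submitting thread ever seeing it, unless
the caller waits on a Future.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BackgroundFailureDemo {
    /** Counts failures that were only "logged" on the worker thread. */
    static final AtomicInteger loggedFailures = new AtomicInteger();

    /** Hypothetical stand-in for a background send that always fails. */
    static void send() throws ConnectException {
        throw new ConnectException("Connection refused");
    }

    /** Fire-and-forget: the worker records the failure; the caller sees nothing. */
    static boolean queuedSendThrows() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.execute(() -> {
                try {
                    send();
                } catch (ConnectException e) {
                    loggedFailures.incrementAndGet(); // logged, then forgotten
                }
            });
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return false; // the caller's thread finished without any exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Waiting on the Future brings the worker's failure back to the caller. */
    static boolean awaitedSendThrows() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Void> task = () -> {
                send();
                return null;
            };
            Future<Void> result = pool.submit(task);
            result.get(); // blocks, then rethrows the failure wrapped in ExecutionException
            return false;
        } catch (ExecutionException e) {
            return e.getCause() instanceof ConnectException;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("queued send throws:  " + queuedSendThrows());
        System.out.println("awaited send throws: " + awaitedSendThrows());
    }
}
```

The second variant is of course the blocking behavior that a streaming
client is trying to avoid, which is why I asked whether propagation is
possible without compromising performance.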

Thanks,
Shawn


Re: StreamingUpdateSolrServer - exceptions not propagated

Posted by Mark Miller <ma...@gmail.com>.
It doesn't get thrown because that logic needs to continue - you don't necessarily want one bad document to stop all the following documents from being added. So the exception is sent to that method with the idea that you can override and do what you would like. I've written sample code around stopping and throwing an exception, but I guess it's not totally trivial. Other ideas for reporting errors have been thrown around in the past, but no work on it has gotten any traction.
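
As a rough illustration of the stop-and-throw approach (this is not the sample code mentioned above), here is a self-contained sketch. QueueingClient is a hypothetical stand-in for a streaming client, not the SolrJ class: its send() just rejects documents containing "bad", and its add/handleError/blockUntilFinished methods are assumptions of the sketch. FailFastClient shows the override: remember the first error, refuse further adds, and rethrow on the caller's thread.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a streaming client; this is not the SolrJ class. */
class QueueingClient {
    // One daemon worker plays the role of the background Runner.
    private final ExecutorService runner = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "runner");
        t.setDaemon(true);
        return t;
    });

    /** Queues a document and returns at once; failures go to handleError. */
    public void add(String doc) throws IOException {
        runner.execute(() -> {
            try {
                send(doc);
            } catch (IOException e) {
                handleError(e);
            }
        });
    }

    /** Pretend transport: rejects any document whose text contains "bad". */
    protected void send(String doc) throws IOException {
        if (doc.contains("bad")) {
            throw new IOException("rejected: " + doc);
        }
    }

    /** Default hook: log and carry on. */
    public void handleError(Throwable ex) {
        System.err.println("error: " + ex);
    }

    /** Waits until everything queued so far has been processed. */
    public void blockUntilFinished() throws IOException {
        try {
            // Single worker, FIFO queue: this marker runs after all earlier adds.
            runner.submit(() -> { }).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IOException(e.getCause());
        }
    }
}

/** Remembers the first background failure and rethrows it on the caller's thread. */
class FailFastClient extends QueueingClient {
    private final AtomicReference<Throwable> firstError = new AtomicReference<>();

    @Override
    public void handleError(Throwable ex) {
        firstError.compareAndSet(null, ex); // keep only the first failure
    }

    @Override
    public void add(String doc) throws IOException {
        throwIfFailed(); // stop accepting work once something has failed
        super.add(doc);
    }

    @Override
    public void blockUntilFinished() throws IOException {
        super.blockUntilFinished();
        throwIfFailed();
    }

    private void throwIfFailed() throws IOException {
        Throwable t = firstError.get();
        if (t != null) {
            throw new IOException("background update failed", t);
        }
    }

    public static void main(String[] args) {
        FailFastClient client = new FailFastClient();
        try {
            client.add("good-1");
            client.add("bad-2");
            client.blockUntilFinished();
        } catch (IOException e) {
            System.out.println("caught: " + e.getCause().getMessage());
        }
    }
}
```

The non-trivial part shows up even in this toy: the failure only reaches the caller on the next add or when it blocks, and documents already queued behind the failing one are still sent.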


- Mark Miller
lucidimagination.com

On Mar 26, 2012, at 7:33 PM, Shawn Heisey wrote:

> I've been building a new version of my app that keeps our Solr indexes up to date.  I had hoped to use StreamingUpdateSolrServer instead of CommonsHttpSolrServer for performance reasons, but I have run into a showstopper problem that has made me revert to CHSS.
> 
> I have been relying on exception handling to detect when there is any kind of problem with any request sent to Solr.  Looking at the code for SUSS, it seems that any exceptions thrown by lower level code are simply logged, then forgotten as if they had never happened.
> 
> So far I have not been able to decipher how things actually work, so I can't tell if it would be possible to propagate the exception back up into my code.
> 
> Questions for the experts: Would such propagation be possible without compromising performance?  Is this a bug?  Can I somehow detect the failure and throw an exception of my own?
> 
> For reference, here is the exception that gets logged, but not actually thrown:
> 
> java.net.ConnectException: Connection refused
>        at java.net.PlainSocketImpl.socketConnect(Native Method)
>        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
>        at java.net.Socket.connect(Socket.java:579)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:601)
>        at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
>        at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
>        at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>        at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
>        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>        at org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:154)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>        at java.lang.Thread.run(Thread.java:722)
> 
> Thanks,
> Shawn
>