Posted to dev@lucene.apache.org by "Russell Teabeault (JIRA)" <ji...@apache.org> on 2011/01/07 18:35:45 UTC

[jira] Issue Comment Edited: (SOLR-2308) Race condition still exists in StreamingUpdateSolrServer which could cause it to hang

    [ https://issues.apache.org/jira/browse/SOLR-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978877#action_12978877 ] 

Russell Teabeault edited comment on SOLR-2308 at 1/7/11 12:35 PM:
------------------------------------------------------------------

It seems that this condition happens when HttpClient.executeMethod(method) is called in the Runner and throws an exception.  The runner is then removed from the runners queue.  If this happens in quick succession for all of the runners while the request queue is full, callers of the request method block when they try to add a request to the blocking queue.

Would another possible solution be to add a catch for the try block where HttpClient.executeMethod(method) is called?

{code}
    // ... inside the existing try block around executeMethod in the Runner ...
    int statusCode = getHttpClient().executeMethod(method);
    if (statusCode != HttpStatus.SC_OK) {
        StringBuilder msg = new StringBuilder();
        msg.append( method.getStatusLine().getReasonPhrase() );
        msg.append( "\n\n" );
        msg.append( method.getStatusText() );
        msg.append( "\n\n" );
        msg.append( "request: "+method.getURI() );
        handleError( new Exception( msg.toString() ) );
    }
}  /** catch added here */
catch (Exception e) {
    // report the failure instead of letting it end the runner
    handleError( e );
} finally {
    try {
        // make sure to release the connection
        if (method != null)
            method.releaseConnection();
    }
    catch( Exception ex ){}
}
{code}
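
As a rough illustration of why the extra catch might help, here is a small self-contained sketch. It is not the StreamingUpdateSolrServer code: the class CatchingRunnerSketch, the Sender interface (a stand-in for HttpClient.executeMethod) and the drain method are all hypothetical names made up for this example. The only point it tries to show is that when the failure is caught inside the loop, the runner keeps emptying the queue instead of exiting with the queue still full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for the Runner loop; not the actual Solr class. */
class CatchingRunnerSketch {

    /** Hypothetical stand-in for the call to HttpClient.executeMethod(method). */
    interface Sender {
        void send(String request) throws Exception;
    }

    final BlockingQueue<String> queue;
    final List<Exception> errors = new ArrayList<>();

    CatchingRunnerSketch(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Plays the role of handleError(e): record the failure and carry on. */
    void handleError(Exception e) {
        errors.add(e);
    }

    /** Drains the queue; a failed send is reported instead of ending the loop. */
    int drain(Sender sender) {
        int attempted = 0;
        String request;
        while ((request = queue.poll()) != null) {
            attempted++;
            try {
                sender.send(request);
            } catch (Exception e) {
                // the catch proposed above: the runner survives the failure
                handleError(e);
            }
        }
        return attempted;
    }

    public static void main(String[] args) {
        CatchingRunnerSketch runner = new CatchingRunnerSketch(3);
        runner.queue.add("doc1");
        runner.queue.add("doc2");
        runner.queue.add("doc3");
        // every send fails, as it would with a broken pipe
        int attempted = runner.drain(r -> { throw new java.io.IOException("Broken pipe"); });
        System.out.println(attempted + " attempted, " + runner.errors.size()
                + " errors, " + runner.queue.remainingCapacity() + " free slots");
    }
}
```

In this sketch the failed requests are dropped after being reported, so producers waiting on the queue get room again. Whether dropping is acceptable in the real client is a separate question that the sketch does not try to answer.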



  
> Race condition still exists in StreamingUpdateSolrServer which could cause it to hang
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-2308
>                 URL: https://issues.apache.org/jira/browse/SOLR-2308
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 1.4.1
>            Reporter: Johannes
>
> We are still seeing the same issue as SOLR-1711 & SOLR-1885 with Solr 1.4.1.
> We get into this situation when all the runner threads die due to a broken pipe while the BlockingQueue is still full. All of the producer threads are then blocked on the BlockingQueue.put() method. Since the runners are spawned by the producers, which are all blocked, no runner threads are ever created to drain the queue.
> Here's a potential fix. In the runner code, replace these lines:
> {code}
> // remove it from the list of running things...
> synchronized (runners) {
>     runners.remove( this );
> }
> {code}
> with these lines:
> {code}
> // remove it from the list of running things unless we are the last runner and the queue is full...
> synchronized (runners) {
>     if (runners.size() == 1 && queue.remainingCapacity() == 0) {
>         // keep this runner alive
>         scheduler.execute(this);
>     } else {
>         runners.remove( this );
>     }
> }
> {code}
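
For reference, the situation described in the issue and the condition used in the quoted fix can be tried out in isolation with a sketch like the one below. It is an assumption-laden toy, not Solr code: LastRunnerSketch, mustStayAlive and tryEnqueue are hypothetical names, the runners are plain placeholder objects, and a timed offer() is used in place of put() so that the example returns instead of hanging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of the queue/runner bookkeeping; not the actual Solr class. */
class LastRunnerSketch {
    final BlockingQueue<String> queue;
    final List<Object> runners = new ArrayList<>();

    LastRunnerSketch(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Same test as in the quoted fix: last runner left and no room in the queue. */
    boolean mustStayAlive() {
        synchronized (runners) {
            return runners.size() == 1 && queue.remainingCapacity() == 0;
        }
    }

    /** Timed stand-in for queue.put(): returns false where put() would keep blocking. */
    boolean tryEnqueue(String request, long timeoutMillis) {
        try {
            return queue.offer(request, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        LastRunnerSketch s = new LastRunnerSketch(1);
        s.runners.add(new Object());   // one runner registered
        s.queue.add("doc1");           // queue is now full
        System.out.println("last runner must stay alive: " + s.mustStayAlive());
        System.out.println("producer could enqueue: " + s.tryEnqueue("doc2", 50));
    }
}
```

With a full queue and nothing draining it, tryEnqueue times out, which is where a real put() would block forever. mustStayAlive() is true in exactly that state, so a runner that checks it before removing itself would not leave the producers stranded.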
