Posted to solr-dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2009/05/04 22:02:30 UTC

[jira] Created: (SOLR-1144) replication hang

replication hang
----------------

                 Key: SOLR-1144
                 URL: https://issues.apache.org/jira/browse/SOLR-1144
             Project: Solr
          Issue Type: Bug
            Reporter: Yonik Seeley


It seems that replication can sometimes hang.
http://www.lucidimagination.com/search/document/403305a3fda18599


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1144) replication hang

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706857#action_12706857 ] 

Yonik Seeley commented on SOLR-1144:
------------------------------------

I don't see a thread trace from the replication handler.... which one should if it was causing the hang, right?


[jira] Commented: (SOLR-1144) replication hang

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706302#action_12706302 ] 

Noble Paul commented on SOLR-1144:
----------------------------------

Stack trace: http://markmail.org/message/ecr6m4rf4iy2d652

I suspect the following two threads are blocked:

{code}
'NioBlockingSelector.BlockPoller-2' Id=10, RUNNABLE on lock=, total cpu time=5580.0000ms user time=2120.0000ms
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:305)
'NioBlockingSelector.BlockPoller-1' Id=9, RUNNABLE on lock=, total cpu time=333280.0000ms user time=107520.0000ms
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:305)
{code}




[jira] Commented: (SOLR-1144) replication hang

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706868#action_12706868 ] 

Noble Paul commented on SOLR-1144:
----------------------------------

ReplicationHandler does not cause the hang on the master. On the slave, the SnapPuller was waiting forever, which I hope has been fixed by SOLR-1096.


[jira] Commented: (SOLR-1144) replication hang

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706199#action_12706199 ] 

Yonik Seeley commented on SOLR-1144:
------------------------------------

Hmmm, I had trouble finding SOLR-1096 before.
But it looks like it was used mainly for adding a timeout.  There's still an underlying bug somewhere, right?
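
To illustrate the kind of timeout being discussed, here is a minimal sketch in plain Java. It is not the SOLR-1096 patch: the class name, the URL, and the timeout values are made up for illustration, and it uses java.net.HttpURLConnection rather than whatever HTTP client Solr itself uses. With a read timeout set, a stalled read fails with a SocketTimeoutException instead of waiting forever, which hides the symptom but does not explain why the data stopped arriving.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class TimeoutSketch {
    /** Prepares (but does not yet open) an HTTP connection with both timeouts set. */
    static HttpURLConnection open(String url, int connectMillis, int readMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connectMillis); // give up if the TCP connect stalls
            conn.setReadTimeout(readMillis);       // give up if a read() stalls
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical master URL, not taken from this thread.
        HttpURLConnection conn = open("http://localhost:8983/solr/replication", 5000, 10000);
        System.out.println("read timeout ms = " + conn.getReadTimeout());
    }
}
```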


[jira] Assigned: (SOLR-1144) replication hang

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul reassigned SOLR-1144:
--------------------------------

    Assignee: Noble Paul


[jira] Updated: (SOLR-1144) replication hang

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-1144:
-------------------------------

    Fix Version/s: 1.4


[jira] Commented: (SOLR-1144) replication hang

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705891#action_12705891 ] 

Noble Paul commented on SOLR-1144:
----------------------------------

Isn't this the same as SOLR-1096?


[jira] Commented: (SOLR-1144) replication hang

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707425#action_12707425 ] 

Yonik Seeley commented on SOLR-1144:
------------------------------------

bq. The master closes the connection if everything is written. 

Hmmm, that doesn't jibe with the slave hanging on a read though... seems like the only way read() should block is if there is no more data to read currently and the socket is still open.
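
The read() behaviour described here can be reproduced with a small loopback experiment. This is only a sketch under stated assumptions: the class and method names are hypothetical, plain java.net sockets stand in for the master/slave HTTP connection, and a 500 ms SO_TIMEOUT stands in for "waits forever".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ReadBlockingSketch {
    /**
     * The sender writes one byte, then either closes its end or leaves it open.
     * Returns what the reader observes on its second read: "EOF", "DATA" or "BLOCKED".
     */
    static String secondRead(boolean senderCloses) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket reader = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket sender = server.accept()) {
            sender.getOutputStream().write(42);
            sender.getOutputStream().flush();
            if (senderCloses) {
                sender.close(); // peer sees end-of-stream after the pending byte
            }
            reader.setSoTimeout(500); // without this, the blocked case would never return
            InputStream in = reader.getInputStream();
            in.read(); // consume the single byte that was sent
            try {
                return in.read() == -1 ? "EOF" : "DATA";
            } catch (SocketTimeoutException e) {
                return "BLOCKED"; // socket still open, nothing left to read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sender closed:    " + secondRead(true));
        System.out.println("sender kept open: " + secondRead(false));
    }
}
```

When the sender closes its end, the second read returns -1 immediately; when the sender keeps the socket open, the second read blocks until the timeout fires.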


[jira] Resolved: (SOLR-1144) replication hang

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul resolved SOLR-1144.
------------------------------

    Resolution: Fixed

Resolving for the time being. We can reopen if the issue is reported again.




[jira] Commented: (SOLR-1144) replication hang

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707405#action_12707405 ] 

Yonik Seeley commented on SOLR-1144:
------------------------------------

bq. ReplicationHandler does not cause the hang on the master.

The slave is waiting forever, but it *could* be due to a bug on either the master or the slave, and it could be due to the replication handler.  It could also be another Solr bug somewhere, or it could be a Tomcat bug.

What is apparent is that since there is no replication stack trace on the master, it thinks it finished the file send (either that or got an exception), but the slave is still expecting more for some reason.  Perhaps if we used non-persistent connections for replication, the master would close the connection when it thought it had sent everything?



[jira] Commented: (SOLR-1144) replication hang

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707421#action_12707421 ] 

Noble Paul commented on SOLR-1144:
----------------------------------

The master closes the connection if everything is written. If the download of a file is complete, the slave also closes the stream. The fact that the slave continued to wait means the file has not been downloaded completely.
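
One way a receiver can tell a completed transfer from a truncated one is to compare the bytes received against a length announced up front. The following is a minimal sketch, assuming the sender advertises the file size; the class and method names are hypothetical and this is not the SnapPuller code.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class LengthCheckSketch {
    /** Reads exactly {@code expected} bytes; throws EOFException if the stream ends early. */
    static long readExactly(InputStream in, long expected) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        while (total < expected) {
            int want = (int) Math.min(buf.length, expected - total);
            int n = in.read(buf, 0, want);
            if (n == -1) {
                // Stream closed before the announced length arrived.
                throw new EOFException("got " + total + " of " + expected + " bytes");
            }
            total += n;
        }
        return total;
    }

    /** Convenience wrapper for in-memory data: true if complete, false if truncated. */
    static boolean isComplete(byte[] received, long expected) {
        try {
            readExactly(new ByteArrayInputStream(received), expected);
            return true;
        } catch (EOFException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("10 of 10 bytes complete? " + isComplete(new byte[10], 10));
        System.out.println(" 7 of 10 bytes complete? " + isComplete(new byte[7], 10));
    }
}
```

A check like this only detects truncation once the stream has ended. If the sender stops writing but the socket stays open, the read still blocks, so it would normally be combined with a read timeout.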
