Posted to commits@cassandra.apache.org by "Sammy Yu (JIRA)" <ji...@apache.org> on 2009/08/26 20:35:59 UTC

[jira] Created: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Deadlock with SelectorManager.doProcess and TcpConnection.write
---------------------------------------------------------------

                 Key: CASSANDRA-392
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-392
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.4
            Reporter: Sammy Yu
            Assignee: Sammy Yu
             Fix For: 0.4


We ran into a deadlock last night:
Name: MESSAGE-SERIALIZER-POOL:2
State: BLOCKED on sun.nio.ch.SelectionKeyImpl@2e257f1b owned by: TCP Selector Manager
Total blocked: 1  Total waited: 1

Stack trace: 
org.apache.cassandra.net.SelectionKeyHandler.turnOnInterestOps(SelectionKeyHandler.java:73)
org.apache.cassandra.net.TcpConnection.write(TcpConnection.java:186)
   - locked org.apache.cassandra.net.TcpConnection@5ab9f791
org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:67)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:619)



Name: TCP Selector Manager
State: BLOCKED on org.apache.cassandra.net.TcpConnection@5ab9f791 owned by: MESSAGE-SERIALIZER-POOL:2
Total blocked: 2  Total waited: 0

Stack trace: 
org.apache.cassandra.net.TcpConnection.connect(TcpConnection.java:360)
org.apache.cassandra.net.SelectorManager.doProcess(SelectorManager.java:131)
   - locked sun.nio.ch.SelectionKeyImpl@2e257f1b
org.apache.cassandra.net.SelectorManager.run(SelectorManager.java:98)


SelectorManager.doProcess acquires the monitor on the SelectionKey and then calls methods such as TcpConnection.connect(SelectionKey key), which acquire the monitor on the TcpConnection object itself.  Another task, e.g. MessageSerializationTask, can come along and call write(Message message), which acquires the monitor on the TcpConnection first and then, in its call to turnOnInterestOps, tries to acquire the monitor on the SelectionKey.  The two threads take the same two monitors in opposite order, which causes the deadlock.
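
The lock-order inversion described above can be reproduced in isolation. The following is only a minimal sketch, not the Cassandra code: the two plain Objects are hypothetical stand-ins for the SelectionKey and the TcpConnection, and the thread names are made up. A latch makes sure each thread holds its first monitor before asking for the second, and the JVM's own deadlock detection (ThreadMXBean) is used instead of reading a thread dump by hand.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-alone demo; none of these names come from Cassandra.
class LockOrderInversionDemo {

    // Returns how many threads the JVM reports as deadlocked on monitors.
    static int countDeadlockedThreads() {
        final Object key = new Object();        // stands in for the SelectionKey
        final Object connection = new Object(); // stands in for the TcpConnection
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Selector path: key monitor first, then connection monitor.
        Thread selector = new Thread(() -> {
            synchronized (key) {
                meet(bothHoldFirst);
                synchronized (connection) { /* entry is never granted */ }
            }
        }, "demo-selector");

        // Serializer path: connection monitor first, then key monitor.
        Thread serializer = new Thread(() -> {
            synchronized (connection) {
                meet(bothHoldFirst);
                synchronized (key) { /* entry is never granted */ }
            }
        }, "demo-serializer");

        // Daemon threads so the stuck pair does not keep the JVM alive.
        selector.setDaemon(true);
        serializer.setDaemon(true);
        selector.start();
        serializer.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 200; i++) {
                long[] ids = mx.findMonitorDeadlockedThreads();
                if (ids != null) {
                    return ids.length;
                }
                Thread.sleep(25);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    // Signal that this thread holds its first monitor, then wait for the other.
    private static void meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

Both demo threads end up BLOCKED on the monitor the other one owns, which is the same shape as the two stack traces above.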




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated CASSANDRA-392:
------------------------------

    Attachment: issue392.patchv2

In SelectorManager.doProcess(), I don't see why we need to synchronize on each selection key any more. Within each of the processing methods, such as connect, read and write, we already synchronize on the selection key through turnOnInterestOps/turnOffInterestOps (which hold the lock on a selection key only briefly).

Attached is a patch that removes the selection key synchronization in SelectorManager.doProcess(). Sammy, could you give it a try and see if it works?
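
To illustrate the idea (this is not the attached patch, just a rough sketch under assumptions): if doProcess no longer wraps the handler call in a lock on the key, both the selector path and the writer path take the connection monitor first and the key monitor second, and only for the short interest-ops update. The class below is hypothetical; the plain Object stands in for the SelectionKey, and which interest op connect() clears is an assumption.

```java
import java.nio.channels.SelectionKey;

// Hypothetical sketch; the class and its fields are not from Cassandra.
class ConsistentOrderSketch {
    private final Object key = new Object(); // stands in for the SelectionKey
    private int interestOps = SelectionKey.OP_CONNECT;
    private int calls = 0;

    // Selector path after the change: no synchronized (key) around the handler.
    void doProcess() {
        connect();
    }

    // Connection monitor first, then (briefly) the key monitor.
    synchronized void connect() {
        calls++;
        turnOffInterestOps(SelectionKey.OP_CONNECT); // assumed op for illustration
    }

    // Writer path: the same order, connection monitor and then key monitor.
    synchronized void write() {
        calls++;
        turnOnInterestOps(SelectionKey.OP_WRITE);
    }

    private void turnOnInterestOps(int ops) {
        synchronized (key) {
            interestOps |= ops;
        }
    }

    private void turnOffInterestOps(int ops) {
        synchronized (key) {
            interestOps &= ~ops;
        }
    }

    synchronized int calls() {
        return calls;
    }

    // Runs both paths concurrently; true if both threads finished in time.
    static boolean runBothPaths(int iterations) {
        ConsistentOrderSketch c = new ConsistentOrderSketch();
        Thread selector = new Thread(() -> {
            for (int i = 0; i < iterations; i++) c.doProcess();
        }, "demo-selector");
        Thread serializer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) c.write();
        }, "demo-serializer");
        selector.setDaemon(true);
        serializer.setDaemon(true);
        selector.start();
        serializer.start();
        try {
            selector.join(10_000);
            serializer.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !selector.isAlive() && !serializer.isAlive()
                && c.calls() == 2 * iterations;
    }

    public static void main(String[] args) {
        System.out.println("both paths finished: " + runBothPaths(100_000));
    }
}
```

Because every thread acquires the two monitors in the same order, no cycle of waiting threads can form between them.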



[jira] Assigned: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-392:
----------------------------------------

    Assignee: Jun Rao  (was: Sammy Yu)



[jira] Commented: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749886#action_12749886 ] 

Hudson commented on CASSANDRA-392:
----------------------------------

Integrated in Cassandra #184 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/184/])
    Deadlock with SelectorManager.doProcess and TcpConnection.write; patch by Sammy Yu and junrao; reviewed by Chris Goffinet for 




[jira] Updated: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammy Yu updated CASSANDRA-392:
-------------------------------

    Attachment: 0001-CASSANDRA-392-Moved-turnOnInterestOps-outside-of-syn.patch

In TcpConnection.write(Message), moved the turnOnInterestOps call outside of the synchronized block so that the call depends only on the SelectionKey monitor.
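
The shape of that change, as a rough sketch rather than the patch itself: the connection's own state is updated under the connection monitor, the monitor is released, and only then is the key monitor taken. Everything here is a hypothetical stand-in (the String message, the pending queue, the plain Object used as the key); Thread.holdsLock is used only to record that the connection monitor is not held at the point where the key monitor is requested.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch; not the actual TcpConnection.
class WriteOutsideLockSketch {
    private final Object key = new Object();               // stands in for the SelectionKey
    private final Queue<String> pending = new ArrayDeque<>(); // assumed write queue
    private int interestOps = 0;
    private boolean heldConnectionLockAtKey = false;

    void write(String message) {
        synchronized (this) {
            // Only connection state is touched under the connection monitor.
            pending.add(message);
        }
        // The connection monitor is released before the key monitor is requested.
        turnOnInterestOps(SelectionKey.OP_WRITE);
    }

    private void turnOnInterestOps(int ops) {
        boolean heldHere = Thread.holdsLock(this);
        synchronized (key) {
            heldConnectionLockAtKey |= heldHere;
            interestOps |= ops;
        }
    }

    boolean writeInterestOn() {
        synchronized (key) {
            return (interestOps & SelectionKey.OP_WRITE) != 0;
        }
    }

    boolean everHeldConnectionLockAtKey() {
        synchronized (key) {
            return heldConnectionLockAtKey;
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        WriteOutsideLockSketch c = new WriteOutsideLockSketch();
        c.write("hello");
        System.out.println("write interest on: " + c.writeInterestOn());
        System.out.println("held connection lock at key: " + c.everHeldConnectionLockAtKey());
    }
}
```

With write() shaped like this, the writer thread never waits for the key monitor while holding the connection monitor, so it cannot be one half of the cycle shown in the thread dump.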





[jira] Commented: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749194#action_12749194 ] 

Chris Goffinet commented on CASSANDRA-392:
------------------------------------------

+1.

We tested this in production and it's working well. Let's ship it.



[jira] Closed: (CASSANDRA-392) Deadlock with SelectorManager.doProcess and TcpConnection.write

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao closed CASSANDRA-392.
-----------------------------

