Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2011/04/25 19:21:03 UTC

[jira] [Created] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Move gossip heartbeats [back] to its own thread
-----------------------------------------------

                 Key: CASSANDRA-2554
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2554
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.7.0
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
             Fix For: 0.7.6


Gossip heartbeat *really* needs to run every 1s or other nodes may mark us down. But gossip currently shares an executor thread with other tasks.

At least two of these tasks can block: hint cleanup post-delivery and flush-expired-memtables. Both call forceFlush, which blocks if the flush queue and flush threads are full.

We've run into this before (CASSANDRA-2253); we should move Gossip back to its own dedicated executor or it will keep happening whenever someone accidentally puts something on the "shared" executor that can block.
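
The starvation described above can be reproduced outside Cassandra. The following is a minimal sketch, not the attached patch: the class and method names are hypothetical, a latch stands in for a forceFlush call that blocks, and the heartbeat interval is scaled down from 1s to 50ms so the demo finishes quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; none of these names come from the Cassandra code base.
class GossipExecutorSketch {
    // Scaled down from the 1s gossip interval (assumption, for a fast demo).
    static final long HEARTBEAT_INTERVAL_MS = 50;

    /**
     * Returns true if at least three heartbeats ran within one second while a
     * blocking task (standing in for forceFlush) occupied the shared executor.
     */
    static boolean heartbeatsRunWhileBlocked(boolean dedicatedGossipExecutor) {
        ScheduledExecutorService shared = Executors.newSingleThreadScheduledExecutor();
        // Either give gossip its own thread or reuse the shared one.
        ScheduledExecutorService gossip = dedicatedGossipExecutor
                ? Executors.newSingleThreadScheduledExecutor()
                : shared;
        CountDownLatch blockerStarted = new CountDownLatch(1);
        CountDownLatch releaseBlocker = new CountDownLatch(1);
        CountDownLatch heartbeats = new CountDownLatch(3);
        try {
            // The blocking task holds the shared thread until it is released.
            shared.execute(() -> {
                blockerStarted.countDown();
                try {
                    releaseBlocker.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            blockerStarted.await();
            // Schedule the periodic heartbeat only after the blocker is running.
            gossip.scheduleWithFixedDelay(heartbeats::countDown, 0,
                    HEARTBEAT_INTERVAL_MS, TimeUnit.MILLISECONDS);
            return heartbeats.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            releaseBlocker.countDown();
            shared.shutdownNow();
            gossip.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared executor:    " + heartbeatsRunWhileBlocked(false));
        System.out.println("dedicated executor: " + heartbeatsRunWhileBlocked(true));
    }
}
```

With the shared executor the heartbeat never gets a turn while the blocking task holds the only thread, so the call times out and returns false. With a dedicated executor the heartbeats keep firing regardless of what the shared executor is doing.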

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2554:
--------------------------------------

    Attachment: 2554-0.8.txt

Patches against 0.7 and 0.8 that move Gossip to its own executor, and move hint deletion, flush-expired-memtables, and cache saving to the long-execution-time executor.

[jira] [Updated] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2554:
--------------------------------------

    Attachment: 2554-0.7.txt

[jira] [Commented] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025183#comment-13025183 ] 

Sylvain Lebresne commented on CASSANDRA-2554:
---------------------------------------------

+1

[jira] [Updated] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2554:
--------------------------------------

    Attachment: 2554-0.7.txt

[jira] [Updated] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2554:
--------------------------------------

    Attachment:     (was: 2554-0.7.txt)

[jira] [Commented] (CASSANDRA-2554) Move gossip heartbeats [back] to its own thread

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025343#comment-13025343 ] 

Hudson commented on CASSANDRA-2554:
-----------------------------------

Integrated in Cassandra-0.7 #458 (See [https://builds.apache.org/hudson/job/Cassandra-0.7/458/])
    move gossip heartbeat back to its own thread
patch by jbellis; reviewed by slebresne for CASSANDRA-2554

