Posted to commits@cassandra.apache.org by "Chris Goffinet (JIRA)" <ji...@apache.org> on 2010/01/20 09:27:58 UTC

[jira] Created: (CASSANDRA-724) Insert/Get Contention

Insert/Get Contention
---------------------

                 Key: CASSANDRA-724
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-724
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.6
            Reporter: Chris Goffinet
         Attachments: test_case.py

We tried out the socket IO patch in CASSANDRA-705 and tested the latest JVM (b18 for 1.6), but we are still seeing very strange insert times. We see this with get_slices as well, but it's easy to reproduce with batch_insert. I wonder if it's related to Memtable contention; the slow times are pretty easy to see when you restart the attached test script. We are running this on a 7-node cluster at <1% CPU, with a Consistency Level of 1.

Results
---------------------
Slow insert test.10882 0.203548192978
Slow insert test.18005 0.203876972198
Slow insert test.21154 0.204496860504
Slow insert test.22054 0.0444049835205
Slow insert test.26445 0.201545000076
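
The attached test_case.py is not reproduced in this message. As a rough illustration only, a timing loop that prints lines in the shape shown above might look like the sketch below. The names find_slow_inserts and fake_insert and the 0.04 s threshold are assumptions made for this example; fake_insert just sleeps and is not a real batch_insert call.

```python
import time


def find_slow_inserts(insert, keys, threshold_s=0.04):
    """Time insert(key) for each key and return (key, seconds) for the slow ones."""
    slow = []
    for key in keys:
        start = time.perf_counter()
        insert(key)
        elapsed = time.perf_counter() - start
        if elapsed >= threshold_s:
            slow.append((key, elapsed))
    return slow


def fake_insert(key):
    # Hypothetical stand-in for a client insert: keys ending in 0 or 5 stall.
    if key.endswith(("0", "5")):
        time.sleep(0.05)


if __name__ == "__main__":
    keys = ["test.%d" % i for i in range(1, 11)]
    for key, elapsed in find_slow_inserts(fake_insert, keys):
        print("Slow insert", key, elapsed)
```

Swapping fake_insert for a real client call would be the obvious next step; the threshold would then need tuning to whatever counts as slow on the cluster under test.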

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803485#action_12803485 ] 

Chris Goffinet commented on CASSANDRA-724:
------------------------------------------

We didn't see much change; I think applying the debug patch is going to be required. Insert times sometimes range from 45ms to 800ms.

> Insert/Get Contention
> ---------------------
>
>                 Key: CASSANDRA-724
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-724
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Jonathan Ellis
>             Fix For: 0.6
>
>         Attachments: 724.patch, debug.patch, test_case.py

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 0001-decouple-periodic-sync-mode-from-commit-log-append.txt)

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829737#action_12829737 ] 

Brandon Williams commented on CASSANDRA-724:
--------------------------------------------

+1, much improved for me:

Slow insert test.15910 0.0661840438843
Slow insert test.37799 0.073842048645
Slow insert test.38254 0.0541589260101
Slow insert test.46248 0.0541749000549
Slow insert test.56482 0.0474050045013
Slow insert test.70314 0.0435261726379
Slow insert test.76370 0.0660541057587
Slow insert test.170684 0.0553348064423
Slow insert test.170685 0.0560541152954
Slow insert test.202273 0.0667309761047

I also confirmed with -verbose:gc that the long GC pauses related to compaction/deletion are gone.

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 724.patch)

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Goffinet updated CASSANDRA-724:
-------------------------------------

    Attachment: test_case.py

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829658#action_12829658 ] 

Jonathan Ellis commented on CASSANDRA-724:
------------------------------------------

rebased again

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 0001-decouple-periodic-sync-mode-from-commit-log-append.txt)

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 0003-only-gc-if-there-are-undeleted-sstables-that-gc-ing-co.txt)

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment: 0003-only-gc-if-there-are-undeleted-sstables-that-gc-ing-co.txt
                0002-replace-gc-after-each-compaction-w-gc-before-compactio.txt
                0001-decouple-periodic-sync-mode-from-commit-log-append.txt

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803007#action_12803007 ] 

Jonathan Ellis commented on CASSANDRA-724:
------------------------------------------

I can reproduce this on a single-node setup, so I think you may be seeing two effects: one from the MessagingService stack (Brandon points out that this happens much more frequently without CASSANDRA-705 than with it), and one from the commitlog sync (which I can reproduce on a single-node system).

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 0002-replace-gc-after-each-compaction-w-gc-before-compactio.txt)

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 0003-only-gc-if-there-are-undeleted-sstables-that-gc-ing-co.txt)

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment:     (was: 0002-replace-gc-after-each-compaction-w-gc-before-compactio.txt)

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802743#action_12802743 ] 

Chris Goffinet commented on CASSANDRA-724:
------------------------------------------

Just an observation: we see this happening on 99% of keys that need to send a remote write.

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805282#action_12805282 ] 

Jonathan Ellis commented on CASSANDRA-724:
------------------------------------------

patches 02 and 03 will reduce your System.gc frequency (as long as you have spare disk space):

03
    only gc if there are undeleted sstables that gc-ing could free

02
    replace gc after each compaction w/ gc before compaction/flush only if we need it for the file space

01
    decouple periodic sync mode from commit log append [original patch posted]
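
The patch files themselves are not included in this message. As a loose sketch of the decision that the descriptions of 02 and 03 suggest, the check below requests a collection only when space is actually short and there is something a collection could free. The function name, its parameters, and the use of Python's gc.collect as a stand-in for System.gc() are all assumptions for illustration, not the patches' code.

```python
import gc


def maybe_gc_for_space(bytes_needed, bytes_free, obsolete_sstables, collect=gc.collect):
    """Run collect() only if space is short and obsolete sstables exist.

    Returns True if a collection was requested, False otherwise.
    """
    if bytes_free >= bytes_needed:
        # Enough room for the compaction/flush already: skip the pause.
        return False
    if not obsolete_sstables:
        # Nothing on disk that a collection could release.
        return False
    collect()
    return True
```

With spare disk space the first check short-circuits, which is why the frequency drops only "as long as you have spare disk space".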



[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803922#action_12803922 ] 

Jonathan Ellis commented on CASSANDRA-724:
------------------------------------------

Brandon did some more testing and found that the System.gc() we request (to allow cleaning up obsolete sstables after a compaction) is the culprit.

Maybe it's time to experiment with the G1 garbage collector: http://java.sun.com/javase/technologies/hotspot/gc/g1_intro.jsp

Alternatively, one workaround might be to only issue the gc() request if we're within some percent of the disk filling up (we can use File.getUsableSpace / File.getTotalSpace for that).
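
A minimal sketch of that workaround, written in Python rather than Java to match the test script: shutil.disk_usage plays the role of File.getUsableSpace / File.getTotalSpace here, and gc.collect stands in for System.gc(). The function names and the 10% default threshold are assumptions chosen for the example, not values from this issue.

```python
import gc
import shutil


def usable_fraction(path):
    """Return free/total for the filesystem holding path (1.0 if total is 0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 1.0
    return usage.free / usage.total


def gc_if_disk_nearly_full(path, min_free_fraction=0.10, collect=gc.collect):
    """Request a collection only when free space drops below the threshold."""
    if usable_fraction(path) < min_free_fraction:
        collect()
        return True
    return False
```

The trade-off is that obsolete files may linger on disk longer, in exchange for avoiding a forced collection on every compaction.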

[jira] Commented: (CASSANDRA-724) Insert/Get Contention

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834894#action_12834894 ] 

Hudson commented on CASSANDRA-724:
----------------------------------

Integrated in Cassandra #357 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/357/])
    

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment: 724.patch

This addresses latency from waiting for the commitlog append to finish (which will be delayed if the commitlog is busy syncing). In batch mode we have to wait because that is part of our contract, but in periodic mode we do not.

705 will be committed soon and that will address that.
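
The batch/periodic distinction described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra's commit log code: the class SketchCommitLog, the helper timed_add, and the append_cost parameter are hypothetical names invented for the illustration, and a sleep stands in for an append that is held up behind a sync.

```python
import queue
import threading
import time


class SketchCommitLog:
    """Toy commit log (hypothetical; not Cassandra's implementation)."""

    def __init__(self, mode, append_cost=0.05):
        assert mode in ("batch", "periodic")
        self.mode = mode
        self.append_cost = append_cost  # assumed cost of a delayed append, in seconds
        self.pending = queue.Queue()
        self.appended = []
        # A single background thread performs the appends in order.
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        while True:
            entry, done = self.pending.get()
            time.sleep(self.append_cost)  # stands in for append + possible sync
            self.appended.append(entry)
            done.set()

    def add(self, entry):
        """Queue an entry; only batch mode blocks until it has been appended."""
        done = threading.Event()
        self.pending.put((entry, done))
        if self.mode == "batch":
            done.wait()  # the batch contract: do not return before the append
        return done      # periodic mode returns right away


def timed_add(log, entry):
    """Return (seconds spent inside add, completion event)."""
    start = time.perf_counter()
    done = log.add(entry)
    return time.perf_counter() - start, done
```

In this model a writer in periodic mode sees only the cost of queueing the entry, while a writer in batch mode also pays for the append itself, which is the latency the patch aims to remove for periodic mode.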


> Insert/Get Contention
> ---------------------
>
>                 Key: CASSANDRA-724
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-724
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Goffinet
>             Fix For: 0.6
>
>         Attachments: 724.patch, test_case.py

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment: 0003-only-gc-if-there-are-undeleted-sstables-that-gc-ing-co.txt
                0002-replace-gc-after-each-compaction-w-gc-before-compactio.txt
                0001-decouple-periodic-sync-mode-from-commit-log-append.txt

> Insert/Get Contention
> ---------------------
>
>                 Key: CASSANDRA-724
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-724
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Jonathan Ellis
>             Fix For: 0.6
>
>         Attachments: 0001-decouple-periodic-sync-mode-from-commit-log-append.txt, 0002-replace-gc-after-each-compaction-w-gc-before-compactio.txt, 0003-only-gc-if-there-are-undeleted-sstables-that-gc-ing-co.txt, debug.patch, test_case.py

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

          Component/s: Core
        Fix Version/s: 0.6
             Assignee: Jonathan Ellis
    Affects Version/s:     (was: 0.6)

> Insert/Get Contention
> ---------------------
>
>                 Key: CASSANDRA-724
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-724
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Jonathan Ellis
>             Fix For: 0.6
>
>         Attachments: 724.patch, test_case.py

[jira] Updated: (CASSANDRA-724) Insert/Get Contention

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-724:
-------------------------------------

    Attachment: debug.patch

Patch to add debug timing info if you want to investigate further.

There do seem to be occasional latency spikes inside ColumnFamilyStore.apply that I do not yet understand.

When CPUs are busy with compaction, latency increases. No real surprise there.

Thrift sometimes adds tens of ms of latency, judging by the difference between what my Python client sees and what CassandraServer sees. The Java side of Thrift does call setTcpNoDelay(true), but the Python side does not; the equivalent would be setsockopt(SOL_TCP, TCP_NODELAY, 1). That is probably the culprit.
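
For reference, the setsockopt call mentioned above looks like this on a plain Python socket. This is a minimal sketch using only the standard socket module; the helper name open_nodelay_socket is made up for the example, and IPPROTO_TCP is used in place of SOL_TCP because it is the more portable constant (on Linux the two have the same value).

```python
import socket


def open_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Same effect as setsockopt(SOL_TCP, TCP_NODELAY, 1) from the comment above.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = open_nodelay_socket()
# Read the option back; a nonzero value means TCP_NODELAY is set.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

With the Thrift Python client, the same option would have to be set on the underlying socket once the transport is open; how that socket is exposed depends on the Thrift version, so treat that part as an assumption to verify.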


> Insert/Get Contention
> ---------------------
>
>                 Key: CASSANDRA-724
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-724
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Jonathan Ellis
>             Fix For: 0.6
>
>         Attachments: 724.patch, debug.patch, test_case.py
