You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Peter Schuller (Created) (JIRA)" <ji...@apache.org> on 2012/03/09 16:10:57 UTC

[jira] [Created] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

memtable.updateLiveRatio() is blocking, causing insane latencies for writes
---------------------------------------------------------------------------

                 Key: CASSANDRA-4032
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Peter Schuller
            Assignee: Peter Schuller
             Fix For: 1.1.0


Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d smf1-amv-01-sr1 -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.

Example ("blocked for" is my debug log added around submit()):

{code}
 INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
 WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4032:
--------------------------------------

    Attachment: 4032-v3.txt
    
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: 4032-v3.txt, CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4032:
--------------------------------------

    Attachment: 4032-v3.txt

v3 attached:

- uses putIfAbsent in the AtomicBoolean creation
- drops the catch block around executor submission
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: 4032-v3.txt, CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226144#comment-13226144 ] 

Jonathan Ellis commented on CASSANDRA-4032:
-------------------------------------------

The DTPE change looks good, but don't you need a catch (Rejected) { } block in updateLiveRatio?
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226153#comment-13226153 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

This is the code:

{code}
        try
        {
            meterExecutor.submit(runnable);
        }
        catch (RejectedExecutionException e)
        {
            logger.debug("Meter thread is busy; skipping liveRatio update for {}", cfs);
        }
{code}

                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226152#comment-13226152 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

There already is one, it's just a NOOP because of the blocking nature of the DTPE.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4032:
--------------------------------------

    Description: 
Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.

Example ("blocked for" is my debug log added around submit()):

{code}
 INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
 WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.


  was:
Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d smf1-amv-01-sr1 -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.

Example ("blocked for" is my debug log added around submit()):

{code}
 INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
 WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.


    
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226272#comment-13226272 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

(v2 doesn't keep the changes to DTPE, which are no longer used.)
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226160#comment-13226160 ] 

Jonathan Ellis commented on CASSANDRA-4032:
-------------------------------------------

bq. This is the code:

Either I'm blind or that's not in the attached patch.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250746#comment-13250746 ] 

Sylvain Lebresne commented on CASSANDRA-4032:
---------------------------------------------

Wouldn't using a ConcurrentSkipListSet simply the code? Feels like a lot of boilerplate to record that there is already a metering running (it's not like this is a hot path enough to justify the NBHM). 
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: 4032-v3.txt, CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4032:
--------------------------------------

    Attachment:     (was: 4032-v3.txt)
    
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: 4032-v3.txt, CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226166#comment-13226166 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

{code}
Are we sure that what we want is a SynchronousQueue with task rejected? After all, there is only on global memoryMeter, so we could end up failing to updateLiveRatio just based on a race, even if calculations are fast. I'd suggest instead a bounded queue (but maybe not infinite and we could indeed just skip task if that queue gets full).
{code}

I agree it's fishy, though I'd suggest a separate ticket. This patch is intended to make the code behave the way the original commit intended.

This (from the code, not my patch) seems legit though:

{code}
    // we're careful to only allow one count to run at a time because counting is slow
    // (can be minutes, for a large memtable and a busy server), so we could keep memtables
    // alive after they're flushed and would otherwise be GC'd.
{code}

We could have one queue per unique CF and have a consumer that iterates over the set of queues, guaranteeing that each CF gets processed once per "cycle". A simpler solution is probably preferable though if we can think of one.

                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226173#comment-13226173 ] 

Sylvain Lebresne commented on CASSANDRA-4032:
---------------------------------------------

Sounds fine to me.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226169#comment-13226169 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

(That seems simple enough to just do right now and not bother with a separate ticket. If you agree I'll submit a patch.)
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4032:
--------------------------------------

    Attachment: CASSANDRA-4032-1.1.0-v1.txt

Attaching patch that allows us to create both blocking and non-blocking {{DebuggableThreadPoolExecturor}}:s; use that for this particular case.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4032:
--------------------------------------

    Attachment: CASSANDRA-4032-1.1.0-v2.txt

Attaching {{v2}}. Keeps NBHM mapping CFS -> AtomicBoolean. Mappings are never removed (assuming reasonably bound number of unique CFS:s during a lifetime). Queue used for DTPE is unbounded.

                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251622#comment-13251622 ] 

Sylvain Lebresne commented on CASSANDRA-4032:
---------------------------------------------

+1 with two nits:
* In the comment "to a maximum of one per CFS using this map" could have a s/map/set/
* Note sure if there was an intent in changing the maximumPoolSize of the meterExecutor to Integer.MAX_VALUE. As it stands, with an unbounded queue the maximumPoolSize is ignored so that doesn't really matter, but just wanted to mention it.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: 4032-v3.txt, 4032-v4.txt, CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226167#comment-13226167 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

How about just having an unbounded queue but having each CF just keep a flag that says whether or not there is a calculation currently pending or executing; it would be reset by the task when executed, and CAS:ed in the write path. This; single queue, same DTPE. We'd just submit a task with a reference to the AtomicBoolean which it will set to false on completion.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4032:
--------------------------------------

    Attachment: 4032-v4.txt

you're right; v4 attached w/ Set approach.

(used NBHS instead of CSLS since the latter requires defining a comparator.  I'm not sure the overhead is substantially different.)
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: 4032-v3.txt, 4032-v4.txt, CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226166#comment-13226166 ] 

Peter Schuller edited comment on CASSANDRA-4032 at 3/9/12 4:05 PM:
-------------------------------------------------------------------

{quote}
Are we sure that what we want is a SynchronousQueue with task rejected? After all, there is only on global memoryMeter, so we could end up failing to updateLiveRatio just based on a race, even if calculations are fast. I'd suggest instead a bounded queue (but maybe not infinite and we could indeed just skip task if that queue gets full).
{quote}

I agree it's fishy, though I'd suggest a separate ticket. This patch is intended to make the code behave the way the original commit intended.

This (from the code, not my patch) seems legit though:

{code}
    // we're careful to only allow one count to run at a time because counting is slow
    // (can be minutes, for a large memtable and a busy server), so we could keep memtables
    // alive after they're flushed and would otherwise be GC'd.
{code}

We could have one queue per unique CF and have a consumer that iterates over the set of queues, guaranteeing that each CF gets processed once per "cycle". A simpler solution is probably preferable though if we can think of one.

                
      was (Author: scode):
    {code}
Are we sure that what we want is a SynchronousQueue with task rejected? After all, there is only on global memoryMeter, so we could end up failing to updateLiveRatio just based on a race, even if calculations are fast. I'd suggest instead a bounded queue (but maybe not infinite and we could indeed just skip task if that queue gets full).
{code}

I agree it's fishy, though I'd suggest a separate ticket. This patch is intended to make the code behave the way the original commit intended.

This (from the code, not my patch) seems legit though:

{code}
    // we're careful to only allow one count to run at a time because counting is slow
    // (can be minutes, for a large memtable and a busy server), so we could keep memtables
    // alive after they're flushed and would otherwise be GC'd.
{code}

We could have one queue per unique CF and have a consumer that iterates over the set of queues, guaranteeing that each CF gets processed once per "cycle". A simpler solution is probably preferable though if we can think of one.

                  
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226157#comment-13226157 ] 

Sylvain Lebresne commented on CASSANDRA-4032:
---------------------------------------------

Are we sure that what we want is a SynchronousQueue with task rejected? After all, there is only on global memoryMeter, so we could end up failing to updateLiveRatio just based on a race, even if calculations are fast. I'd suggest instead a bounded queue (but maybe not infinite and we could indeed just skip task if that queue gets full).
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226163#comment-13226163 ] 

Peter Schuller commented on CASSANDRA-4032:
-------------------------------------------

{quote}
Either I'm blind or that's not in the attached patch.
{quote}

It's not. It's *already there*, in the branch, committed. The code was written, I presume, without the author realizing that RejectedExecutionException would never be thrown.
                
> memtable.updateLiveRatio() is blocking, causing insane latencies for writes
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.1.0
>
>         Attachments: CASSANDRA-4032-1.1.0-v1.txt
>
>
> Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n100000000 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts.
> Example ("blocked for" is my debug log added around submit()):
> {code}
>  INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727).  calculation took 28273ms for 1320245 columns
>  WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
> {code}
> The calling code was written assuming a RejectedExecutionException is thrown, but it's not because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira