Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2010/08/26 01:20:19 UTC

[jira] Created: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

ColumnFamilyOutputFormat performs blocking writes for large batches
-------------------------------------------------------------------

                 Key: CASSANDRA-1434
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
            Reporter: Stu Hood
             Fix For: 0.7.0


By default, ColumnFamilyOutputFormat buffers up to {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} mutations ({{Long.MAX_VALUE}} when the property is unset), and then performs a single blocking write.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912274#action_12912274 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

> Probably. What should the parent thread do?
Probably what the previous version did.

> In Cassandra. Flip spent several days in the user IRC channel trying to deal with the load spikes.
Does changing this patch solve his problem, or are we assuming that?

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch)



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912272#action_12912272 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

> it's while (run || isEmpty). am i missing something?
Ah, sorry.

> the idea is it tries each endpoint, then throws if they all fail. (if !iter.hasnext() then throw)
Gotcha. I missed that part because there doesn't appear to be a way for the parent thread to figure out that a client died, so I assumed that the clients never died. Does it need an UncaughtExceptionHandler that alerts the parent thread? This was what was accomplished by using offer() rather than put() in the previous version.
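The failover behaviour quoted above (try each endpoint in turn, throw once the iterator is exhausted) can be sketched roughly as follows. This is an illustration only, not the code in the attached patches: the Sender interface and sendWithFailover method are hypothetical names, and endpoints are plain strings rather than InetAddress values with Thrift clients.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

public class EndpointFailover {
    /** Hypothetical stand-in for whatever sends one batch to one endpoint. */
    public interface Sender {
        void send(String endpoint) throws IOException;
    }

    /**
     * Tries each endpoint in order and returns the first one that accepted the write.
     * Once the iterator is exhausted, the last failure is rethrown to the caller.
     */
    public static String sendWithFailover(List<String> endpoints, Sender sender) throws IOException {
        Iterator<String> iter = endpoints.iterator();
        if (!iter.hasNext()) {
            throw new IOException("no endpoints to try");
        }
        while (true) {
            String endpoint = iter.next();
            try {
                sender.send(endpoint);
                return endpoint;
            } catch (IOException e) {
                // Equivalent of "if !iter.hasNext() then throw": give up only after the last replica.
                if (!iter.hasNext()) {
                    throw new IOException("all endpoints failed, last tried " + endpoint, e);
                }
            }
        }
    }
}
```

Note that the exception still surfaces only in the thread that called sendWithFailover; reporting it to the parent thread is a separate problem.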

> as described above, batching > 1 is a misfeature that has been demonstrated to cause badness in practice.
In Cassandra, or in general?



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v6.txt

v6.

bq. There is a race condition in put() between !run and the put itself

not really.  the check in put is just an attempt to abort earlier if possible.

bq. Exceptions thrown by child threads will be logged, but not reported to the Hadoop frontend

saved actual exceptions.

bq. the second open in RangeClient is getting TTransportException: Socket already connected

fixed

bq. Logging a NPE for the first batch is pretty ugly

nothing is logged.  the alternatives strike me as uglier.

bq. The default batchSize was increased back up to Long.MAX_VALUE

fixed

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt, 1434-v6.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912273#action_12912273 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

> Does it need an UncaughtExceptionHandler that alerts the parent thread?

Probably.  What should the parent thread do?
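One possible answer, sketched under assumptions: the handler records the first failure, and the parent checks that record before each write and rethrows it as an IOException so the task fails visibly. ChildFailureTracker, startChild and checkChildren are hypothetical names, not taken from any attached patch.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class ChildFailureTracker {
    // First failure seen in any child thread; stays null while every child is healthy.
    private final AtomicReference<Throwable> failure = new AtomicReference<>();

    /** Starts a child thread whose uncaught exceptions are recorded instead of only being logged. */
    public Thread startChild(Runnable work) {
        Thread t = new Thread(work);
        t.setUncaughtExceptionHandler((thread, ex) -> failure.compareAndSet(null, ex));
        t.start();
        return t;
    }

    /** Called by the parent thread before (or between) writes. */
    public void checkChildren() throws IOException {
        Throwable t = failure.get();
        if (t != null) {
            throw new IOException("child thread died", t);
        }
    }
}
```

The handler runs in the dying child thread, so the parent only learns of the failure the next time it calls checkChildren(); a parent blocked in an untimed put() would not notice.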

> In Cassandra, or in general?

In Cassandra.  Flip spent several days in the user IRC channel trying to deal with the load spikes.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v7.txt

v7

bq. This patch includes a change CompactionManager.java

fixed

bq.  it will block indefinitely if an exception occurs between lastException != null and the blocking put()

you're right.  fixed

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt, 1434-v6.txt, 1434-v7.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0003-Remove-nesting-in-RingCache.patch)



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915481#action_12915481 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

This patch includes a change to CompactionManager.java.

>> There is a race condition in put() between !run and the put itself
> not really. the check in put is just an attempt to abort earlier if possible.
put() is called from the parent thread: it isn't interrupted by the child thread, so it will block indefinitely if an exception occurs between lastException != null and the blocking put(). Unlikely, but...
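One way to close that window, shown here as a sketch rather than as what any attached patch does, is to replace the untimed put() with a timed offer() in a loop that re-checks the failure flag on every pass. GuardedQueue and fail() are hypothetical names; lastException follows the name used in this discussion, and the 100 ms timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class GuardedQueue<T> {
    private final BlockingQueue<T> queue;
    // Written by the consumer (child) thread when it fails, read by the producer (parent).
    private volatile Throwable lastException;

    public GuardedQueue(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Called by the consumer thread just before it dies. */
    public void fail(Throwable t) {
        lastException = t;
    }

    /**
     * Blocks until the item is queued, but wakes up periodically so that a failure
     * recorded after the first check is still noticed.
     */
    public void put(T item) throws IOException, InterruptedException {
        while (true) {
            if (lastException != null) {
                throw new IOException("consumer thread failed", lastException);
            }
            if (queue.offer(item, 100, TimeUnit.MILLISECONDS)) {
                return;
            }
        }
    }

    public T take() throws InterruptedException {
        return queue.take();
    }
}
```

With this shape the parent can stay blocked for at most one timeout interval after the child fails, at the cost of a periodic wake-up while the queue is full.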

----

Other than those two nitpicks, +1: tested against a 12 node cluster and saw smooth network utilization.

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt, 1434-v6.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch)



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v4.txt

v4 attached to throw IOException on put or stopNicely if the thread has errored out

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v5.txt

v5 uses a small batch size and eagerly sends out "incomplete" batches if the reducer falls behind
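A rough sketch of how "eager incomplete batches" can be had from a BlockingQueue: block for the first mutation, then take only what is already queued instead of waiting for the batch to fill. This is an assumption about the general technique, not a copy of 1434-v5.txt; EagerBatcher and nextBatch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

public class EagerBatcher {
    /**
     * Blocks until at least one item is available, then adds whatever else is
     * already queued, up to batchSize items in total. It never waits for a full batch.
     */
    public static <T> List<T> nextBatch(BlockingQueue<T> queue, int batchSize) throws InterruptedException {
        List<T> batch = new ArrayList<>(batchSize);
        batch.add(queue.take());
        // drainTo does not block: it moves at most (batchSize - 1) items that are present right now.
        queue.drainTo(batch, batchSize - 1);
        return batch;
    }
}
```

When the producer keeps up, batches come out full; when it falls behind, the writer sends short batches immediately instead of stalling.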

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Issue Comment Edited: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Philip (flip) Kromer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904098#action_12904098 ] 

Philip (flip) Kromer edited comment on CASSANDRA-1434 at 8/30/10 5:38 AM:
--------------------------------------------------------------------------

The blocking behavior is causing 'broken pipe' errors (even with relatively small batch sizes) when cassandra latency is high. (This is afaict not network latency but response latency due to a compaction or flush, etc.)

It also makes the whole cluster resonate: one slow node blocks many writers, which then all unblock at the same time and write bursts large enough to trigger a compaction, GC, etc. simultaneously on every node. This means adding more writers doesn't work around the blocking write.

  
> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7.0
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907245#action_12907245 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

why is switching from Multimap<Range, InetAddress> to Map<Range, List<InetAddress>> an improvement?

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Philip (flip) Kromer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904098#action_12904098 ] 

Philip (flip) Kromer commented on CASSANDRA-1434:
-------------------------------------------------

The blocking behavior is causing 'broken pipe' errors (even with relatively small batch sizes) when cassandra latency is high. 

It also makes the whole cluster resonate: one slow node blocks many writers, which then all unblock at the same time and write bursts large enough to trigger a compaction, GC, etc. simultaneously on every node. This means adding more writers doesn't work around the blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913658#action_12913658 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

i'm -1 on batching at all.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906531#action_12906531 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

why the double-flushing in close()?  can you add a comment for that?

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch, 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Reviewer: jbellis

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12911018#action_12911018 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

Had a look at 02. Still don't understand your objection to using Multimap -- use ListMultimap if you want to preserve ordering.  It's noticeably cleaner than Map<X, List<Y>>, and the Guava guys are very careful about performance.

Also, 02 kind of abuses an executor when a map of threads would be clearer as to what is going on, while not requiring much more code.  "Send this message" is a good task to submit to an executor; "run an infinite loop pulling messages off a public queue" is not.
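
The distinction drawn above can be sketched roughly as follows. This is only an illustration, not code from any attached patch: the class name RangeThreads, the String-typed mutations, the unbounded queue, and the empty-Optional stop marker are all assumptions made for brevity. Each range gets its own named, long-lived thread that drains its own queue, and the threads are held in a plain map rather than hidden inside an executor.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one named, long-lived thread per range, each draining
// its own queue, instead of an infinite loop submitted to an executor.
final class RangeThreads {
    private final Map<String, BlockingQueue<Optional<String>>> queues = new ConcurrentHashMap<>();
    private final Map<String, Thread> threads = new ConcurrentHashMap<>();
    final List<String> sent = new CopyOnWriteArrayList<>(); // stands in for the real send

    void start(String range) {
        BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();
        Thread thread = new Thread(() -> {
            try {
                while (true) {
                    Optional<String> next = queue.take();
                    if (!next.isPresent()) {
                        return; // an empty Optional is the stop marker
                    }
                    sent.add(range + ":" + next.get());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "client-" + range);
        queues.put(range, queue);
        threads.put(range, thread);
        thread.start();
    }

    void put(String range, String mutation) {
        queues.get(range).add(Optional.of(mutation)); // unbounded queue: add never blocks
    }

    void close() {
        for (BlockingQueue<Optional<String>> queue : queues.values()) {
            queue.add(Optional.empty());
        }
        for (Thread thread : threads.values()) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for client threads", e);
            }
        }
    }
}
```

A caller would start one thread per range, call put for each mutation, and call close at the end of the task; mutations for a given range are sent in the order they were queued.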

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch

0004 collects all replicas for each range in RingCache, which I broke in 1322 (previously, we were completely rebuilding the tokenmap using a replication strategy, which would have recreated the lost information).

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch, 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch
                0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch
                0003-Remove-nesting-in-RingCache.patch

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v3.txt

I squashed and added code to keep CFRW from slamming Cassandra with spikes of load: it keeps a pooled connection, and sends mutations one at a time over that.  This is only a trivial amount of overhead compared to using a large batch, since we're not reconnecting for each message.  (The main advantage of using a larger batch is that it gives you an idempotent group of work to replay if necessary, which doesn't matter here.  Under the hood it takes the same code path.)

Also attempted to distinguish between recoverable and non-recoverable errors in the exception handling.

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915504#action_12915504 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

+1

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt, 1434-v6.txt, 1434-v7.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Philip (flip) Kromer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906578#action_12906578 ] 

Philip (flip) Kromer commented on CASSANDRA-1434:
-------------------------------------------------

Right now the code does { buffer n mutations, holding each according to its endpoint. After n writes, check that all endpoint writes are finished, and dispatch to each endpoint its share of the n mutations }

This is non-blocking at the socket level but ends up being blocking at the app level, and the wide variance in size has bad effects on gc at the cassandra end.

I think the ColumnFamilyRecordWriter would see a speedup & improved stability with { buffer mutations, holding each according to its endpoint. When an endpoint has seen n writes, check that any previous write has finished, and dispatch to this endpoint a full buffer of n mutations }.
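
A minimal sketch of the per-endpoint scheme proposed above, under stated assumptions: the class name PerEndpointBuffer, the String-typed endpoints and mutations, and the dispatch callback are hypothetical, and dispatch is synchronous here, so the "previous write has finished" check is trivially satisfied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch: buffer mutations per endpoint and dispatch a buffer
// as soon as that endpoint alone has accumulated `threshold` mutations.
final class PerEndpointBuffer {
    private final int threshold;
    private final BiConsumer<String, List<String>> dispatch;
    private final Map<String, List<String>> buffers = new HashMap<>();

    PerEndpointBuffer(int threshold, BiConsumer<String, List<String>> dispatch) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
        this.dispatch = dispatch;
    }

    void write(String endpoint, String mutation) {
        List<String> buffer = buffers.computeIfAbsent(endpoint, e -> new ArrayList<>());
        buffer.add(mutation);
        if (buffer.size() >= threshold) {
            buffers.remove(endpoint);          // next write starts a fresh buffer
            dispatch.accept(endpoint, buffer); // only this endpoint's share is sent
        }
    }

    // Sends whatever is left, e.g. when the task finishes.
    void flush() {
        for (Map.Entry<String, List<String>> entry : buffers.entrySet()) {
            dispatch.accept(entry.getKey(), entry.getValue());
        }
        buffers.clear();
    }
}
```

Because each endpoint fills and drains independently, every dispatched batch has the same size (apart from the final flush), which is the property the comment argues would smooth out load on the Cassandra side.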

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch, 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch
                0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch

0001 is the set of changes to RingCache that survived from v1: it fixes the bug in RingCache that was handled by the previous 0004, and removes the Multimap.

0002 is a completely revamped ColumnFamilyRecordWriter: nothing from the original patch survived.
* Launches a client thread per unique range, which is responsible for communicating with endpoint replicas for that range.
** Each client thread receives mutations for its range from the parent thread on a bounded queue.
** Each client thread attempts to send a full batch of mutations to its replicas in order: this means that each batch gets up to RF retries before failing, but without any failures, connections will always be made to the first replica.
* The parent thread loops trying to offer to queues for client threads, and checks that they are still alive (and fails if they aren't).
* For an N-node cluster, up to (2 * N * batchSize) mutations will be in memory at once, so the default batchSize was lowered to 4096.

Fairly well tested against a 12 node cluster: no obvious races or bottlenecks.
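
The replica-ordering and memory-bound points in the list above can be illustrated with a small sketch. It is not taken from the 0002 patch: the ReplicaFailover class, the Sender interface, and the String-typed endpoints and mutations are hypothetical stand-ins for the Thrift client calls.

```java
import java.util.List;

// Hypothetical sketch: try each replica for a range in order, and fail the
// batch only after every replica has been attempted.
final class ReplicaFailover {
    interface Sender {
        void send(String endpoint, List<String> batch) throws Exception;
    }

    // Returns the endpoint that accepted the batch.
    static String sendBatch(List<String> replicas, List<String> batch, Sender sender) {
        Exception last = null;
        for (String endpoint : replicas) {
            try {
                sender.send(endpoint, batch);
                return endpoint; // with no failures this is always the first replica
            } catch (Exception e) {
                last = e; // fall through to the next replica
            }
        }
        throw new IllegalStateException("all " + replicas.size() + " replicas failed", last);
    }

    // Upper bound quoted in the comment: 2 * N * batchSize mutations in memory.
    static long maxMutationsInMemory(int nodes, int batchSize) {
        return 2L * nodes * batchSize;
    }
}
```

For the 12 node cluster mentioned above and the lowered default of 4096, maxMutationsInMemory(12, 4096) gives 98304 mutations as the worst case.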

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910175#action_12910175 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

Sent via e-mail while I was on vacation:
{quote}
I wanted to dodge object creation, but I guess I assumed that Multimap created Set and Collection facades for every call. Also, there didn't appear to be a way to iterate over unique keys without a facade.
{quote}

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch
                0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch
                0003-Remove-nesting-in-RingCache.patch

0001 and 0003 are minor fixes, but 0002:
* Avoids blocking processing for writes (but only 2 * batchSize mutations may be in memory at a time, so we may still block)
* Changes the default batchSize to 2^14
* Rotates through possible endpoints for a range per flush, which should more evenly distribute client connections when there are small numbers of keys in play

One issue we haven't tackled yet is how to handle failures: I've reopened CASSANDRA-1264 to handle that.
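
As a rough illustration of the per-flush rotation described in the list above (a sketch only; the EndpointRotation class and its method name are hypothetical and do not reflect the actual RingCache API), each flush for a range can pick the next replica in round-robin order:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rotate through the replicas of one range, advancing
// by one endpoint per flush so that connections spread across replicas.
final class EndpointRotation {
    private final List<String> replicas;
    private int flushes = 0;

    EndpointRotation(List<String> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("a range needs at least one replica");
        }
        this.replicas = new ArrayList<>(replicas);
    }

    String endpointForNextFlush() {
        // floorMod keeps the index valid even if the counter overflows
        String endpoint = replicas.get(Math.floorMod(flushes, replicas.size()));
        flushes++;
        return endpoint;
    }
}
```

With few distinct keys in play, every flush would otherwise hit the same first replica; rotating spreads those connections across all replicas of the range.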

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Fix Version/s: 0.7 beta 2
                       (was: 0.7.0)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch, 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912263#action_12912263 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

* ArrayBlockingQueue.isEmpty will kill client threads if their queue is ever empty
* Interrupt handling doesn't seem like a clearer solution for killing client threads: what happens when an interrupt is received during a mutation?
* I don't like the idea of indefinite retries: pretending that the cluster is never unavailable sidesteps Hadoop's own retry system
* As mentioned in IRC, batchSize == 1 does not seem like a good value to hardcode. Any amount of overhead becomes measurable when you are sending small enough values: mutations containing a single integer might increase in size X fold for instance

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906533#action_12906533 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

committed 01.  declining to apply 03; in general refactoring out a method that is called in a single place obscures control flow rather than clarifies it.  04 looks ok but i'm not sure to what degree it depends on 02 (see above) so leaving alone for now.

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-to-TFramedTransport-in-TestRingCache.patch, 0002-Add-kth-endpoint-method-to-RingCache-and-improve-con.patch, 0003-Remove-nesting-in-RingCache.patch, 0004-Fix-regression-introduced-on-1322-add-all-replicas-o.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913654#action_12913654 ] 

Stu Hood commented on CASSANDRA-1434:
-------------------------------------

* There is a race condition in put() between {{!run}} and the put itself
* Exceptions thrown by child threads will be logged, but not reported to the Hadoop frontend, since they aren't what kill the parent thread

I'm -0 on v3 and v4: but I'll add a 0005 to separate queue size from batch size, so that we can tune down the batch size for Flip.
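
One way the second point above could be addressed is sketched below. This is an assumption-laden illustration, not code from any of the attached patches: the ReportingClient class, the Consumer-based sender, and the 10 ms offer timeout are all hypothetical. The child thread records its failure, and the parent rethrows it from put() or close(), so that the exception kills the parent and becomes visible to Hadoop.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch: a child thread that records its failure so the
// parent thread can rethrow it instead of only seeing a log line.
final class ReportingClient {
    private final BlockingQueue<String> queue;
    private final AtomicReference<Throwable> failure = new AtomicReference<>();
    private final Thread worker;

    ReportingClient(int capacity, Consumer<String> sender) {
        queue = new ArrayBlockingQueue<>(capacity);
        worker = new Thread(() -> {
            try {
                while (true) {
                    sender.accept(queue.take());
                }
            } catch (InterruptedException e) {
                // normal shutdown requested by close()
            } catch (RuntimeException e) {
                failure.set(e); // remembered for the parent thread
            }
        });
        worker.start();
    }

    void put(String mutation) {
        try {
            do {
                rethrowIfFailed(); // a dead child fails the parent here
            } while (!queue.offer(mutation, 10, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while queueing", e);
        }
    }

    void close() {
        try {
            while (!queue.isEmpty() && failure.get() == null) {
                Thread.sleep(1); // let the child drain the queue
            }
            worker.interrupt();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while closing", e);
        }
        rethrowIfFailed();
    }

    private void rethrowIfFailed() {
        Throwable t = failure.get();
        if (t != null) {
            throw new IllegalStateException("client thread failed", t);
        }
    }
}
```

A mutation offered just after the child dies can still be lost in this sketch; the failure is reported on the next put() or on close(), which is enough to fail the task.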

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Commented: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912270#action_12912270 ] 

Jonathan Ellis commented on CASSANDRA-1434:
-------------------------------------------

bq. ArrayBlockingQueue.isEmpty will kill client threads if their queue is ever empty

it's while (run || isEmpty).  am i missing something?

bq. what happens when an interrupt is received during a mutation

nothing. InterruptedException is only thrown at well-defined points (one of the few times checked exceptions have done me a favor), and blocking socket send is not one of them.  the JDK uses this pattern to shut down threadpoolexecutors.

bq. I don't like the idea of indefinite retries

the idea is it tries each endpoint, then throws if they all fail. (if !iter.hasnext() then throw)

bq. batchSize == 1 does not seem like a good value to hardcode

as described above, batching > 1 is a misfeature that has been demonstrated to cause badness in practice.
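
The interrupt behaviour described above can be demonstrated with a small sketch. The InterruptPoints class is hypothetical, and the busy-wait loop merely stands in for a blocking socket send that does not respond to interrupts. The interrupt arrives while the worker is in the middle of a "send", the send completes normally, and InterruptedException only surfaces at the next queue.take().

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: an interrupt delivered mid-"send" is only observed
// at the next well-defined interruption point, here queue.take().
final class InterruptPoints {
    static List<String> demo() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> events = new CopyOnWriteArrayList<>();
        AtomicBoolean sending = new AtomicBoolean(false);
        AtomicBoolean sendMayFinish = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String mutation = queue.take(); // interruption point
                    sending.set(true);
                    while (!sendMayFinish.get()) {
                        Thread.yield(); // stand-in for a send that ignores interrupts
                    }
                    events.add("sent " + mutation);
                }
            } catch (InterruptedException e) {
                events.add("interrupted at take()");
            }
        });
        worker.start();
        queue.add("m1");
        while (!sending.get()) {
            Thread.yield(); // wait until the worker is mid-"send"
        }
        worker.interrupt(); // only sets the flag; the "send" is not aborted
        sendMayFinish.set(true);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events;
    }
}
```

Calling InterruptPoints.demo() yields the events "sent m1" followed by "interrupted at take()": the in-flight mutation is not cut short, and the thread exits the next time it would block on the queue.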

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Fix Version/s: 0.7.0
                       (was: 0.7 beta 2)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.0
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt, 1434-v6.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0003-Switch-RingCache-back-to-multimap.patch
                0004-Replace-Executor-with-map-of-threads.patch

Adding 0003 and 0004 with the requested changes: exception handling was also improved a bit.

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment:     (was: 0003-Remove-nesting-in-RingCache.patch)

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.



[jira] Assigned: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood reassigned CASSANDRA-1434:
-----------------------------------

    Assignee: Jonathan Ellis  (was: Stu Hood)

* ColumnFamilyOutputFormat.createAuthenticatedClient calls socket.open, so the second open in RangeClient is getting {{TTransportException: Socket already connected}}
* Logging an NPE for the first batch is pretty ugly
* The default batchSize was increased back up to Long.MAX_VALUE: it should probably be significantly lower (32~128) for the reasons you've mentioned

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt, 1434-v4.txt, 1434-v5.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.
