Posted to commits@cassandra.apache.org by "Michael Kjellman (JIRA)" <ji...@apache.org> on 2012/11/08 07:22:13 UTC

[jira] [Comment Edited] (CASSANDRA-4912) BulkOutputFormat should support Hadoop MultipleOutput

    [ https://issues.apache.org/jira/browse/CASSANDRA-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492993#comment-13492993 ] 

Michael Kjellman edited comment on CASSANDRA-4912 at 11/8/12 6:21 AM:
----------------------------------------------------------------------

So obviously this is due to the handling in the close() function in BulkRecordWriter. So far I've been unable to get BOF to work in local mode through Eclipse with MultipleOutput. ConfigHelper is happy on the first check of the job config, but when the reducer is instantiated the column family output names don't seem to be set. close() is pretty simple in BulkRecordWriter though: it looks like the sstable is first closed and then streamed to the nodes. I'm guessing that close() is only being called on one of the sstables/named outputs (I do see in a fully distributed cluster that the sstables get created for multiple column families).
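
To make the guess above concrete, here is a rough, self-contained sketch of the suspected failure mode. It does not use the real Hadoop or Cassandra classes: SketchWriter, closeAll and closeFirstOnly are hypothetical names made up for illustration, and the only behaviour assumed from BulkRecordWriter is the "close the sstable, then stream it" order described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NamedOutputCloseSketch {

    // Hypothetical stand-in for a per-column-family bulk writer.
    // This is NOT the real BulkRecordWriter, only a model of its close() order.
    public static class SketchWriter implements AutoCloseable {
        private final String columnFamily;
        private final List<String> streamed;
        private boolean sstableClosed = false;

        public SketchWriter(String columnFamily, List<String> streamed) {
            this.columnFamily = columnFamily;
            this.streamed = streamed;
        }

        @Override
        public void close() {
            // Step 1: finish the local sstable.
            sstableClosed = true;
            // Step 2: stream it; this can only happen once the sstable is closed.
            if (sstableClosed) {
                streamed.add(columnFamily);
            }
        }
    }

    // Builds one writer per named output (column family), in insertion order.
    private static Map<String, SketchWriter> openWriters(List<String> columnFamilies,
                                                         List<String> streamed) {
        Map<String, SketchWriter> writers = new LinkedHashMap<>();
        for (String cf : columnFamilies) {
            writers.put(cf, new SketchWriter(cf, streamed));
        }
        return writers;
    }

    // Closes every named output; returns the column families that were streamed.
    public static List<String> closeAll(List<String> columnFamilies) {
        List<String> streamed = new ArrayList<>();
        for (SketchWriter writer : openWriters(columnFamilies, streamed).values()) {
            writer.close();
        }
        return streamed;
    }

    // Models the suspected bug: only the first named output is ever closed.
    public static List<String> closeFirstOnly(List<String> columnFamilies) {
        List<String> streamed = new ArrayList<>();
        for (SketchWriter writer : openWriters(columnFamilies, streamed).values()) {
            writer.close();
            break; // the remaining writers are never closed, so never streamed
        }
        return streamed;
    }

    public static void main(String[] args) {
        List<String> columnFamilies = Arrays.asList("cf_a", "cf_b");
        System.out.println("closeAll streamed:       " + closeAll(columnFamilies));
        System.out.println("closeFirstOnly streamed: " + closeFirstOnly(columnFamilies));
    }
}
```

In this model the sstables for both column families exist, yet closeFirstOnly reports a stream only for the first one, which would match the "only one stream being sent" symptom if the guess is right.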
                
      was (Author: mkjellman):
    so obviously this is due to the handling in the close() function in BulkRecordWriter. So far i've been unable to get BOF to work in Local mode thru eclipse with multipleoutput. ConfigHelper is happy on the first check, but when the reducer is created the column family output names don't seem to be set. close() is pretty simple, looks like the sstable is first closed, and then streamed to the nodes. I'm guessing that either close is only being close on one of the sstables (i do see in a fully distributed cluster the sstables get created for multiple column families) but maybe we don't close it thus it never streams to the nodes?
                  
> BulkOutputFormat should support Hadoop MultipleOutput
> -----------------------------------------------------
>
>                 Key: CASSANDRA-4912
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4912
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Michael Kjellman
>
> Much like CASSANDRA-4208, BOF should support outputting to multiple column families. The current approach taken in the patch for COF results in only one stream being sent.
