Posted to common-dev@hadoop.apache.org by "Jerome Boulon (JIRA)" <ji...@apache.org> on 2009/01/13 19:32:06 UTC

[jira] Commented: (HADOOP-5018) Chukwa should support pipelined writers

    [ https://issues.apache.org/jira/browse/HADOOP-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663405#action_12663405 ] 

Jerome Boulon commented on HADOOP-5018:
---------------------------------------

Hi Ari,
I just want to let you know that I'm planning to remove the HDFS dependency:
1) the collector will first write to the local file system, and then 2) the data will be pushed to a pub/sub framework for use by real-time components.
Later on, the data will be moved to HDFS by a background thread or process.
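That staged write path can be sketched roughly as follows (a hypothetical illustration, not actual Chukwa classes; a plain directory stands in for HDFS):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.List;

// Hypothetical sketch: the collector writes to the local file system first;
// a background step later moves the closed file to the long-term store,
// here a plain directory standing in for HDFS.
public class StagedCollectorSketch {

    // Write chunks to a local spool file, then "move to HDFS" afterwards.
    static List<String> stageAndMove(List<String> chunks) throws IOException {
        Path spoolDir = Files.createTempDirectory("spool");       // fast local disk
        Path longTermDir = Files.createTempDirectory("longterm"); // stands in for HDFS

        Path spoolFile = spoolDir.resolve("chunks.log");
        Files.write(spoolFile, chunks);                  // 1) local write, no HDFS needed

        Path dest = longTermDir.resolve(spoolFile.getFileName());
        Files.move(spoolFile, dest, StandardCopyOption.REPLACE_EXISTING); // 2) background move

        return Files.readAllLines(dest);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(stageAndMove(List.of("chunk-1", "chunk-2"))); // [chunk-1, chunk-2]
    }
}
```

The point of the split is that the ingest path never blocks on HDFS availability; only the background mover touches it.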

Why 1 and 2?

1) because people may want to use Chukwa only to collect their data, without any Hadoop dependency
2) to be able to extend Chukwa easily, just by listening to an event.

The pub/sub framework will allow filtering by dataType and by tags such as source/cluster, for example.
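A minimal in-memory sketch of that filtering (all names here are illustrative, not a real pub/sub framework): subscribers register a dataType plus optional tag constraints, and publish delivers only matching chunks.

```java
import java.util.*;
import java.util.function.Consumer;

// Hypothetical pub/sub sketch: subscribers filter by dataType and by tags
// (e.g. source/cluster). Names are illustrative, not Chukwa's actual API.
public class ChunkBus {
    // A chunk carries a dataType plus free-form tags.
    record Chunk(String dataType, Map<String, String> tags, String payload) {}

    private record Subscription(String dataType, Map<String, String> tagFilter,
                                Consumer<Chunk> handler) {}

    private final List<Subscription> subs = new ArrayList<>();

    // Subscribe to one dataType; every tagFilter entry must match the chunk's tags.
    public void subscribe(String dataType, Map<String, String> tagFilter,
                          Consumer<Chunk> handler) {
        subs.add(new Subscription(dataType, tagFilter, handler));
    }

    public void publish(Chunk c) {
        for (Subscription s : subs) {
            boolean typeOk = s.dataType().equals(c.dataType());
            boolean tagsOk = s.tagFilter().entrySet().stream()
                    .allMatch(e -> e.getValue().equals(c.tags().get(e.getKey())));
            if (typeOk && tagsOk) s.handler().accept(c);
        }
    }

    public static void main(String[] args) {
        ChunkBus bus = new ChunkBus();
        List<String> seen = new ArrayList<>();
        bus.subscribe("SysLog", Map.of("cluster", "demo"), c -> seen.add(c.payload()));

        bus.publish(new Chunk("SysLog", Map.of("cluster", "demo"), "match"));
        bus.publish(new Chunk("SysLog", Map.of("cluster", "other"), "no-match"));
        System.out.println(seen); // [match]
    }
}
```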

I also want to solve the duplicate removal problem for chunks at the collector level.
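One common shape for collector-side duplicate removal is to key each chunk by (source, sequence id) and drop anything already accepted, so that agent retries don't produce duplicates. A minimal sketch, with hypothetical names:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of duplicate removal at the collector: track
// (source, sequence id) keys and drop chunks already accepted.
// Names are illustrative, not Chukwa's actual API.
public class ChunkDeduper {
    private final Set<String> seen = new HashSet<>();

    // Returns true if the chunk is new and should be written; false if duplicate.
    public boolean accept(String source, long seqId) {
        return seen.add(source + "/" + seqId);
    }

    public static void main(String[] args) {
        ChunkDeduper d = new ChunkDeduper();
        System.out.println(d.accept("agent-1", 42)); // true  (first delivery)
        System.out.println(d.accept("agent-1", 42)); // false (retried duplicate)
        System.out.println(d.accept("agent-1", 43)); // true
    }
}
```

A production version would need to bound or persist the seen-set (e.g. only track a sliding window per source), but the idea is the same.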

> Chukwa should support pipelined writers
> ---------------------------------------
>
>                 Key: HADOOP-5018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5018
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/chukwa
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>         Attachments: pipeline.patch
>
>
> We ought to support chaining together writers; this will radically increase flexibility and make it practical to add new features without major surgery by putting them in pass-through or filter classes.
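The chained-writer idea described above can be sketched as follows (interface and class names are illustrative, not the contents of pipeline.patch): each stage transforms or filters chunks and forwards them to the next writer.

```java
import java.util.List;

// Illustrative sketch of pipelined writers: each stage either transforms or
// filters chunks, then hands them to the next writer in the chain.
// Names are hypothetical, not the classes from pipeline.patch.
public class PipelineSketch {
    interface ChukwaWriter {
        void add(List<String> chunks);
    }

    // Terminal stage: stands in for the real storage writer.
    static class CollectingWriter implements ChukwaWriter {
        final List<String> out = new java.util.ArrayList<>();
        public void add(List<String> chunks) { out.addAll(chunks); }
    }

    // Pass-through stage: tags each chunk, then forwards it.
    static class TaggingWriter implements ChukwaWriter {
        private final ChukwaWriter next;
        TaggingWriter(ChukwaWriter next) { this.next = next; }
        public void add(List<String> chunks) {
            next.add(chunks.stream().map(c -> "[tagged] " + c).toList());
        }
    }

    // Filter stage: drops empty chunks before forwarding.
    static class DropEmptyWriter implements ChukwaWriter {
        private final ChukwaWriter next;
        DropEmptyWriter(ChukwaWriter next) { this.next = next; }
        public void add(List<String> chunks) {
            next.add(chunks.stream().filter(c -> !c.isEmpty()).toList());
        }
    }

    public static void main(String[] args) {
        CollectingWriter sink = new CollectingWriter();
        ChukwaWriter pipeline = new DropEmptyWriter(new TaggingWriter(sink));
        pipeline.add(List.of("a", "", "b"));
        System.out.println(sink.out); // [[tagged] a, [tagged] b]
    }
}
```

New behaviors then become new pass-through or filter stages, with no changes to the existing writers.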

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.