
Posted to common-dev@hadoop.apache.org by "Jerome Boulon (JIRA)" <ji...@apache.org> on 2009/01/16 00:58:59 UTC

[jira] Created: (HADOOP-5060) Create a generic aggregator for Chukwa

Create a generic aggregator for Chukwa
--------------------------------------

                 Key: HADOOP-5060
                 URL: https://issues.apache.org/jira/browse/HADOOP-5060
             Project: Hadoop Core
          Issue Type: New Feature
          Components: contrib/chukwa
            Reporter: Jerome Boulon


Create a generic way to compute aggregations on top of chukwaRecords, driven by a config file.
It should be able to:
- Work on several Chukwa streams
- Aggregate by time period
- Group by values for specific keys
- Provide a predefined list of functions (AVG, MIN, MAX, Counter->Rate conversion...)
- Work with new functions
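
To make the requirements above concrete, here is a minimal Python sketch of a config-driven aggregator. It is an illustration only, not Chukwa code: the config layout, the record fields (`ts`, `host`, `cpu`) and all function names are assumptions. Records from several streams could simply be concatenated before being passed in, and new functions are added by registering them in the `FUNCTIONS` table.

```python
from collections import defaultdict

# Hypothetical config; none of these field names come from Chukwa.
CONFIG = {
    "period_seconds": 60,                       # aggregate by time period
    "group_by": ["host"],                       # group by values for specific keys
    "metrics": {"cpu": ["AVG", "MIN", "MAX"]},  # functions to apply per metric
}

# Predefined functions; a new function is supported by adding an entry here.
FUNCTIONS = {
    "AVG": lambda values: sum(values) / len(values),
    "MIN": min,
    "MAX": max,
}


def aggregate(records, config, functions=FUNCTIONS):
    """Group records by (time bucket, group-by values) and apply each function."""
    period = config["period_seconds"]
    buckets = defaultdict(lambda: defaultdict(list))
    for rec in records:
        bucket_start = rec["ts"] - rec["ts"] % period
        key = (bucket_start,) + tuple(rec[k] for k in config["group_by"])
        for metric in config["metrics"]:
            if metric in rec:
                buckets[key][metric].append(rec[metric])
    return {
        key: {
            f"{metric}_{name}": functions[name](values)
            for metric, values in metrics.items()
            for name in config["metrics"][metric]
        }
        for key, metrics in buckets.items()
    }


def counter_to_rate(samples):
    """Turn (timestamp, counter) samples into per-second rates between
    consecutive samples. Counter resets are not handled in this sketch."""
    ordered = sorted(samples)
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(ordered, ordered[1:])
        if t1 > t0
    ]


records = [
    {"ts": 0, "host": "a", "cpu": 10.0},
    {"ts": 30, "host": "a", "cpu": 30.0},
    {"ts": 70, "host": "a", "cpu": 50.0},
    {"ts": 10, "host": "b", "cpu": 5.0},
]
out = aggregate(records, CONFIG)
rates = counter_to_rate([(0, 100), (60, 160), (120, 400)])
```

Keeping the functions in a lookup table is one way to satisfy both "predefined list" and "work with new functions"; Counter->Rate is kept separate here because it needs timestamps as well as values.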

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5060) Create a generic aggregator for Chukwa

Posted by "Jerome Boulon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676043#action_12676043 ] 

Jerome Boulon commented on HADOOP-5060:
---------------------------------------

It should be able to compute recursive aggregations in one pass, based on the key.
Ex.: for Aggregation Key = ABCD, the aggregates for ABC, AB, and A should be computed automatically.
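
A possible reading of this one-pass recursive aggregation, sketched in Python. It assumes the key is a sequence of components (A, B, C, D) and that the aggregate is a sum; both are assumptions made for illustration, and `rollup_by_prefix` is a hypothetical name.

```python
from collections import defaultdict


def rollup_by_prefix(records):
    """Sum values for each full key and every non-empty prefix of it, in one pass.

    `records` is an iterable of (key_parts, value) pairs, where key_parts is a
    tuple such as ("A", "B", "C", "D").
    """
    totals = defaultdict(float)
    for key_parts, value in records:
        # Each record contributes to ABCD, ABC, AB and A at the same time.
        for depth in range(1, len(key_parts) + 1):
            totals[key_parts[:depth]] += value
    return dict(totals)


totals = rollup_by_prefix([
    (("A", "B", "C", "D"), 1.0),
    (("A", "B", "C", "E"), 2.0),
    (("A", "X", "Y", "Z"), 4.0),
])
```

Each input record is read once, and every prefix level is updated while it is being processed, so no second pass over the detailed results is needed.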

The exact AVG will be computed at the minute level; an approximation will be used for the longer aggregation periods (5, 10, 30, etc.).
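
The comment does not say how the approximation is computed. One plausible interpretation, shown here as a Python sketch, is that the longer periods average the per-minute averages instead of the raw samples. The function names and sample data are hypothetical.

```python
def minute_averages(samples):
    """Exact per-minute averages from (timestamp_seconds, value) samples."""
    sums, counts = {}, {}
    for ts, value in samples:
        minute = ts // 60
        sums[minute] = sums.get(minute, 0.0) + value
        counts[minute] = counts.get(minute, 0) + 1
    return {m: sums[m] / counts[m] for m in sums}


def approximate_average(per_minute, start_minute, width):
    """Approximate a wider period's average as the mean of its minute averages."""
    window = [
        per_minute[m]
        for m in range(start_minute, start_minute + width)
        if m in per_minute
    ]
    return sum(window) / len(window) if window else None


# Three samples fall in minute 0 and one in minute 1.
samples = [(0, 10.0), (10, 20.0), (20, 30.0), (70, 100.0)]
per_minute = minute_averages(samples)            # {0: 20.0, 1: 100.0}
approx = approximate_average(per_minute, 0, 5)   # mean of minute averages
exact = sum(v for _, v in samples) / len(samples)
```

With unevenly distributed samples the two values differ (60.0 versus 40.0 here), which is why this is only an approximation. Carrying the per-minute sum and count forward would give the exact value at the cost of storing more state.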




[jira] Work started: (HADOOP-5060) Create a generic aggregator for Chukwa

Posted by "Jerome Boulon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HADOOP-5060 started by Jerome Boulon.


[jira] Assigned: (HADOOP-5060) Create a generic aggregator for Chukwa

Posted by "Jerome Boulon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerome Boulon reassigned HADOOP-5060:
-------------------------------------

    Assignee: Jerome Boulon


[jira] Commented: (HADOOP-5060) Create a generic aggregator for Chukwa

Posted by "Jerome Boulon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676045#action_12676045 ] 

Jerome Boulon commented on HADOOP-5060:
---------------------------------------

Aggregation will be done with MapReduce (M/R) over input data stored on HDFS, so not at the collector level.



[jira] Commented: (HADOOP-5060) Create a generic aggregator for Chukwa

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676065#action_12676065 ] 

Ari Rabkin commented on HADOOP-5060:
------------------------------------

There's substantial interest at Berkeley in doing some sort of in-collector aggregation, to get very short latencies.  It would be a big help if the code written for this JIRA were modular in such a way that it could be pulled out of the map-reduce framework, and run separately.   It's already pretty straightforward to extract Records in the collector.
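
As a rough illustration of what such modularity could look like, the Python sketch below keeps the aggregation state in a small class with no dependency on any framework, so the same code could be fed one value at a time in a collector or merged from partial results in a reduce step. `RunningStats` and its methods are hypothetical names, not part of Chukwa.

```python
class RunningStats:
    """Incremental MIN/MAX/AVG state that does not depend on any framework."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = None
        self.maximum = None

    def add(self, value):
        """Fold one value into the running state (streaming use)."""
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    def merge(self, other):
        """Combine with another partial result (batch or reduce-side use)."""
        if other.count == 0:
            return
        self.count += other.count
        self.total += other.total
        self.minimum = (other.minimum if self.minimum is None
                        else min(self.minimum, other.minimum))
        self.maximum = (other.maximum if self.maximum is None
                        else max(self.maximum, other.maximum))

    def average(self):
        return self.total / self.count if self.count else None


# Streaming use: values arrive one at a time.
streamed = RunningStats()
for v in [1.0, 2.0, 3.0, 4.0]:
    streamed.add(v)

# Batch use: two partial results are merged afterwards.
part_a, part_b = RunningStats(), RunningStats()
for v in [1.0, 2.0, 3.0]:
    part_a.add(v)
part_b.add(4.0)
part_a.merge(part_b)
```

Both paths end in the same state, which is the property that would let one implementation serve the low-latency and the map-reduce cases.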


[jira] Commented: (HADOOP-5060) Create a generic aggregator for Chukwa

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676042#action_12676042 ] 

Ari Rabkin commented on HADOOP-5060:
------------------------------------

Clarification:

Is the idea to aggregate off HDFS, or do it at the collector?
