Posted to dev@chukwa.apache.org by "Jerome Boulon (JIRA)" <ji...@apache.org> on 2009/04/17 02:22:15 UTC

[jira] Created: (CHUKWA-146) All hadoop logs should use a different RecordType

All hadoop logs should use a different RecordType
-------------------------------------------------

                 Key: CHUKWA-146
                 URL: https://issues.apache.org/jira/browse/CHUKWA-146
             Project: Hadoop Chukwa
          Issue Type: Improvement
            Reporter: Jerome Boulon
            Priority: Critical


All Hadoop logs currently use the same RecordType, so only one reducer processes all log files (other than DN, NN, Audit).
This causes a skew issue at the M/R level.
So each Hadoop log should use its own RecordType.

Note:
- Using the cluster information in the ChukwaRecordPartitioner will also help.
- Using a predefined list of RecordType/reducer associations will also help, by preventing two log RecordTypes from going to the same reducer.
  The dynamic assignment ((hashCode() & Integer.MAX_VALUE) % numReduceTasks) could be used as a fallback mechanism.
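
A minimal sketch of the second note, assuming a plain Java class rather than the real ChukwaRecordPartitioner (whose code is not shown in this issue). The class name, the pin() helper and the RecordType names in the usage note are hypothetical; only the fallback expression comes from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustration only: this is NOT the actual ChukwaRecordPartitioner.
 * It shows a predefined RecordType/reducer list with a hash fallback.
 */
final class RecordTypePartitionSketch {

    // Hypothetical predefined RecordType -> reducer index association.
    private final Map<String, Integer> pinned = new HashMap<>();

    /** Reserve a reducer index for one RecordType (hypothetical helper). */
    public RecordTypePartitionSketch pin(String recordType, int reducer) {
        pinned.put(recordType, reducer);
        return this;
    }

    /** Returns a reducer index in [0, numReduceTasks). */
    public int partition(String recordType, int numReduceTasks) {
        Integer reducer = pinned.get(recordType);
        // Use the predefined reducer only if it exists in this job.
        if (reducer != null && reducer >= 0 && reducer < numReduceTasks) {
            return reducer;
        }
        // Fallback quoted in the issue: the mask clears the sign bit so
        // the modulo result is never negative.
        return (recordType.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

For example, new RecordTypePartitionSketch().pin("HadoopLog_JobTracker", 0).pin("HadoopLog_TaskTracker", 1) would send those two made-up types to reducers 0 and 1. Any type not in the list still goes through the hash, so it can land on a reducer that is already pinned; the list only guarantees separation between the types it names.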


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CHUKWA-146) All hadoop logs should use a different RecordType

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699942#action_12699942 ] 

Eric Yang commented on CHUKWA-146:
----------------------------------

+1 on the design.



[jira] Commented: (CHUKWA-146) All hadoop logs should use a different RecordType

Posted by "Mac Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699948#action_12699948 ] 

Mac Yang commented on CHUKWA-146:
---------------------------------

Would do a +10 on this one if I could.
