You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Ming Ma (JIRA)" <ji...@apache.org> on 2014/06/28 00:51:24 UTC

[jira] [Commented] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

    [ https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046497#comment-14046497 ] 

Ming Ma commented on MAPREDUCE-207:
-----------------------------------

Thanks, Gera. Nice work and this will be quite useful. Overall it looks good. Per offline discussion with Gera,

1. It is unclear if there is any security related implication such as https://issues.apache.org/jira/browse/MAPREDUCE-5663.
2. The compatibility between new MR client with this feature and cluster with old MR. Given new MR client won't compute the split by default; the job will fail if the cluster still uses old MR. So in this case, new MR client needs to be configured to compute split. For a more general case where new MR client can talk to some cluster with old MR and some cluster with new MR, it will be nice if client can discover if the cluster supports this feature.

> Computing Input Splits on the MR Cluster
> ----------------------------------------
>
>                 Key: MAPREDUCE-207
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster, mrv2
>            Reporter: Philip Zeyliger
>            Assignee: Arun C Murthy
>         Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch, MAPREDUCE-207.v03.patch, MAPREDUCE-207.v05.patch
>
>
> Instead of computing the input splits as part of job submission, Hadoop could have a separate "job task type" that computes the input splits, therefore allowing that computation to happen on the cluster.



--
This message was sent by Atlassian JIRA
(v6.2#6252)