Posted to common-dev@hadoop.apache.org by "Shevek (JIRA)" <ji...@apache.org> on 2009/04/24 22:56:30 UTC

[jira] Issue Comment Edited: (HADOOP-1230) Replace parameters with context objects in Mapper, Reducer, Partitioner, InputFormat, and OutputFormat classes

    [ https://issues.apache.org/jira/browse/HADOOP-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702547#action_12702547 ] 

Shevek edited comment on HADOOP-1230 at 4/24/09 1:56 PM:
---------------------------------------------------------

You might find that unless you pass (Context, Key, Value) as parameters to map(), it is very hard to implement ChainedMapper, since you will have to delegate an entire Context. It will also be very hard to do the things I want to do with Hadoop. Unless I hear a good argument otherwise, I will submit a new ticket.
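To make the delegation cost concrete, here is a minimal, self-contained sketch; MapContext, ChainingContext and the collect()/progress() signatures below are illustrative stand-ins, not the actual Hadoop or ChainedMapper classes:

import java.io.IOException;

// Illustrative stand-in for the proposed context object, not the real API.
interface MapContext<K, V> {
  K getKey();
  V getValue();
  void collect(Object key, Object value) throws IOException;
  void progress();
}

// With map(MapContext) only, a chaining mapper that wants to hand the next
// mapper a different key/value must wrap the whole context and forward every
// other method to the underlying delegate.
class ChainingContext<K, V> implements MapContext<K, V> {
  private final MapContext<?, ?> delegate;
  private final K key;
  private final V value;

  ChainingContext(MapContext<?, ?> delegate, K key, V value) {
    this.delegate = delegate;
    this.key = key;
    this.value = value;
  }

  public K getKey()   { return key; }
  public V getValue() { return value; }
  public void collect(Object k, Object v) throws IOException { delegate.collect(k, v); }
  public void progress() { delegate.progress(); }
}

With a map(Context, Key, Value) signature the wrapper disappears: the chain can simply call nextMapper.map(context, newKey, newValue) and pass the shared context through untouched.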

If you want to lazily deserialize the values, there are a couple of options (fleshed out in the sketch after this list):

(a) Choice of methods on input object:
RecordInput.getKey() { return deserialize(getKeyBytes()); }
map(Context, RecordInput) { /* call input.getKey() or input.getKeyBytes() as needed */ }

(b) Choice of methods to override in Mapper:
Mapper.map(Context ctx, byte[] keyBytes, byte[] valueBytes) { map(ctx, deserialize(keyBytes), deserialize(valueBytes)); }
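
Fleshing those two options out a little (all names here are illustrative, not proposed API; LazyMapper and the deserializeKey()/deserializeValue() hooks are mine, only RecordInput and the byte[] map() variant come from the sketch above):

import java.io.IOException;

// (a) The input object carries the raw bytes and deserializes on demand, so
//     map() can ask for either getKey() or getKeyBytes().
abstract class RecordInput<K, V> {
  private final byte[] keyBytes;
  private final byte[] valueBytes;

  RecordInput(byte[] keyBytes, byte[] valueBytes) {
    this.keyBytes = keyBytes;
    this.valueBytes = valueBytes;
  }

  public byte[] getKeyBytes()   { return keyBytes; }
  public byte[] getValueBytes() { return valueBytes; }
  public K getKey()   { return deserializeKey(keyBytes); }     // lazy: only runs if called
  public V getValue() { return deserializeValue(valueBytes); } // lazy: only runs if called

  protected abstract K deserializeKey(byte[] bytes);
  protected abstract V deserializeValue(byte[] bytes);
}

// (b) The framework-facing map() takes raw bytes and deserializes before
//     calling the typed map(); a Mapper that only needs the bytes overrides
//     the byte[] variant and skips deserialization entirely.
abstract class LazyMapper<CONTEXT, K, V> {
  public void map(CONTEXT ctx, byte[] keyBytes, byte[] valueBytes) throws IOException {
    map(ctx, deserializeKey(keyBytes), deserializeValue(valueBytes));
  }

  public abstract void map(CONTEXT ctx, K key, V value) throws IOException;

  protected abstract K deserializeKey(byte[] bytes);
  protected abstract V deserializeValue(byte[] bytes);
}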

      was (Author: arren):
    You might find that unless you pass (Context, Key, Value) as parameters to map(), it is very hard to implement ChainedMapper, since you will have to delegate an entire Context. It will also be very hard to do the things I want to do with Hadoop. Unless I hear a good argument otherwise, I will submit a new ticket.
  
> Replace parameters with context objects in Mapper, Reducer, Partitioner, InputFormat, and OutputFormat classes
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1230
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1230
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.20.0
>
>         Attachments: context-objs-2.patch, context-objs-3.patch, context-objs.patch, h1230.patch, h1230.patch, h1230.patch, h1230.patch, h1230.patch
>
>
> This is a big change, but it will future-proof our APIs. To maintain backwards compatibility, I'd suggest that we move over to a new package name (org.apache.hadoop.mapreduce) and deprecate the old interfaces and package. Basically, it will replace:
> package org.apache.hadoop.mapred;
> public interface Mapper extends JobConfigurable, Closeable {
>   void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter) throws IOException;
> }
> with:
> package org.apache.hadoop.mapreduce;
> public interface Mapper extends Closeable {
>   void map(MapContext context) throws IOException;
> }
> where MapContext has methods like getKey(), getValue(), collect(Key, Value), progress(), etc.
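
As an illustration of what user code against such a pull-style context might look like, here is a self-contained sketch; MapContext below is a stand-in built from the methods named in the description, not the real org.apache.hadoop.mapreduce type, and IdentityMapper is a hypothetical example class:

import java.io.IOException;

// Stand-in for the MapContext described above; not the real Hadoop interface.
interface MapContext {
  Object getKey();
  Object getValue();
  void collect(Object key, Object value) throws IOException;
  void progress();
}

// An identity mapper in the context-object style: every input and output goes
// through the single context parameter, so accessors can be added to
// MapContext later without changing this method signature.
class IdentityMapper {
  public void map(MapContext context) throws IOException {
    context.collect(context.getKey(), context.getValue());
    context.progress();
  }
}

That single-parameter signature is the future-proofing the description aims at: new context methods do not break existing map() implementations.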

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.