Posted to mapreduce-dev@hadoop.apache.org by Bjorn Jonsson <bj...@gmail.com> on 2013/04/04 08:05:32 UTC

Builder pattern for Configuration and Job, and no static .setInputPaths(job,path)

Hi all,

I keep running into friction with having to use static calls such as:
DistributedCache.addFileToClassPath()
FileInputFormat.setInputPaths()
FileOutputFormat.setOutputPath()
Job.getInstance(conf)

For reasons specific to my use case, I would like to write something close
to this:

Configuration conf = Configuration
        .with("mapred.job.tracker", "10.10.10.10:8021")
        .with("fs.defaultFS", "hdfs://10.10.10.10:8020")
        .build();

Job job = Job
        .withConfig(conf)
        .withJarByClass(MyJob.class)
        .withJobName("My MR Job")
        .withMapperClass(MyMapper.class)
        .withReducerClass(MyReducer.class)
        .withMapOutputKeyClass(LongWritable.class)
        .withMapOutputValueClass(Text.class)
        .withOutputKeyClass(LongWritable.class)
        .withOutputValueClass(Text.class)
        .withLibJars(new Path("..."))
        .withInputPaths(new Path("..."))
        .withOutputPath(new Path("..."))
        .build();

job.waitForCompletion(true);

Has something like this been considered? At the very least, could the
static setInputPaths and setOutputPath be moved onto Job as instance
methods? The demo code in the Javadoc for 2.0.3 appears to use them that
way, but I could not find an implementation. Am I missing something here?
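
To make the shape of the idea concrete, here is a minimal sketch of the
fluent style from the snippet above, written without any Hadoop dependency
so it compiles on its own. JobSpec and its Builder are hypothetical names
made up for illustration; they are not Hadoop classes, and the paths are
plain strings rather than Path objects. A real wrapper would presumably
create the Job in build() and delegate to the existing static setters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: JobSpec and Builder are illustrative names, not Hadoop classes.
final class JobSpec {
    final Map<String, String> conf;
    final String jobName;
    final List<String> inputPaths;
    final String outputPath;

    private JobSpec(Builder b) {
        // Defensive copies so the built spec cannot change afterwards.
        conf = Collections.unmodifiableMap(new LinkedHashMap<>(b.conf));
        jobName = b.jobName;
        inputPaths = Collections.unmodifiableList(new ArrayList<>(b.inputPaths));
        outputPath = b.outputPath;
    }

    // Entry point, mirroring the proposed Configuration.with(key, value).
    static Builder with(String key, String value) {
        return new Builder().with(key, value);
    }

    static final class Builder {
        private final Map<String, String> conf = new LinkedHashMap<>();
        private final List<String> inputPaths = new ArrayList<>();
        private String jobName;
        private String outputPath;

        Builder with(String key, String value) {
            conf.put(key, value);
            return this;
        }

        Builder withJobName(String name) {
            jobName = name;
            return this;
        }

        Builder withInputPaths(String... paths) {
            Collections.addAll(inputPaths, paths);
            return this;
        }

        Builder withOutputPath(String path) {
            outputPath = path;
            return this;
        }

        // Validate once, then return an immutable spec. In a real wrapper this
        // is where the Job would be created and the existing static calls
        // (FileInputFormat.setInputPaths, FileOutputFormat.setOutputPath) made.
        JobSpec build() {
            if (inputPaths.isEmpty() || outputPath == null) {
                throw new IllegalStateException("input and output paths are required");
            }
            return new JobSpec(this);
        }
    }
}
```

With this, JobSpec.with("fs.defaultFS", "...").withJobName("My MR Job")
.withInputPaths("...").withOutputPath("...").build() reads the same way as
the Job chain above, and a missing input or output path fails at build()
rather than at submission time.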

Best,
Bjorn