Posted to common-user@hadoop.apache.org by Adam Pridgen <ad...@thecoverofnight.com> on 2011/02/09 02:03:05 UTC
Printing Debug Messages and Setting up Mapper Classes Programmatically
Hello,
I am trying to set up my Mapper class before it runs as a
task. Specifically, I am trying to override the method
Mapper.setup(Mapper.Context). When I run the MapReduce program I
expect about 6 lines of output on stdout, along with the
configuration information read out of the Context. I have two
questions:
--- Am I setting up the mapper task correctly?
--- Do I need to print debug messages through an API of some sort, or
is printing to stdout OK?
I am using the Hadoop 0.21 API with a single-host setup. Below is a
sample of how I am using the setup method. Any help is greatly
appreciated.
public static class MyMapperClass extends
        Mapper<Object, Text, Text, Text> {
    // Settings read once per task from the job Configuration.
    private Integer daysWindow;
    private static final HashMap<Long, HashSet<Long>>
            windowedIpAddresses = new HashMap<Long, HashSet<Long>>();
    private Long minTimestamp, maxTimestamp;
    private Configuration configuration;

    // Called once per task, before the first call to map().
    @Override
    public void setup(Context context)
            throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        System.out.println("Map.configure();");
        this.configuration = conf;
        this.minTimestamp = this.configuration.getLong(
                MIN_TIMESTAMP_WINDOW, 0);
        this.maxTimestamp = this.configuration.getLong(
                MAX_TIMESTAMP_WINDOW, 0);
        this.daysWindow = this.configuration.getInt(DAYS_WINDOW, 0);
        System.out.println("\n\n\n\n\n\n\nminTimestamp := "
                + Long.toString(minTimestamp) + " maxTimestamp := "
                + Long.toString(maxTimestamp));
    }

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {}
}
Re: Printing Debug Messages and Setting up Mapper Classes Programmatically
Posted by Adam Pridgen <ad...@thecoverofnight.com>.
Thanks for your help. I will look at log4j.
-- Adam
Re: Printing Debug Messages and Setting up Mapper Classes Programmatically
Posted by Harsh J <qw...@gmail.com>.
Hello,
On Wed, Feb 9, 2011 at 6:33 AM, Adam Pridgen
<ad...@thecoverofnight.com> wrote:
> Hello,
>
> I am trying to set up my Mapper class before it runs as a
> task. Specifically, I am trying to override the method
> Mapper.setup(Mapper.Context). When I run the MapReduce program I
> expect about 6 lines of output on stdout, along with the
> configuration information read out of the Context. I have two
> questions:
>
> --- Am I setting up the mapper task correctly?
Yes, overriding the setup method (annotated with @Override) is the
right way to do this with the new Mapper API.
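To illustrate why the override works, here is a minimal sketch of the lifecycle, written against the plain JDK so it runs without the Hadoop jars. MiniMapper is a hypothetical stand-in for Hadoop's Mapper, whose run() method calls setup() once and then map() for each record; WindowMapper, the config keys, and demo() are likewise made up for illustration and are not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SetupLifecycleSketch {

    /** Hypothetical stand-in for Hadoop's Mapper (not the real class). */
    static abstract class MiniMapper {
        // Default no-op hook; subclasses override it to read configuration.
        protected void setup(Map<String, Long> conf) {}

        protected abstract void map(String record, List<String> out);

        // setup() runs exactly once, then map() runs once per record.
        final List<String> run(Map<String, Long> conf, List<String> records) {
            List<String> out = new ArrayList<>();
            setup(conf);
            for (String record : records) {
                map(record, out);
            }
            return out;
        }
    }

    /** Keeps only records whose timestamp falls inside the configured window. */
    static class WindowMapper extends MiniMapper {
        private long minTimestamp;
        private long maxTimestamp;

        @Override
        protected void setup(Map<String, Long> conf) {
            // Made-up keys; a missing key falls back to 0, like getLong(key, 0).
            minTimestamp = conf.getOrDefault("min.timestamp.window", 0L);
            maxTimestamp = conf.getOrDefault("max.timestamp.window", 0L);
        }

        @Override
        protected void map(String record, List<String> out) {
            long ts = Long.parseLong(record);
            if (ts >= minTimestamp && ts <= maxTimestamp) {
                out.add(record);
            }
        }
    }

    static List<String> demo() {
        Map<String, Long> conf = new HashMap<>();
        conf.put("min.timestamp.window", 10L);
        conf.put("max.timestamp.window", 20L);
        return new WindowMapper().run(conf, Arrays.asList("5", "10", "15", "25"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [10, 15]
    }
}
```

The @Override annotation is worth keeping: if the class extended the wrong base type, or the method signature did not match, the compiler would reject it instead of silently never calling setup().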
> --- Do I need to print debug messages through an API of some sort, or
> is printing to stdout OK?
While stdout is okay to use, and its output does get stored in the
stdout/stderr files of the Task on the TaskTracker machine, it makes
more sense to use a logger API for debugging: you get automatic
timestamps, severity levels, class names, etc. Logger output is much
easier to replay in your head than raw stdout while debugging.
Hadoop ships with the commons-logging and log4j libraries, ready to use out of the box.
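As a rough illustration of what a logger adds over println, the sketch below uses the JDK's built-in java.util.logging. That choice is an assumption made only so the example runs without extra jars; inside a Hadoop task you would more likely obtain a commons-logging Log via LogFactory.getLog(YourMapper.class) and call LOG.info(...). The class name LoggerSketch, the CapturingHandler helper, and demo() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LoggerSketch {
    // One logger per class, so every message carries its origin.
    private static final Logger LOG =
            Logger.getLogger(LoggerSketch.class.getName());

    /** Hypothetical helper: keeps formatted records in memory for inspection. */
    static class CapturingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (!isLoggable(record)) {
                return;
            }
            lines.add(record.getLevel() + " " + record.getLoggerName()
                    + ": " + record.getMessage());
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    static List<String> demo(long minTimestamp, long maxTimestamp) {
        CapturingHandler handler = new CapturingHandler();
        LOG.addHandler(handler);
        LOG.setLevel(Level.INFO); // messages below INFO are dropped by the logger
        try {
            LOG.info("minTimestamp := " + minTimestamp
                    + " maxTimestamp := " + maxTimestamp);
            LOG.fine("per-record detail, filtered out at INFO level");
        } finally {
            LOG.removeHandler(handler);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        // The default console handler also writes a timestamped copy to stderr.
        for (String line : demo(10L, 20L)) {
            System.out.println(line);
        }
    }
}
```

The severity level is the main gain here: raising or lowering the threshold turns the chatty per-record messages on or off without touching the code that emits them.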
--
Harsh J
www.harshj.com