Posted to common-user@hadoop.apache.org by Adam Pridgen <ad...@thecoverofnight.com> on 2011/02/09 02:03:05 UTC

Printing Debug Messages and Setting up Mapper Classes Programmatically

Hello,

I am trying to set up my Mapper class before it runs as a
task.  Specifically, I am trying to override the method
Mapper.setup(Mapper.Context).  When I run the MapReduce program I am
expecting about 6 lines of output on stdout, along with the
configuration information read out of the Context.  I have two
questions:

--- Am I setting up the mapper task correctly?
--- Do I need to print/debug messages through an API of some sort, or
is printing output to stdout OK?

I am using the 0.21 Hadoop API with a single-host setup.  Below is a
sample of how I am using the setup method.  Any help is greatly
appreciated.



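	// Assumes these imports in the enclosing file: java.io.IOException,
	// java.util.HashMap, java.util.HashSet,
	// org.apache.hadoop.conf.Configuration, org.apache.hadoop.io.Text,
	// org.apache.hadoop.mapreduce.Mapper. MIN_TIMESTAMP_WINDOW,
	// MAX_TIMESTAMP_WINDOW and DAYS_WINDOW are assumed to be String
	// constants defined in the enclosing class.
	public static class MyMapperClass extends
			Mapper<Object, Text, Text, Text> {

		private Integer daysWindow;
		private static final HashMap<Long, HashSet<Long>> windowedIpAddresses =
				new HashMap<Long, HashSet<Long>>();

		private Long minTimestamp, maxTimestamp;
		private Configuration configuration;

		// Called once per task, before any call to map().
		@Override
		public void setup(Context context)
				throws IOException, InterruptedException {

			Configuration conf = context.getConfiguration();
			System.out.println("Map.configure();");
			this.configuration = conf;

			// Each value falls back to 0 when its key is not set on the job.
			this.minTimestamp = this.configuration.getLong(
					MIN_TIMESTAMP_WINDOW, 0);
			this.maxTimestamp = this.configuration.getLong(
					MAX_TIMESTAMP_WINDOW, 0);
			this.daysWindow = this.configuration.getInt(DAYS_WINDOW, 0);
			System.out.println("\n\n\n\n\n\n\nminTimestamp := "
					+ Long.toString(minTimestamp) + " maxTimestamp := "
					+ Long.toString(maxTimestamp));
		}

		@Override
		public void map(Object key, Text value, Context context)
				throws IOException, InterruptedException {}
	}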

Re: Printing Debug Messages and Setting up Mapper Classes Programmatically

Posted by Adam Pridgen <ad...@thecoverofnight.com>.
Thanks for your help.  I will look at log4j.

-- Adam

Re: Printing Debug Messages and Setting up Mapper Classes Programmatically

Posted by Harsh J <qw...@gmail.com>.
Hello,

On Wed, Feb 9, 2011 at 6:33 AM, Adam Pridgen
<ad...@thecoverofnight.com> wrote:
> Hello,
>
> I am trying to set up my Mapper class before it runs as a
> task.  Specifically, I am trying to override the method
> Mapper.setup(Mapper.Context).  When I run the MapReduce program I am
> expecting about 6 lines of output on stdout, along with the
> configuration information read out of the Context.  I have two
> questions:
>
> --- Am I setting up the mapper task correctly?

Yes, overriding the setup method (annotated with @Override) is the
right way to do this with the new Mapper API.
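
As a rough illustration of why setup is the right hook, here is a sketch of the call order a new-API mapper task follows: setup once, then map for each record, then cleanup. SketchMapper, WindowMapper, runTask and the "min.timestamp.window" key are hypothetical stand-ins written only for this example, not Hadoop classes or settings, so that it runs without Hadoop on the classpath. In a real job you extend org.apache.hadoop.mapreduce.Mapper and the framework drives these calls for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LifecycleDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("min.timestamp.window", "100"); // hypothetical config key
        WindowMapper mapper = new WindowMapper();
        List<String> kept = mapper.runTask(conf, Arrays.asList("50", "100", "150"));
        System.out.println("setup calls: " + mapper.setupCalls + ", kept: " + kept);
    }
}

// Hypothetical stand-in for the mapper lifecycle; NOT a Hadoop class.
abstract class SketchMapper {
    protected void setup(Map<String, String> conf) {}
    protected abstract void map(String value, List<String> out);
    protected void cleanup(List<String> out) {}

    // Same order as a mapper task: setup once, map per record, cleanup once.
    final List<String> runTask(Map<String, String> conf, List<String> records) {
        List<String> out = new ArrayList<>();
        setup(conf);
        for (String record : records) {
            map(record, out);
        }
        cleanup(out);
        return out;
    }
}

// Reads its lower bound once in setup, then reuses it for every record.
class WindowMapper extends SketchMapper {
    private long minTimestamp;
    int setupCalls = 0;

    @Override
    protected void setup(Map<String, String> conf) {
        setupCalls++;
        // Falls back to 0 when the key is absent.
        minTimestamp = Long.parseLong(conf.getOrDefault("min.timestamp.window", "0"));
    }

    @Override
    protected void map(String value, List<String> out) {
        // Keep only records whose timestamp is at or above the lower bound.
        if (Long.parseLong(value) >= minTimestamp) {
            out.add(value);
        }
    }
}
```

The point of the sketch is that per-task state such as the timestamp window is read once in setup and then reused by every map call.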

> --- Do I need to print/debug messages through an API of some sort, or
> is printing output to stdout OK?

While stdout is okay to use, and its output does get stored in the
stdout/stderr files of the Task on the TaskTracker machine, it makes
more sense to use a logger API for debugging, since it gives you
automatic timestamps, severity levels, class names, etc. Logger output
is also much easier to follow than raw stdout while debugging.

Hadoop comes with commons-logging and log4j libraries for use out-of-the-box.
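
A minimal sketch of what a logger adds over a bare println. It uses java.util.logging from the JDK rather than commons-logging or log4j, purely so that it runs without extra jars; that substitution, the LoggingSketch class and the CapturingHandler helper are assumptions made for this example. With commons-logging, the analogous calls would be LogFactory.getLog(MyMapperClass.class) to obtain the logger and LOG.info(...) or LOG.debug(...) to write to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LoggingSketch {
    private static final Logger LOG = Logger.getLogger(LoggingSketch.class.getName());

    // Hypothetical helper: keeps records in memory so they can be inspected.
    static class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                records.add(record);
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    // Logs the values a setup method might read, at two severity levels.
    static List<LogRecord> logSetup(long minTimestamp, long maxTimestamp) {
        CapturingHandler handler = new CapturingHandler();
        handler.setLevel(Level.ALL);
        LOG.setLevel(Level.INFO); // FINE (debug-level) messages are filtered out
        LOG.addHandler(handler);
        try {
            LOG.info("minTimestamp=" + minTimestamp + " maxTimestamp=" + maxTimestamp);
            LOG.fine("verbose detail, dropped while the level is INFO");
        } finally {
            LOG.removeHandler(handler);
        }
        return handler.records;
    }

    public static void main(String[] args) {
        // Each record carries a level, a logger name and a timestamp for free.
        for (LogRecord r : logSetup(10L, 20L)) {
            System.out.println(r.getMillis() + " " + r.getLevel() + " "
                    + r.getLoggerName() + " " + r.getMessage());
        }
    }
}
```

Raising or lowering the level changes how much is recorded without touching the logging calls themselves, which is hard to do with scattered println statements.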

-- 
Harsh J
www.harshj.com