Posted to user@hadoop.apache.org by unmesha sreeveni <un...@gmail.com> on 2014/06/09 11:31:08 UTC

Counters in MapReduce

I am trying to iterate with MapReduce. I have three jobs running in sequence:

// job1 configuration
FileInputFormat.addInputPath(job1, new Path(args[0]));
FileOutputFormat.setOutputPath(job1, out1);
job1.waitForCompletion(true);

// job2 configuration
FileInputFormat.addInputPath(job2, out1);
FileOutputFormat.setOutputPath(job2, out2);
job2.waitForCompletion(true);

// job3 configuration
FileInputFormat.addInputPath(job3, out2);
FileOutputFormat.setOutputPath(job3, new Path(args[1]));
boolean success = job3.waitForCompletion(true);
return success ? 0 : 1;

After job3, the iteration should continue: job3's output should become the
input to job1, and the loop should repeat until the input file is empty.
How can I accomplish this?

Will counters do the job?
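The control flow I am after, with the Hadoop specifics stubbed out, looks like this (runRound is a made-up stand-in for one job1 -> job2 -> job3 pass; here it just halves the record count so the sketch terminates):

```java
public class IterationSketch {
    // Stand-in for one pass of job1 -> job2 -> job3. It returns how many
    // records job3 wrote, i.e. what a counter read via
    // job3.getCounters().findCounter(...).getValue() would report.
    static long runRound(long records) {
        return records / 2; // placeholder shrink step, not real job logic
    }

    // Loop until job3's output is empty; returns the iteration count.
    public static int iterate(long initialRecords) {
        long remaining = initialRecords;
        int iterations = 0;
        while (remaining > 0) {
            remaining = runRound(remaining); // job3's output feeds job1 next
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(iterate(8)); // 8 -> 4 -> 2 -> 1 -> 0
    }
}
```

The real driver would replace runRound with the three job submissions and read the record count from a job3 counter.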

-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/

Re: Counters in MapReduce

Posted by unmesha sreeveni <un...@gmail.com>.
Yes, I rectified that error. But after the first iteration, when it enters
the second iteration, it shows a java.io.FileNotFoundException for
Path out1 = new Path(CL);

Why is that? Normally it should work this way; the only requirement is that
the output folder must not already exist.

// other configuration
job1.setMapperClass(ID3ClsLabelMapper.class);
job1.setReducerClass(ID3ClsLabelReducer.class);
Path in = new Path(args[0]);
Path out1 = new Path(CL);
// delete the output folder if it exists
if (fs.exists(out1)) {
    fs.delete(out1, true);
}
if (counter == 0) {
    FileInputFormat.addInputPath(job1, in);
} else {
    FileInputFormat.addInputPath(job1, out5);
}
FileOutputFormat.setOutputPath(job1, out1);
job1.waitForCompletion(true);
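To make the intended guard concrete, here is the same delete-before-write pattern exercised against the local filesystem with java.nio (the names are placeholders; on HDFS the equivalent is fs.exists(out1) followed by fs.delete(out1, true) on org.apache.hadoop.fs.FileSystem):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class OutputGuard {
    // Remove a leftover output directory so the next iteration's job does
    // not fail because the output path already exists. Recursive, like
    // fs.delete(path, true) in the Hadoop FileSystem API.
    static void clearOutputDir(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return; // nothing left over from a previous iteration
        }
        try (Stream<Path> entries = Files.walk(dir)) {
            entries.sorted(Comparator.reverseOrder()) // children before parent
                   .forEach(p -> p.toFile().delete());
        }
    }
}
```

The point is only the order of operations: clear out1 first, then call FileOutputFormat.setOutputPath(job1, out1) and submit.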


On Thu, Jun 12, 2014 at 10:29 AM, unmesha sreeveni <un...@gmail.com>
wrote:

> I tried setting an enum to count the number of lines in the output file
> from job3.
>
> But I am getting
> 14/06/12 10:12:30 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=1238630400
> conf3
> Exception in thread "main" java.lang.IllegalStateException: Job in state
> DEFINE instead of RUNNING
> at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
>  at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)
>
>
> Below is my current code
>
> static enum UpdateCounter {
>     INCOMING_ATTR
> }
>
> public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     int res = ToolRunner.run(conf, new Driver(), args);
>     System.exit(res);
> }
>
> @Override
> public int run(String[] args) throws Exception {
>   while (counter >= 0) {
>     Configuration conf = getConf();
>     /*
>      * Job 1
>      */
>     Job job1 = new Job(conf, "");
>     // other configuration
>     job1.setMapperClass(ID3ClsLabelMapper.class);
>     job1.setReducerClass(ID3ClsLabelReducer.class);
>     Path in = new Path(args[0]);
>     Path out1 = new Path(CL);
>     if (counter == 0) {
>         FileInputFormat.addInputPath(job1, in);
>     } else {
>         FileInputFormat.addInputPath(job1, out5);
>     }
>     FileInputFormat.addInputPath(job1, in);
>     FileOutputFormat.setOutputPath(job1, out1);
>     job1.waitForCompletion(true);
>     /*
>      * Job 2
>      */
>     Configuration conf2 = getConf();
>     Job job2 = new Job(conf2, "");
>     Path out2 = new Path(ANC);
>     FileInputFormat.addInputPath(job2, in);
>     FileOutputFormat.setOutputPath(job2, out2);
>     job2.waitForCompletion(true);
>     /*
>      * Job 3
>      */
>     Configuration conf3 = getConf();
>     Job job3 = new Job(conf3, "");
>     System.out.println("conf3");
>     Path out5 = new Path(args[1]);
>     if (fs.exists(out5)) {
>         fs.delete(out5, true);
>     }
>     FileInputFormat.addInputPath(job3, out2);
>     FileOutputFormat.setOutputPath(job3, out5);
>     job3.waitForCompletion(true);
>     FileInputFormat.addInputPath(job3, new Path(args[0]));
>     FileOutputFormat.setOutputPath(job3, out5);
>     job3.waitForCompletion(true);
>     counter = job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();
>   }
>   return 0;
> }
>
>  Am I doing anything wrong?
>
>
> On Mon, Jun 9, 2014 at 4:37 PM, Krishna Kumar <kk...@nanigans.com> wrote:
>
>> You should use FileStatus to decide which files you want to include in
>> the InputPath, and use the FileSystem class to delete or process the
>> intermediate/final paths. Moving each job in your iteration logic into a
>> separate method would help keep things simple.
>>
>>
>>
>> From: unmesha sreeveni <un...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Monday, June 9, 2014 at 6:02 AM
>> To: User Hadoop <us...@hadoop.apache.org>
>> Subject: Re: Counters in MapReduce
>>
>> OK, I will check out counters. After the 1st iteration, the input file
>> to job1 will be the output file of job3. How do I give that, in order to
>> satisfy two conditions:
>> First iteration: the user's input file.
>> After the first iteration: job3's output file as job1's input.
>>
>>
>>
>>> --
>>> *Thanks & Regards*
>>>
>>>
>>> *Unmesha Sreeveni U.B*
>>> *Hadoop, Bigdata Developer*
>>> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
>>> http://www.unmeshasreeveni.blogspot.in/
>>>
>>>
>>>
>>>   ------------------------------
>>> *Kai Voigt* Am Germaniahafen 1 k@123.org
>>>  24143 Kiel +49 160 96683050
>>>  Germany @KaiVoigt
>>>
>>>
>>
>>
>> --
>> *Thanks & Regards*
>>
>>
>> *Unmesha Sreeveni U.B*
>> *Hadoop, Bigdata Developer*
>> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
>> http://www.unmeshasreeveni.blogspot.in/
>>
>>
>>
>
>
> --
> *Thanks & Regards *
>
>
> *Unmesha Sreeveni U.B*
> *Hadoop, Bigdata Developer*
> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> http://www.unmeshasreeveni.blogspot.in/
>
>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/


Re: Counters in MapReduce

Posted by unmesha sreeveni <un...@gmail.com>.
I tried setting an enum to count the number of lines in the output file
from job3.

But I am getting:
14/06/12 10:12:30 INFO mapred.JobClient:     Total committed heap usage
(bytes)=1238630400
conf3
Exception in thread "main" java.lang.IllegalStateException: Job in state
DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
 at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)


Below is my current code

static enum UpdateCounter {
    INCOMING_ATTR
}

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    int res = ToolRunner.run(conf, new Driver(), args);
    System.exit(res);
}

@Override
public int run(String[] args) throws Exception {
  while (counter >= 0) {
    Configuration conf = getConf();
    /*
     * Job 1
     */
    Job job1 = new Job(conf, "");
    // other configuration
    job1.setMapperClass(ID3ClsLabelMapper.class);
    job1.setReducerClass(ID3ClsLabelReducer.class);
    Path in = new Path(args[0]);
    Path out1 = new Path(CL);
    if (counter == 0) {
        FileInputFormat.addInputPath(job1, in);
    } else {
        FileInputFormat.addInputPath(job1, out5);
    }
    FileInputFormat.addInputPath(job1, in);
    FileOutputFormat.setOutputPath(job1, out1);
    job1.waitForCompletion(true);
    /*
     * Job 2
     */
    Configuration conf2 = getConf();
    Job job2 = new Job(conf2, "");
    Path out2 = new Path(ANC);
    FileInputFormat.addInputPath(job2, in);
    FileOutputFormat.setOutputPath(job2, out2);
    job2.waitForCompletion(true);
    /*
     * Job 3
     */
    Configuration conf3 = getConf();
    Job job3 = new Job(conf3, "");
    System.out.println("conf3");
    Path out5 = new Path(args[1]);
    if (fs.exists(out5)) {
        fs.delete(out5, true);
    }
    FileInputFormat.addInputPath(job3, out2);
    FileOutputFormat.setOutputPath(job3, out5);
    job3.waitForCompletion(true);
    FileInputFormat.addInputPath(job3, new Path(args[0]));
    FileOutputFormat.setOutputPath(job3, out5);
    job3.waitForCompletion(true);
    counter = job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();
  }
  return 0;
}

 Am I doing anything wrong?
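As far as I can tell, the exception comes from Hadoop's ensureState guard: in this API, Job.getCounters() requires the job to have been submitted, and a Job instance that is only configured (or reconfigured after running) is still in state DEFINE. A toy model of that guard (not Hadoop's actual class, just the shape of the check):

```java
public class JobStateModel {
    enum State { DEFINE, RUNNING }

    // A toy Job with the same state check as the one behind
    // org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116).
    static class MiniJob {
        State state = State.DEFINE; // every new Job starts in DEFINE
        long counterValue;

        void waitForCompletion() {
            state = State.RUNNING; // submission is what leaves DEFINE
            counterValue = 1;      // counters exist only after submission
        }

        long getCounter() {
            if (state != State.RUNNING) {
                throw new IllegalStateException(
                    "Job in state " + state + " instead of RUNNING");
            }
            return counterValue;
        }
    }
}
```

So the fix would be to drop the duplicated addInputPath/setOutputPath/waitForCompletion block on job3, build one fresh Job per iteration, call waitForCompletion(true) exactly once, and only then read the counter.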


On Mon, Jun 9, 2014 at 4:37 PM, Krishna Kumar <kk...@nanigans.com> wrote:

> You should use FileStatus to decide which files you want to include in the
> InputPath, and use the FileSystem class to delete or process the
> intermediate/final paths. Moving each job in your iteration logic into a
> separate method would help keep things simple.
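(A sketch of that FileStatus-style emptiness check, shown against the local filesystem; on HDFS the analogous code sums FileStatus.getLen() over fs.listStatus(dir), skipping markers like _SUCCESS. The names here are mine, not from the original code.)

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class InputCheck {
    // Total bytes of real data under an output directory. A driver can
    // stop iterating once this reaches zero, i.e. the next input is empty.
    static long dataBytes(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return 0;
        }
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        // skip bookkeeping files such as _SUCCESS
                        .filter(p -> !p.getFileName().toString().startsWith("_"))
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }
}
```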
>
>
>
> From: unmesha sreeveni <un...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Monday, June 9, 2014 at 6:02 AM
> To: User Hadoop <us...@hadoop.apache.org>
> Subject: Re: Counters in MapReduce
>
> OK, I will check out counters. After the 1st iteration, the input file to
> job1 will be the output file of job3. How do I give that, in order to
> satisfy two conditions:
> First iteration: the user's input file.
> After the first iteration: job3's output file as job1's input.
>
>
>
>> --
>> *Thanks & Regards*
>>
>>
>> *Unmesha Sreeveni U.B*
>> *Hadoop, Bigdata Developer*
>> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
>> http://www.unmeshasreeveni.blogspot.in/
>>
>>
>>
>>   ------------------------------
>> *Kai Voigt* Am Germaniahafen 1 k@123.org
>>  24143 Kiel +49 160 96683050
>>  Germany @KaiVoigt
>>
>>
>
>
> --
> *Thanks & Regards*
>
>
> *Unmesha Sreeveni U.B*
> *Hadoop, Bigdata Developer*
> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> http://www.unmeshasreeveni.blogspot.in/
>
>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/

Re: Counters in MapReduce

Posted by unmesha sreeveni <un...@gmail.com>.
I tried out by setting an enum to count no. of lines in output file from
job3.

But I am getting
14/06/12 10:12:30 INFO mapred.JobClient:     Total committed heap usage
(bytes)=1238630400
conf3
Exception in thread "main" java.lang.IllegalStateException: Job in state
DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
 at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)


Below is my current code

*static enum UpdateCounter {*
*        INCOMING_ATTR*
*    }*

*public static void main(String[] args) throws Exception {*
*    Configuration conf = new Configuration();*
*    int res = ToolRunner.run(conf, new Driver(), args);*
*    System.exit(res);*
*}*


*@Override*
*public int run(String[] args) throws Exception {*
*while(counter >= 0){*

*      Configuration conf = getConf();*
*     /**
*     * Job 1: *
*     */*
*     Job job1 = new Job(conf, "");*
*     //other configuration*
*     job1.setMapperClass(ID3ClsLabelMapper.class);*
*     job1.setReducerClass(ID3ClsLabelReducer.class);*
*     Path in = new Path(args[0]);*
*     Path out1 = new Path(CL);*
*     if(counter == 0){*
*            FileInputFormat.addInputPath(job1, in);*
*     }*
*     else{*
*            FileInputFormat.addInputPath(job1, out5);   *
*     }*
*     FileInputFormat.addInputPath(job1, in);*
*     FileOutputFormat.setOutputPath(job1,out1);*
*     job1.waitForCompletion(true);*
*    /**
*     * Job 2: *
*     *  *
*     */*
*    Configuration conf2 = getConf();*
*    Job job2 = new Job(conf2, "");*
*    Path out2 = new Path(ANC);*
*    FileInputFormat.addInputPath(job2, in);*
*    FileOutputFormat.setOutputPath(job2,out2);*
*   job2.waitForCompletion(true);*

 *   /**
*     * Job3*
*    */*
*    Configuration conf3 = getConf();*
*    Job job3 = new Job(conf3, "");*
*    System.out.println("conf3");*
*    Path out5 = new Path(args[1]);*
*    if(fs.exists(out5)){*
*        fs.delete(out5, true);*
*    }*
*    FileInputFormat.addInputPath(job3,out2);*
*    FileOutputFormat.setOutputPath(job3,out5);*
*    job3.waitForCompletion(true);*
*    FileInputFormat.addInputPath(job3,new Path(args[0]));*
*    FileOutputFormat.setOutputPath(job3,out5);*
*    job3.waitForCompletion(true);*
*    counter =
job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();*
*  }*
* return 0;*

 Am I doing anything wrong?


On Mon, Jun 9, 2014 at 4:37 PM, Krishna Kumar <kk...@nanigans.com> wrote:

> You should use FileStatus to  decide what files you want to include in the
> InputPath, and use the FileSystem class to delete or process the
> intermediate / final paths. Moving each job in your iteration logic into
> different methods would help keep things simple.
>
>
>
> From: unmesha sreeveni <un...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Monday, June 9, 2014 at 6:02 AM
> To: User Hadoop <us...@hadoop.apache.org>
> Subject: Re: Counters in MapReduce
>
> Ok I will check out with counters.
> And after I st iteration the input file to job1 will be the output file of
> job 3.How to give that.
> *Inorder to satisfy 2 conditions*
> First iteration : users input file
> after first iteration :job 3 's output file as job 1 s input.
>
>
>
>> --
>> *Thanks & Regards*
>>
>>
>> *Unmesha Sreeveni U.B*
>> *Hadoop, Bigdata Developer*
>> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
>> http://www.unmeshasreeveni.blogspot.in/
>>
>>
>>
>>   ------------------------------
>> *Kai Voigt* Am Germaniahafen 1 k@123.org
>>  24143 Kiel +49 160 96683050
>>  Germany @KaiVoigt
>>
>>
>
>
> --
> *Thanks & Regards*
>
>
> *Unmesha Sreeveni U.B*
> *Hadoop, Bigdata Developer*
> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> http://www.unmeshasreeveni.blogspot.in/
>
>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/

Re: Counters in MapReduce

Posted by unmesha sreeveni <un...@gmail.com>.
I tried out by setting an enum to count no. of lines in output file from
job3.

But I am getting
14/06/12 10:12:30 INFO mapred.JobClient:     Total committed heap usage
(bytes)=1238630400
conf3
Exception in thread "main" java.lang.IllegalStateException: Job in state
DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
 at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)


Below is my current code

*static enum UpdateCounter {*
*        INCOMING_ATTR*
*    }*

*public static void main(String[] args) throws Exception {*
*    Configuration conf = new Configuration();*
*    int res = ToolRunner.run(conf, new Driver(), args);*
*    System.exit(res);*
*}*


*@Override*
*public int run(String[] args) throws Exception {*
*while(counter >= 0){*

*      Configuration conf = getConf();*
*     /**
*     * Job 1: *
*     */*
*     Job job1 = new Job(conf, "");*
*     //other configuration*
*     job1.setMapperClass(ID3ClsLabelMapper.class);*
*     job1.setReducerClass(ID3ClsLabelReducer.class);*
*     Path in = new Path(args[0]);*
*     Path out1 = new Path(CL);*
*     if(counter == 0){*
*            FileInputFormat.addInputPath(job1, in);*
*     }*
*     else{*
*            FileInputFormat.addInputPath(job1, out5);   *
*     }*
*     FileInputFormat.addInputPath(job1, in);*
*     FileOutputFormat.setOutputPath(job1,out1);*
*     job1.waitForCompletion(true);*
*    /**
*     * Job 2: *
*     *  *
*     */*
*    Configuration conf2 = getConf();*
*    Job job2 = new Job(conf2, "");*
*    Path out2 = new Path(ANC);*
*    FileInputFormat.addInputPath(job2, in);*
*    FileOutputFormat.setOutputPath(job2,out2);*
*   job2.waitForCompletion(true);*

 *   /**
*     * Job3*
*    */*
*    Configuration conf3 = getConf();*
*    Job job3 = new Job(conf3, "");*
*    System.out.println("conf3");*
*    Path out5 = new Path(args[1]);*
*    FileSystem fs = FileSystem.get(conf3);*
*    if(fs.exists(out5)){*
*        fs.delete(out5, true);*
*    }*
*    FileInputFormat.addInputPath(job3,out2);*
*    FileOutputFormat.setOutputPath(job3,out5);*
*    job3.waitForCompletion(true);*
*    counter =
job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();*
*  }*
* return 0;*
*}*

 Am I doing anything wrong?
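(For reference, a user-defined counter such as INCOMING_ATTR only gets a value
if a task increments it. A minimal reducer sketch, assuming Text keys/values
and that the enum is declared in the Driver class; class and type names here
are illustrative, not from the original code:)

```java
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch only: key/value types are assumptions; the enum is the one
// declared in the Driver class above.
public class CountingReducer extends Reducer<Text, Text, Text, Text> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            context.write(key, value);
            // One increment per record written, so the driver can read
            // the total number of output lines after the job finishes.
            context.getCounter(Driver.UpdateCounter.INCOMING_ATTR).increment(1);
        }
    }
}
```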


On Mon, Jun 9, 2014 at 4:37 PM, Krishna Kumar <kk...@nanigans.com> wrote:

> You should use FileStatus to  decide what files you want to include in the
> InputPath, and use the FileSystem class to delete or process the
> intermediate / final paths. Moving each job in your iteration logic into
> different methods would help keep things simple.
>
>
>
> From: unmesha sreeveni <un...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Monday, June 9, 2014 at 6:02 AM
> To: User Hadoop <us...@hadoop.apache.org>
> Subject: Re: Counters in MapReduce
>
> Ok, I will check out counters.
> And after the 1st iteration the input file to job1 will be the output file
> of job 3. How to give that?
> *In order to satisfy 2 conditions:*
> First iteration: user's input file
> After the first iteration: job 3's output file as job 1's input.
>
>
>
>> --
>> *Thanks & Regards*
>>
>>
>> *Unmesha Sreeveni U.B*
>> *Hadoop, Bigdata Developer*
>> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
>> http://www.unmeshasreeveni.blogspot.in/
>>
>>
>>
>>   ------------------------------
>> *Kai Voigt* Am Germaniahafen 1 k@123.org
>>  24143 Kiel +49 160 96683050
>>  Germany @KaiVoigt
>>
>>
>
>
> --
> *Thanks & Regards*
>
>
> *Unmesha Sreeveni U.B*
> *Hadoop, Bigdata Developer*
> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> http://www.unmeshasreeveni.blogspot.in/
>
>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/


Re: Counters in MapReduce

Posted by Krishna Kumar <kk...@nanigans.com>.
You should use FileStatus to decide what files you want to include in the InputPath, and use the FileSystem class to delete or process the intermediate / final paths. Moving each job in your iteration logic into different methods would help keep things simple.
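A sketch of that approach (method and path names are assumptions): delete a
stale output directory before a job reuses it, and inspect the part files of a
finished job's output to decide whether another iteration is needed:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class IterationPaths {

    // Remove a stale output directory so the next job can recreate it.
    static void deleteIfExists(Configuration conf, Path out) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(out)) {
            fs.delete(out, true);   // recursive delete
        }
    }

    // True if every part-* file under the output directory has zero length.
    static boolean isOutputEmpty(Configuration conf, Path out) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(out)) {
            if (status.getPath().getName().startsWith("part-")
                    && status.getLen() > 0) {
                return false;
            }
        }
        return true;
    }
}
```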



From: unmesha sreeveni <un...@gmail.com>>
Reply-To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Date: Monday, June 9, 2014 at 6:02 AM
To: User Hadoop <us...@hadoop.apache.org>>
Subject: Re: Counters in MapReduce

Ok, I will check out counters.
And after the 1st iteration the input file to job1 will be the output file of job 3. How to give that?
In order to satisfy 2 conditions:
First iteration: user's input file
After the first iteration: job 3's output file as job 1's input.



--
Thanks & Regards

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/



________________________________
Kai Voigt Am Germaniahafen 1 k@123.org<ma...@123.org>
24143 Kiel +49 160 96683050
Germany @KaiVoigt




--
Thanks & Regards

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/




Re: Counters in MapReduce

Posted by unmesha sreeveni <un...@gmail.com>.
Ok, I will check out counters.
And after the 1st iteration the input file to job1 will be the output file of
job 3. How to give that?
*In order to satisfy 2 conditions:*
First iteration: user's input file
After the first iteration: job 3's output file as job 1's input.



> --
> *Thanks & Regards *
>
>
> *Unmesha Sreeveni U.B*
> *Hadoop, Bigdata Developer*
> *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> http://www.unmeshasreeveni.blogspot.in/
>
>
>
>    ------------------------------
> *Kai Voigt* Am Germaniahafen 1 k@123.org
>  24143 Kiel +49 160 96683050
>  Germany @KaiVoigt
>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/


Re: Counters in MapReduce

Posted by Kai Voigt <k...@123.org>.
Like you said, just wrap your 3 jobs in a while loop and check the built-in counters, such as the number of reduce output records, to see whether the job output was empty.
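A sketch of that loop in the driver (job setup elided; path names follow the
thread's setup, and TaskCounter.REDUCE_OUTPUT_RECORDS is the built-in counter
in the newer MapReduce API):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IterativeDriver {

    // Runs the three-job chain until job3 emits no records.
    // Mapper/reducer classes and intermediate paths are placeholders.
    public static int run(Configuration conf, String[] args) throws Exception {
        long produced = Long.MAX_VALUE;   // records written by job3's reducers
        int iteration = 0;

        while (produced > 0) {
            // After the first pass, job3's output (args[1]) feeds job1.
            Path in = (iteration == 0) ? new Path(args[0]) : new Path(args[1]);

            // ... configure and run job1 (input = in) and job2 as before ...

            Job job3 = new Job(conf, "job3");
            // ... job3 mapper/reducer/input setup ...
            FileOutputFormat.setOutputPath(job3, new Path(args[1]));
            if (!job3.waitForCompletion(true)) {
                return 1;                 // abort the loop on job failure
            }
            // Built-in counter: total records emitted by job3's reducers.
            produced = job3.getCounters()
                           .findCounter(TaskCounter.REDUCE_OUTPUT_RECORDS)
                           .getValue();
            iteration++;
        }
        return 0;
    }
}
```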

Unfortunately, Oozie cannot do iterations or loops of jobs, as it only supports DAGs.

Kai

Am 09.06.2014 um 10:31 schrieb unmesha sreeveni <un...@gmail.com>:

> I am trying to do iteration with map reduce. I have 3 sequence job running
> 
> //job1 configuration
> FileInputFormat.addInputPath(job1,new Path(args[0]));
> FileOutputFormat.setOutputPath(job1,out1);
> job1.waitForCompletion(true);
> 
> job2 configuration
> FileInputFormat.addInputPath(job2,out1);
> FileOutputFormat.setOutputPath(job2,out2);
> job2.waitForCompletion(true);
> 
> job3 configuration
> FileInputFormat.addInputPath(job3,out2);
> FileOutputFormat.setOutputPath(job3,new Path(args[1]));
> boolean success = job3.waitForCompletion(true);
> return(success ? 0 : 1);
> 
> After job3 I should continue an iteration - job 3 's output should be the input for job1. And the iteration should continue until the input file is empty. How to accomplish this.
> 
> Will counters do the work. 
> ​
> 
> -- 
> Thanks & Regards
> 
> Unmesha Sreeveni U.B
> Hadoop, Bigdata Developer
> Center for Cyber Security | Amrita Vishwa Vidyapeetham
> http://www.unmeshasreeveni.blogspot.in/
> 
> 

Kai Voigt			Am Germaniahafen 1			k@123.org
					24143 Kiel					+49 160 96683050
					Germany						@KaiVoigt

