Posted to hdfs-user@hadoop.apache.org by paco <pa...@gmail.com> on 2015/10/04 10:38:43 UTC
Combiner and KeyComposite
I am doing a secondary sort in Hadoop 2.6.0, following this tutorial:
https://vangjee.wordpress.com/2012/03/20/secondary-sorting-aka-sorting-values-in-hadoops-mapreduce-programming-paradigm/
I have exactly the same code, but I am now trying to improve performance,
so I have decided to add a combiner. I have made two modifications:
Main file:
    job.setCombinerClass(CombinerK.class);
Combiner file:
    public class CombinerK extends Reducer<KeyWritable, KeyWritable, KeyWritable, KeyWritable> {
        public void reduce(KeyWritable key, Iterator<KeyWritable> values, Context context) throws IOException, InterruptedException {
            Iterator<KeyWritable> it = values;
            System.err.println("combiner " + key);
            KeyWritable first_value = it.next();
            System.err.println("va: " + first_value);
            while (it.hasNext()) {
                sum += it.next().getSs();
            }
            first_value.setS(sum);
            context.write(key, first_value);
        }
    }
But the combiner does not seem to run, because I can't find any log file
that contains the word "combiner". When I looked at the counters after the
run, I saw:
    Combine input records=4040000
    Combine output records=4040000
So the combiner does appear to be executed, but it seems to be called once
for each key, which would explain why the number of input records equals
the number of output records.
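One possible explanation for both symptoms (no "combiner" line in the logs, and equal input and output counts) is the method signature. In the org.apache.hadoop.mapreduce API, Reducer.reduce takes an Iterable of values, not an Iterator, and the default implementation writes every value through unchanged. A reduce method declared with Iterator is therefore an overload that the framework never calls. The sketch below reproduces this without Hadoop: MiniReducer and the two combiner classes are hypothetical stand-ins invented for illustration, with String keys and Integer values in place of KeyWritable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CombinerOverrideDemo {

    // Hypothetical stand-in that mimics the shape of Hadoop's Reducer:
    // the "framework" always calls reduce(key, Iterable, out), and the
    // default implementation passes every value through unchanged.
    public static class MiniReducer {
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            for (Integer v : values) {
                out.add(key + "=" + v);
            }
        }

        // Plays the role of the framework: it only knows the Iterable version.
        public final List<String> run(String key, List<Integer> values) {
            List<String> out = new ArrayList<>();
            reduce(key, values, out);
            return out;
        }
    }

    // Same mistake as in the post: the parameter is an Iterator, so this
    // method is an overload, not an override, and run() never reaches it.
    public static class IteratorCombiner extends MiniReducer {
        protected void reduce(String key, Iterator<Integer> values, List<String> out) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            out.add(key + "=" + sum);
        }
    }

    // With Iterable and @Override the compiler verifies that the method
    // really replaces the default pass-through behaviour.
    public static class IterableCombiner extends MiniReducer {
        @Override
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            out.add(key + "=" + sum);
        }
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        // Three records in, three records out: the default reduce ran.
        System.out.println(new IteratorCombiner().run("k", values)); // [k=1, k=2, k=3]
        // Three records in, one record out: the override ran.
        System.out.println(new IterableCombiner().run("k", values)); // [k=6]
    }
}
```

If this is what is happening in CombinerK, changing the parameter type to Iterable<KeyWritable> and adding @Override should let the compiler confirm it. Separately, with a composite key the combiner may still see each distinct composite key as its own group, so the record counts could stay close even after the fix.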
Re: Combiner and KeyComposite
Posted by paco <pa...@gmail.com>.
Yes, I only have a single node. I am checking under
/usr/local/hadoop/logs/userlogs/*
To make sure I am looking in the right place, I ran another example (which
does not use KeyComposite) and I can find the word "combiner" in my log files.
Thanks.
On 04/10/15 10:42, ☼ R Nair (रविशंकर नायर) wrote:
>
> Are you checking the logs in the correct place?
Re: Combiner and KeyComposite
Posted by "☼ R Nair (रविशंकर नायर)" <ra...@gmail.com>.
Are you checking the logs in the correct place?