Posted to common-user@hadoop.apache.org by Lukas Vlcek <lu...@gmail.com> on 2008/02/19 22:52:54 UTC
FileOutputFormat which does not write key value?
Hi,
I don't care about the key in the output file. Is there any way I can
suppress the key in the output?
Is there a way to tell (Text)OutputFormat to write only the value and not
the key? Or can I pass my own implementation of RecordWriter into
FileOutputFormat?
Regards,
Lukas
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Ted Dunning <td...@veoh.com>.
Re-reading the thread convinces me that this is a difference between
TextOutputFormat and other output formats.
On 2/19/08 6:01 PM, "Andy Li" <an...@gmail.com> wrote:
> Shouldn't the official way to do this is to implement your own RecordWriter
> and implement the
> OutputFormatClass.
>
> conf.setOutputFormat(yourClass);
>
> Inside the yourClass, you can return your own RecordWriter class in the
> getRecordWriter method.
>
> I did it on the FileInputFormat with my own RecordReader and it worked for
> me
> to take KEY and null VALUE into the Mapper. I believe it is the same thing
> vice versa.
>
> But there should be a formal way instead of try-and-error to see what the
> system default
> is. I guess the system does not have a standard spec to define what is the
> default values?
> Maybe this is why Ted has such concern of incompatible in 0.16.*?
>
> -Andy
>
> On Feb 19, 2008 3:02 PM, Lukas Vlcek <lu...@gmail.com> wrote:
>
>> Hmmm...
>>
>> May be I should rather go to bet (it is just midnight in my part of the
>> world...) but I think I did what you are saying:
>>
>> Configuration:
>> conf.setOutputKeyClass(NullWritable.class);
>> conf.setOutputValueClass(Text.class);
>>
>> And the reducer:
>> public class PermutationReduce extends MapReduceBase implements
>> Reducer<Text, Text, NullWritable, Text> {
>>
>> public void reduce(Text key, Iterator<Text> values,
>> OutputCollector<NullWritable, Text> output, Reporter reporter) throws
>> IOException {
>> while (values.hasNext()) {
>> output.collect(NullWritable.get(), values.next());
>> }
>>
>> }
>> }
>>
>> Regards,
>> Lukas
>>
>> On 2/19/08, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>>>
>>>
>>> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>>>
>>>> Hi,
>>>>
>>>> I don't care about key value in the output file. Is there any way
>>>> how I can
>>>> suppress key in the output?
>>>> Is there a way how to tell (Text)OutputFormat not to write key but
>>>> value
>>>> only? Or can I pass my own implementation of RecordWriter into
>>>> FileOutputFormat?
>>>
>>> The easiest way is to put either null or a NullWritable in for the
>>> key coming out of the reduce. The TextOutputFormat will drop the tab
>>> character. You can also define your own OutputFormat and encode them
>>> as you wish.
>>>
>>> -- Owen
>>>
>>
>>
>>
>> --
>> http://blog.lukas-vlcek.com/
>>
Re: FileOutputFormat which does not write key value?
Posted by Andy Li <an...@gmail.com>.
Shouldn't the official way to do this be to implement your own RecordWriter
and your own OutputFormat class?
conf.setOutputFormat(yourClass);
Inside yourClass, you can return your own RecordWriter from the
getRecordWriter method.
I did the same with FileInputFormat and my own RecordReader, and it worked
for me to pass a KEY and a null VALUE into the Mapper. I believe the same
approach works the other way around.
But there should be a formal way, rather than trial and error, to find out
what the system default is. I guess there is no standard spec that defines
the default values?
Maybe this is why Ted is concerned about an incompatibility in 0.16.*?
-Andy
On Feb 19, 2008 3:02 PM, Lukas Vlcek <lu...@gmail.com> wrote:
> Hmmm...
>
> May be I should rather go to bet (it is just midnight in my part of the
> world...) but I think I did what you are saying:
>
> Configuration:
> conf.setOutputKeyClass(NullWritable.class);
> conf.setOutputValueClass(Text.class);
>
> And the reducer:
> public class PermutationReduce extends MapReduceBase implements
> Reducer<Text, Text, NullWritable, Text> {
>
> public void reduce(Text key, Iterator<Text> values,
> OutputCollector<NullWritable, Text> output, Reporter reporter) throws
> IOException {
> while (values.hasNext()) {
> output.collect(NullWritable.get(), values.next());
> }
>
> }
> }
>
> Regards,
> Lukas
>
> On 2/19/08, Owen O'Malley <oo...@yahoo-inc.com> wrote:
> >
> >
> > On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
> >
> > > Hi,
> > >
> > > I don't care about key value in the output file. Is there any way
> > > how I can
> > > suppress key in the output?
> > > Is there a way how to tell (Text)OutputFormat not to write key but
> > > value
> > > only? Or can I pass my own implementation of RecordWriter into
> > > FileOutputFormat?
> >
> > The easiest way is to put either null or a NullWritable in for the
> > key coming out of the reduce. The TextOutputFormat will drop the tab
> > character. You can also define your own OutputFormat and encode them
> > as you wish.
> >
> > -- Owen
> >
>
>
>
> --
> http://blog.lukas-vlcek.com/
>
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Hmmm...
Maybe I should rather go to bed (it is just midnight in my part of the
world...) but I think I did what you are saying:
Configuration:
conf.setOutputKeyClass(NullWritable.class);
conf.setOutputValueClass(Text.class);
And the reducer:
public class PermutationReduce extends MapReduceBase implements
    Reducer<Text, Text, NullWritable, Text> {
  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<NullWritable, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      output.collect(NullWritable.get(), values.next());
    }
  }
}
Regards,
Lukas
On 2/19/08, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
>
> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>
> > Hi,
> >
> > I don't care about key value in the output file. Is there any way
> > how I can
> > suppress key in the output?
> > Is there a way how to tell (Text)OutputFormat not to write key but
> > value
> > only? Or can I pass my own implementation of RecordWriter into
> > FileOutputFormat?
>
> The easiest way is to put either null or a NullWritable in for the
> key coming out of the reduce. The TextOutputFormat will drop the tab
> character. You can also define your own OutputFormat and encode them
> as you wish.
>
> -- Owen
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Owen,
I would like to fix this but since I am new to Hadoop I am not sure how I
can get an instance of Configuration within the newKey() method in the
WritableComparator class.
The method newKey() is as follows:
public WritableComparable newKey() {
  try {
    return (WritableComparable)keyClass.newInstance(); // <- line #73
  } catch (InstantiationException e) {
    throw new RuntimeException(e);
  } catch (IllegalAccessException e) {
    throw new RuntimeException(e);
  }
}
I would like to use ReflectionUtils according to your suggestion:
public WritableComparable newKey() {
  try {
    // <- changed line #73
    return (WritableComparable)ReflectionUtils.newInstance(keyClass, null);
  } catch (InstantiationException e) {
    throw new RuntimeException(e);
  } catch (IllegalAccessException e) {
    throw new RuntimeException(e);
  }
}
The second argument should be Configuration but looking into ReflectionUtils
implementation it should also work with null (should not directly throw any
exception). But I am not sure if it is recommended.
Anyway, do you want me to create a new JIRA ticket?
Regards,
Lukas
On Wed, Feb 20, 2008 at 5:33 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
> On Feb 20, 2008, at 6:23 AM, Lukas Vlcek wrote:
>
> > Hi,
> >
> > WritableComparator, line# 73 (trunk version) is using Class'
> > newInstance()
> > method which can not work for singletons like NullWritable.
>
> *SIgh* I thought I had removed all of those direct calls to
> Class.newInstance. It really should be using
> ReflectionUtils.newInstance, which would work. NullWritable is mostly
> a singleton, but it isn't really required to be a singleton.
>
> > Should this be changed to:
> >
> > /** Construct a new {@link WritableComparable} instance. */
> > public WritableComparable newKey() {
> > try {
> > if (keyClass instanceof NullWritable) return NullWritable.get
> > (); //
> > <--- this is new line
> > return (WritableComparable)keyClass.newInstance();
> > } catch (InstantiationException e) {
> > throw new RuntimeException(e);
> > } catch (IllegalAccessException e) {
> > throw new RuntimeException(e);
> > }
> > }
>
> I think you mean:
>
> if (keyClass == NullWritable.class) {
> return NullWritable.get();
> }
>
> and in reality, it would be better to put that fix in
> ReflectionUtils.newInstance and just call it from here.
>
> Thanks,
> Owen
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Feb 20, 2008, at 6:23 AM, Lukas Vlcek wrote:
> Hi,
>
> WritableComparator, line# 73 (trunk version) is using Class'
> newInstance()
> method which can not work for singletons like NullWritable.
*SIgh* I thought I had removed all of those direct calls to
Class.newInstance. It really should be using
ReflectionUtils.newInstance, which would work. NullWritable is mostly
a singleton, but it isn't really required to be a singleton.
> Should this be changed to:
>
> /** Construct a new {@link WritableComparable} instance. */
> public WritableComparable newKey() {
> try {
> if (keyClass instanceof NullWritable) return NullWritable.get
> (); //
> <--- this is new line
> return (WritableComparable)keyClass.newInstance();
> } catch (InstantiationException e) {
> throw new RuntimeException(e);
> } catch (IllegalAccessException e) {
> throw new RuntimeException(e);
> }
> }
I think you mean:
if (keyClass == NullWritable.class) {
return NullWritable.get();
}
and in reality, it would be better to put that fix in
ReflectionUtils.newInstance and just call it from here.
Thanks,
Owen
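For readers following along, here is a minimal, Hadoop-free sketch of the difference being discussed: Class.newInstance() respects the private constructor and fails, while looking up the declared constructor and calling setAccessible(true) on it succeeds. SingletonKey and NewInstanceSketch are hypothetical names made up for illustration (SingletonKey only stands in for a singleton such as NullWritable), and the helper is an assumption about the general technique, not a copy of ReflectionUtils.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for a singleton with a private constructor.
final class SingletonKey {
    private static final SingletonKey INSTANCE = new SingletonKey();

    private SingletonKey() {}

    static SingletonKey get() {
        return INSTANCE;
    }
}

final class NewInstanceSketch {
    // Mirrors the failing call from the thread: Class.newInstance() honors
    // access checks, so a private constructor is rejected.
    @SuppressWarnings("deprecation")
    static boolean plainNewInstanceFails(Class<?> cls) {
        try {
            cls.newInstance();
            return false;
        } catch (IllegalAccessException | InstantiationException e) {
            return true;
        }
    }

    // Looks up the no-arg constructor and lifts the access check before
    // invoking it; failures are rethrown as unchecked exceptions.
    static <T> T newInstance(Class<T> cls) {
        try {
            Constructor<T> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain newInstance fails: "
            + plainNewInstanceFails(SingletonKey.class));
        SingletonKey fresh = newInstance(SingletonKey.class);
        System.out.println("got a second instance: "
            + (fresh != SingletonKey.get()));
    }
}
```

Note that the reflective path creates a second object, which fits the remark that the class is only "mostly" a singleton. Special-casing the class and returning the shared instance would avoid that.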
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Hi,
WritableComparator, line# 73 (trunk version) is using Class' newInstance()
method which can not work for singletons like NullWritable.
/** Construct a new {@link WritableComparable} instance. */
public WritableComparable newKey() {
  try {
    return (WritableComparable)keyClass.newInstance();
  } catch (InstantiationException e) {
    throw new RuntimeException(e);
  } catch (IllegalAccessException e) {
    throw new RuntimeException(e);
  }
}
Should this be changed to:
/** Construct a new {@link WritableComparable} instance. */
public WritableComparable newKey() {
  try {
    if (keyClass instanceof NullWritable) return NullWritable.get(); // <--- this is new line
    return (WritableComparable)keyClass.newInstance();
  } catch (InstantiationException e) {
    throw new RuntimeException(e);
  } catch (IllegalAccessException e) {
    throw new RuntimeException(e);
  }
}
I haven't had a chance to check whether all WritableComparable classes
have a public (or implicit) constructor, but I am facing an issue with
NullWritable used as a key in the reducer, and I believe it is caused by
newInstance() being called on a singleton.
Regards,
Lukas
On Feb 20, 2008 2:56 PM, Lukas Vlcek <lu...@gmail.com> wrote:
> Owen,
>
> This is still not clear to me. I see the following code in
> TextOutputFormat class:
>
> ...
> public synchronized void write(K key, V value)
> throws IOException {
>
> boolean nullKey = key == null || key instanceof NullWritable;
> boolean nullValue = value == null || value instanceof NullWritable;
> if (nullKey && nullValue) {
> return;
> }
> if (!nullKey) {
> writeObject(key);
> }
> if (!(nullKey || nullValue)) {
> out.write(tab);
> }
> if (!nullValue) {
> writeObject(value);
> }
> out.write(newline);
> }
>
> This seems to me as if the tab is always output if the value is not null
> regardless key being null or not.
> Am I missing something?
>
> Lukas
>
>
> On Feb 19, 2008 11:55 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
> >
> > On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
> >
> > > Hi,
> > >
> > > I don't care about key value in the output file. Is there any way
> > > how I can
> > > suppress key in the output?
> > > Is there a way how to tell (Text)OutputFormat not to write key but
> > > value
> > > only? Or can I pass my own implementation of RecordWriter into
> > > FileOutputFormat?
> >
> > The easiest way is to put either null or a NullWritable in for the
> > key coming out of the reduce. The TextOutputFormat will drop the tab
> > character. You can also define your own OutputFormat and encode them
> > as you wish.
> >
> > -- Owen
> >
>
>
>
> --
> http://blog.lukas-vlcek.com/
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Owen,
you are correct. The boolean logic can be tricky :-)
Lukas
On Wed, Feb 20, 2008 at 8:47 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
> On Feb 20, 2008, at 5:56 AM, Lukas Vlcek wrote:
>
> > Owen,
> >
> > This is still not clear to me. I see the following code in
> > TextOutputFormat
> > class:
> >
> > ...
> > public synchronized void write(K key, V value)
> > throws IOException {
> >
> > boolean nullKey = key == null || key instanceof NullWritable;
> > boolean nullValue = value == null || value instanceof
> > NullWritable;
> > if (nullKey && nullValue) {
> > return;
> > }
> > if (!nullKey) {
> > writeObject(key);
> > }
> > if (!(nullKey || nullValue)) {
> > out.write(tab);
> > }
> > if (!nullValue) {
> > writeObject(value);
> > }
> > out.write(newline);
> > }
> >
> > This seems to me as if the tab is always output if the value is not
> > null
> > regardless key being null or not.
> > Am I missing something?
>
> The condition !(nullKey || nullValue) is true only if the key AND
> value are non-null.
>
> -- Owen
>
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Feb 20, 2008, at 5:56 AM, Lukas Vlcek wrote:
> Owen,
>
> This is still not clear to me. I see the following code in
> TextOutputFormat
> class:
>
> ...
> public synchronized void write(K key, V value)
> throws IOException {
>
> boolean nullKey = key == null || key instanceof NullWritable;
> boolean nullValue = value == null || value instanceof
> NullWritable;
> if (nullKey && nullValue) {
> return;
> }
> if (!nullKey) {
> writeObject(key);
> }
> if (!(nullKey || nullValue)) {
> out.write(tab);
> }
> if (!nullValue) {
> writeObject(value);
> }
> out.write(newline);
> }
>
> This seems to me as if the tab is always output if the value is not
> null
> regardless key being null or not.
> Am I missing something?
The condition !(nullKey || nullValue) is true only if the key AND
value are non-null.
-- Owen
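One way to convince yourself of this is to enumerate the four null/non-null combinations. The sketch below is a simplified, string-based rendition of the quoted write() logic, not the Hadoop class itself. TabLogicSketch and format are hypothetical names, and a Java null stands in for both null and NullWritable.

```java
final class TabLogicSketch {
    // Returns what one call to write(key, value) would append to the output.
    static String format(String key, String value) {
        boolean nullKey = key == null;
        boolean nullValue = value == null;
        if (nullKey && nullValue) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        if (!nullKey) {
            out.append(key);
        }
        // By De Morgan's law this is (!nullKey && !nullValue): the tab is
        // written only when both the key and the value are present.
        if (!(nullKey || nullValue)) {
            out.append('\t');
        }
        if (!nullValue) {
            out.append(value);
        }
        out.append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        String[][] cases = {{"k", "v"}, {null, "v"}, {"k", null}, {null, null}};
        for (String[] c : cases) {
            String line = format(c[0], c[1]).replace("\t", "<TAB>").replace("\n", "<NL>");
            System.out.println("key=" + c[0] + " value=" + c[1] + " -> " + line);
        }
    }
}
```

With a null key and a non-null value, only the value and the newline are written, so there is no leading tab.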
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Owen,
This is still not clear to me. I see the following code in TextOutputFormat
class:
...
public synchronized void write(K key, V value)
    throws IOException {
  boolean nullKey = key == null || key instanceof NullWritable;
  boolean nullValue = value == null || value instanceof NullWritable;
  if (nullKey && nullValue) {
    return;
  }
  if (!nullKey) {
    writeObject(key);
  }
  if (!(nullKey || nullValue)) {
    out.write(tab);
  }
  if (!nullValue) {
    writeObject(value);
  }
  out.write(newline);
}
This seems to me as if the tab is always output if the value is not null
regardless key being null or not.
Am I missing something?
Lukas
On Feb 19, 2008 11:55 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>
> > Hi,
> >
> > I don't care about key value in the output file. Is there any way
> > how I can
> > suppress key in the output?
> > Is there a way how to tell (Text)OutputFormat not to write key but
> > value
> > only? Or can I pass my own implementation of RecordWriter into
> > FileOutputFormat?
>
> The easiest way is to put either null or a NullWritable in for the
> key coming out of the reduce. The TextOutputFormat will drop the tab
> character. You can also define your own OutputFormat and encode them
> as you wish.
>
> -- Owen
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
> Hi,
>
> I don't care about key value in the output file. Is there any way
> how I can
> suppress key in the output?
> Is there a way how to tell (Text)OutputFormat not to write key but
> value
> only? Or can I pass my own implementation of RecordWriter into
> FileOutputFormat?
The easiest way is to put either null or a NullWritable in for the
key coming out of the reduce. The TextOutputFormat will drop the tab
character. You can also define your own OutputFormat and encode them
as you wish.
-- Owen
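As a rough illustration of the second option (your own writer that encodes records as you wish), here is a self-contained sketch of a value-only writer. It does not use the Hadoop API: SimpleRecordWriter is a hypothetical, simplified stand-in for the shape of a record writer, and ValueOnlyWriter and ValueOnlyDemo are likewise made-up names. In a real job the equivalent class would be returned from your OutputFormat's getRecordWriter method.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical, simplified stand-in for a record writer interface.
interface SimpleRecordWriter<K, V> {
    void write(K key, V value) throws IOException;

    void close() throws IOException;
}

// Ignores the key entirely and writes one value per line.
final class ValueOnlyWriter<K, V> implements SimpleRecordWriter<K, V> {
    private final Writer out;

    ValueOnlyWriter(Writer out) {
        this.out = out;
    }

    @Override
    public void write(K key, V value) throws IOException {
        if (value == null) {
            return; // nothing to emit for this record
        }
        out.write(value.toString());
        out.write('\n');
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}

final class ValueOnlyDemo {
    // Feeds {key, value} pairs through the writer and returns the text produced.
    static String render(String[][] records) {
        StringWriter sink = new StringWriter();
        SimpleRecordWriter<String, String> writer = new ValueOnlyWriter<>(sink);
        try {
            for (String[] record : records) {
                writer.write(record[0], record[1]);
            }
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(new String[][] {{"k1", "a"}, {"k2", "b"}}));
    }
}
```

The advantage of this route is that the reducer can keep emitting its real keys; the decision to drop them lives in the output layer only.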
Re: FileOutputFormat which does not write key value?
Posted by Ted Dunning <td...@veoh.com>.
Actually, I DID mean for you to pass a null.
And you have provided me a warning about what might break in 16.* when I get
there.
On 2/19/08 2:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
> I think you didn't mean that I should directly pass a null into a key (this
> is what I did in my example code). I have just found that there is
> NullWritable class in hadoop.io package but still I can not make it work
> correctly.
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Ted,
I think you didn't mean that I should directly pass a null into a key (this
is what I did in my example code). I have just found that there is
NullWritable class in hadoop.io package but still I can not make it work
correctly. I am getting the following exception:
java.lang.RuntimeException: java.lang.IllegalAccessException: Class org.apache.hadoop.io.WritableComparator can not access a member of class org.apache.hadoop.io.NullWritable with modifiers "private"
    at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:77)
    at org.apache.hadoop.io.WritableComparator.<init>(WritableComparator.java:63)
    at org.apache.hadoop.io.WritableComparator.get(WritableComparator.java:42)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:642)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:313)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:174)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)
Caused by: java.lang.IllegalAccessException: Class org.apache.hadoop.io.WritableComparator can not access a member of class org.apache.hadoop.io.NullWritable with modifiers "private"
    at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)
    at java.lang.Class.newInstance0(Class.java:349)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:73)
    ... 6 more
Is there any test of NullWritable in Hadoop unit test suite?
Lukas
On Feb 19, 2008 11:35 PM, Ted Dunning <td...@veoh.com> wrote:
>
>
> I use 15.1 and it does work there. Pity if we lost that capability.
> Having
> to take a structure apart and put together a new one just to move one
> field
> out is a real pain and significantly increases garbage allocations.
>
>
> On 2/19/08 2:08 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
>
> > Hi,
> >
> > Either I am doing something wrong or this does not work (I am using
> 0.16.0):
> >
> > My class:
> >
> > public class PermutationReduce extends MapReduceBase implements
> > Reducer<Text, Text, Text, Text> {
> >
> > public void reduce(Text key, Iterator<Text> values,
> > OutputCollector<Text, Text> output, Reporter reporter) throws
> IOException {
> > while (values.hasNext()) {
> > output.collect(null, values.next());
> > }
> > }
> > }
> >
> > the Exception:
> >
> > java.lang.NullPointerException
> > at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java
> > :948)
> > at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$2.collect(
> > MapTask.java:489)
> > at org.permutation.PermutationReduce.reduce(PermutationReduce.java
> :16)
> > at org.permutation.PermutationReduce.reduce(PermutationReduce.java
> :1)
> > at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(
> > MapTask.java:522)
> > at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(
> > MapTask.java:493)
> > at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(
> MapTask.java
> > :713)
> > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
> > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(
> LocalJobRunner.java
> > :132)
> > Exception in thread "main" java.io.IOException: Job failed!
> > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
> > at org.permutation.Starter.main(Starter.java:37)
> >
> > Since all I need is just to output all mapper emits (every value which
> > enters output collector in Mapper) I thought I could use
> IdentityReducer.
> > But it seems that this will not give me any option to suppress key in
> > output.
> >
> > Regards,
> > Lukas
> >
> > On Feb 19, 2008 11:00 PM, Ted Dunning <td...@veoh.com> wrote:
> >
> >>
> >> Give a key of null to the reducer's output collector.
> >>
> >>
> >> On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I don't care about key value in the output file. Is there any way how
> I
> >> can
> >>> suppress key in the output?
> >>> Is there a way how to tell (Text)OutputFormat not to write key but
> value
> >>> only? Or can I pass my own implementation of RecordWriter into
> >>> FileOutputFormat?
> >>>
> >>> Regards,
> >>> Lukas
> >>
> >>
> >
>
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Ted Dunning <td...@veoh.com>.
I use 15.1 and it does work there. Pity if we lost that capability. Having
to take a structure apart and put together a new one just to move one field
out is a real pain and significantly increases garbage allocations.
On 2/19/08 2:08 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
> Hi,
>
> Either I am doing something wrong or this does not work (I am using 0.16.0):
>
> My class:
>
> public class PermutationReduce extends MapReduceBase implements
> Reducer<Text, Text, Text, Text> {
>
> public void reduce(Text key, Iterator<Text> values,
> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
> while (values.hasNext()) {
> output.collect(null, values.next());
> }
> }
> }
>
> the Exception:
>
> java.lang.NullPointerException
> at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java
> :948)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$2.collect(
> MapTask.java:489)
> at org.permutation.PermutationReduce.reduce(PermutationReduce.java:16)
> at org.permutation.PermutationReduce.reduce(PermutationReduce.java:1)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(
> MapTask.java:522)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(
> MapTask.java:493)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java
> :713)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java
> :132)
> Exception in thread "main" java.io.IOException: Job failed!
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
> at org.permutation.Starter.main(Starter.java:37)
>
> Since all I need is just to output all mapper emits (every value which
> enters output collector in Mapper) I thought I could use IdentityReducer.
> But it seems that this will not give me any option to suppress key in
> output.
>
> Regards,
> Lukas
>
> On Feb 19, 2008 11:00 PM, Ted Dunning <td...@veoh.com> wrote:
>
>>
>> Give a key of null to the reducer's output collector.
>>
>>
>> On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I don't care about key value in the output file. Is there any way how I
>> can
>>> suppress key in the output?
>>> Is there a way how to tell (Text)OutputFormat not to write key but value
>>> only? Or can I pass my own implementation of RecordWriter into
>>> FileOutputFormat?
>>>
>>> Regards,
>>> Lukas
>>
>>
>
Re: FileOutputFormat which does not write key value?
Posted by Lukas Vlcek <lu...@gmail.com>.
Hi,
Either I am doing something wrong or this does not work (I am using 0.16.0):
My class:
public class PermutationReduce extends MapReduceBase implements
    Reducer<Text, Text, Text, Text> {
  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      output.collect(null, values.next());
    }
  }
}
the Exception:
java.lang.NullPointerException
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:948)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$2.collect(MapTask.java:489)
    at org.permutation.PermutationReduce.reduce(PermutationReduce.java:16)
    at org.permutation.PermutationReduce.reduce(PermutationReduce.java:1)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:493)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:713)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
    at org.permutation.Starter.main(Starter.java:37)
Since all I need is just to output all mapper emits (every value which
enters output collector in Mapper) I thought I could use IdentityReducer.
But it seems that this will not give me any option to suppress key in
output.
Regards,
Lukas
On Feb 19, 2008 11:00 PM, Ted Dunning <td...@veoh.com> wrote:
>
> Give a key of null to the reducer's output collector.
>
>
> On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
>
> > Hi,
> >
> > I don't care about key value in the output file. Is there any way how I
> can
> > suppress key in the output?
> > Is there a way how to tell (Text)OutputFormat not to write key but value
> > only? Or can I pass my own implementation of RecordWriter into
> > FileOutputFormat?
> >
> > Regards,
> > Lukas
>
>
--
http://blog.lukas-vlcek.com/
Re: FileOutputFormat which does not write key value?
Posted by Ted Dunning <td...@veoh.com>.
Give a key of null to the reducer's output collector.
On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
> Hi,
>
> I don't care about key value in the output file. Is there any way how I can
> suppress key in the output?
> Is there a way how to tell (Text)OutputFormat not to write key but value
> only? Or can I pass my own implementation of RecordWriter into
> FileOutputFormat?
>
> Regards,
> Lukas