Posted to common-user@hadoop.apache.org by Lukas Vlcek <lu...@gmail.com> on 2008/02/19 22:52:54 UTC

FileOutputFormat which does not write key value?

Hi,

I don't care about the key in the output file. Is there any way to
suppress the key in the output?
Is there a way to tell (Text)OutputFormat to write only the value and not
the key? Or can I pass my own implementation of RecordWriter into
FileOutputFormat?

Regards,
Lukas

-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Ted Dunning <td...@veoh.com>.
Re-reading the thread convinces me that this is a difference between
TextOutputFormat and other output formats.


On 2/19/08 6:01 PM, "Andy Li" <an...@gmail.com> wrote:

> Shouldn't the official way to do this is to implement your own RecordWriter
> and implement the
> OutputFormatClass.
> 
> conf.setOutputFormat(yourClass);
> 
> Inside the yourClass, you can return your own RecordWriter class in the
> getRecordWriter method.
> 
> I did it on the FileInputFormat with my own RecordReader and it worked for
> me
> to take KEY and null VALUE into the Mapper.  I believe it is the same thing
> vice versa.
> 
> But there should be a formal way instead of try-and-error to see what the
> system default
> is.  I guess the system does not have a standard spec to define what is the
> default values?
> Maybe this is why Ted has such concern of incompatible in 0.16.*?
> 
> -Andy
> 
> On Feb 19, 2008 3:02 PM, Lukas Vlcek <lu...@gmail.com> wrote:
> 
>> Hmmm...
>> 
>> Maybe I should rather go to bed (it is just midnight in my part of the
>> world...) but I think I did what you are saying:
>> 
>> Configuration:
>>         conf.setOutputKeyClass(NullWritable.class);
>>         conf.setOutputValueClass(Text.class);
>> 
>> And the reducer:
>> public class PermutationReduce extends MapReduceBase implements
>> Reducer<Text, Text, NullWritable, Text> {
>> 
>>    public void reduce(Text key, Iterator<Text> values,
>> OutputCollector<NullWritable, Text> output, Reporter reporter) throws
>> IOException {
>>        while (values.hasNext()) {
>>            output.collect(NullWritable.get(), values.next());
>>        }
>> 
>>    }
>> }
>> 
>> Regards,
>> Lukas
>> 
>> On 2/19/08, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>>> 
>>> 
>>> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I don't care about key value in the output file. Is there any way
>>>> how I can
>>>> suppress key in the output?
>>>> Is there a way how to tell (Text)OutputFormat not to write key but
>>>> value
>>>> only? Or can I pass my own implementation of RecordWriter into
>>>> FileOutputFormat?
>>> 
>>> The easiest way is to put either null or a NullWritable in for the
>>> key coming out of the reduce. The TextOutputFormat will drop the tab
>>> character. You can also define your own OutputFormat and encode them
>>> as you wish.
>>> 
>>> -- Owen
>>> 
>> 
>> 
>> 
>> --
>> http://blog.lukas-vlcek.com/
>> 


Re: FileOutputFormat which does not write key value?

Posted by Andy Li <an...@gmail.com>.
Shouldn't the official way to do this be to implement your own RecordWriter
and your own OutputFormat class?

conf.setOutputFormat(yourClass);

Inside the yourClass, you can return your own RecordWriter class in the
getRecordWriter method.
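
For illustration only, here is a minimal sketch of the idea: a writer that
ignores the key and emits one line per value. To keep it self-contained it
does not use Hadoop at all; SimpleRecordWriter, ValueOnlyLineWriter and
ValueOnlyDemo are hypothetical names made up for this sketch, and the
interface only mimics the write/close shape of a record writer. In a real job
the equivalent class would be returned from getRecordWriter of a custom
OutputFormat, which is assumed here rather than shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the write/close shape of a record writer.
// This is NOT the Hadoop RecordWriter interface.
interface SimpleRecordWriter<K, V> {
    void write(K key, V value) throws IOException;
    void close() throws IOException;
}

// Writes one line per record containing only the value; the key is ignored.
class ValueOnlyLineWriter<K, V> implements SimpleRecordWriter<K, V> {
    private final OutputStream out;

    ValueOnlyLineWriter(OutputStream out) {
        this.out = out;
    }

    @Override
    public synchronized void write(K key, V value) throws IOException {
        if (value == null) {
            return; // nothing to emit for this record
        }
        out.write(value.toString().getBytes(StandardCharsets.UTF_8));
        out.write('\n');
    }

    @Override
    public synchronized void close() throws IOException {
        out.close();
    }
}

class ValueOnlyDemo {
    // Pushes a few records through the writer and returns what was written.
    static String render() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            SimpleRecordWriter<String, String> writer = new ValueOnlyLineWriter<>(buffer);
            writer.write("k1", "first");   // key is dropped
            writer.write(null, "second");  // null key is fine
            writer.write("k3", null);      // null value writes nothing
            writer.close();
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

The point of the sketch is only that the writer, not the reducer, decides
what reaches the file, so the reducer can keep emitting whatever key type it
likes.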

I did it with FileInputFormat and my own RecordReader, and it worked for
me to pass a KEY and a null VALUE into the Mapper.  I believe the same thing
works in the other direction.

But there should be a formal way, instead of trial and error, to find out
what the system default is.  I guess the system does not have a standard
spec that defines the default values?
Maybe this is why Ted is concerned about incompatibility in 0.16.*?

-Andy

On Feb 19, 2008 3:02 PM, Lukas Vlcek <lu...@gmail.com> wrote:

> Hmmm...
>
> Maybe I should rather go to bed (it is just midnight in my part of the
> world...) but I think I did what you are saying:
>
> Configuration:
>         conf.setOutputKeyClass(NullWritable.class);
>         conf.setOutputValueClass(Text.class);
>
> And the reducer:
> public class PermutationReduce extends MapReduceBase implements
> Reducer<Text, Text, NullWritable, Text> {
>
>    public void reduce(Text key, Iterator<Text> values,
> OutputCollector<NullWritable, Text> output, Reporter reporter) throws
> IOException {
>        while (values.hasNext()) {
>            output.collect(NullWritable.get(), values.next());
>        }
>
>    }
> }
>
> Regards,
> Lukas
>
> On 2/19/08, Owen O'Malley <oo...@yahoo-inc.com> wrote:
> >
> >
> > On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
> >
> > > Hi,
> > >
> > > I don't care about key value in the output file. Is there any way
> > > how I can
> > > suppress key in the output?
> > > Is there a way how to tell (Text)OutputFormat not to write key but
> > > value
> > > only? Or can I pass my own implementation of RecordWriter into
> > > FileOutputFormat?
> >
> > The easiest way is to put either null or a NullWritable in for the
> > key coming out of the reduce. The TextOutputFormat will drop the tab
> > character. You can also define your own OutputFormat and encode them
> > as you wish.
> >
> > -- Owen
> >
>
>
>
> --
> http://blog.lukas-vlcek.com/
>

Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Hmmm...

Maybe I should rather go to bed (it is just midnight in my part of the
world...), but I think I did what you are saying:

Configuration:
         conf.setOutputKeyClass(NullWritable.class);
         conf.setOutputValueClass(Text.class);

And the reducer:
public class PermutationReduce extends MapReduceBase
        implements Reducer<Text, Text, NullWritable, Text> {

    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<NullWritable, Text> output, Reporter reporter)
            throws IOException {
        while (values.hasNext()) {
            output.collect(NullWritable.get(), values.next());
        }
    }
}

Regards,
Lukas

On 2/19/08, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
>
> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>
> > Hi,
> >
> > I don't care about key value in the output file. Is there any way
> > how I can
> > suppress key in the output?
> > Is there a way how to tell (Text)OutputFormat not to write key but
> > value
> > only? Or can I pass my own implementation of RecordWriter into
> > FileOutputFormat?
>
> The easiest way is to put either null or a NullWritable in for the
> key coming out of the reduce. The TextOutputFormat will drop the tab
> character. You can also define your own OutputFormat and encode them
> as you wish.
>
> -- Owen
>



-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Owen,

I would like to fix this, but since I am new to Hadoop I am not sure how I
can get an instance of Configuration within the newKey() method of the
WritableComparator class.

The method newKey() is as follows:

  public WritableComparable newKey() {
    try {
      return (WritableComparable)keyClass.newInstance();  // <- line #73
    } catch (InstantiationException e) {
      throw new RuntimeException(e);
    } catch (IllegalAccessException e) {
      throw new RuntimeException(e);
    }
  }

I would like to use ReflectionUtils according to your suggestion:

  public WritableComparable newKey() {
    try {
      return (WritableComparable)ReflectionUtils.newInstance(keyClass, null);  // <- changed line #73
    } catch (InstantiationException e) {
      throw new RuntimeException(e);
    } catch (IllegalAccessException e) {
      throw new RuntimeException(e);
    }
  }

The second argument should be a Configuration, but judging from the
ReflectionUtils implementation it should also work with null (it should not
directly throw any exception). I am not sure whether that is recommended,
though.

Anyway, do you want me to create a new JIRA ticket?

Regards,
Lukas

On Wed, Feb 20, 2008 at 5:33 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:

>
> On Feb 20, 2008, at 6:23 AM, Lukas Vlcek wrote:
>
> > Hi,
> >
> > WritableComparator, line# 73 (trunk version) is using Class'
> > newInstance()
> > method which can not work for singletons like NullWritable.
>
> *Sigh* I thought I had removed all of those direct calls to
> Class.newInstance. It really should be using
> ReflectionUtils.newInstance, which would work. NullWritable is mostly
> a singleton, but it isn't really required to be a singleton.
>
> > Should this be changed to:
> >
> >   /** Construct a new {@link WritableComparable} instance. */
> >   public WritableComparable newKey() {
> >     try {
> >       if (keyClass instanceof NullWritable) return NullWritable.get
> > ();    //
> > <--- this is new line
> >       return (WritableComparable)keyClass.newInstance();
> >     } catch (InstantiationException e) {
> >       throw new RuntimeException(e);
> >     } catch (IllegalAccessException e) {
> >       throw new RuntimeException(e);
> >     }
> >   }
>
> I think you mean:
>
>    if (keyClass == NullWritable.class) {
>      return NullWritable.get();
>   }
>
> and in reality, it would be better to put that fix in
> ReflectionUtils.newInstance and just call it from here.
>
> Thanks,
>     Owen
>



-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Feb 20, 2008, at 6:23 AM, Lukas Vlcek wrote:

> Hi,
>
> WritableComparator, line# 73 (trunk version) is using Class'  
> newInstance()
> method which can not work for singletons like NullWritable.

*Sigh* I thought I had removed all of those direct calls to
Class.newInstance. It really should be using  
ReflectionUtils.newInstance, which would work. NullWritable is mostly  
a singleton, but it isn't really required to be a singleton.

> Should this be changed to:
>
>   /** Construct a new {@link WritableComparable} instance. */
>   public WritableComparable newKey() {
>     try {
>       if (keyClass instanceof NullWritable) return NullWritable.get();    // <--- this is new line
>       return (WritableComparable)keyClass.newInstance();
>     } catch (InstantiationException e) {
>       throw new RuntimeException(e);
>     } catch (IllegalAccessException e) {
>       throw new RuntimeException(e);
>     }
>   }

I think you mean:

    if (keyClass == NullWritable.class) {
      return NullWritable.get();
    }

and in reality, it would be better to put that fix in  
ReflectionUtils.newInstance and just call it from here.
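
As a rough illustration of why the direct call fails while a helper that
opens the constructor first can succeed, here is a self-contained sketch in
plain Java. SingletonKey and NewInstanceDemo are hypothetical names invented
for this example; SingletonKey only plays the role of a class with a private
constructor and is not the Hadoop NullWritable. The sketch assumes that
ReflectionUtils.newInstance makes the constructor accessible before invoking
it, which is consistent with the remark above that it "would work", but that
is an assumption and not something shown in this thread.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for a singleton with a private constructor.
final class SingletonKey {
    private static final SingletonKey THE_ONE = new SingletonKey();

    private SingletonKey() {
    }

    static SingletonKey get() {
        return THE_ONE;
    }
}

class NewInstanceDemo {
    // Plain reflective construction from another class: the access check
    // rejects the private constructor.
    static String plain(Class<?> cls) {
        try {
            cls.getDeclaredConstructor().newInstance();
            return "ok";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Same call, but the constructor is made accessible first
    // (the approach assumed for a ReflectionUtils-style helper).
    static String accessible(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return "ok";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("plain:      " + plain(SingletonKey.class));
        System.out.println("accessible: " + accessible(SingletonKey.class));
    }
}
```

Note that the second variant creates a fresh instance instead of returning
the shared one, which matches the comment above that the class is not really
required to be a singleton.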

Thanks,
    Owen

Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Hi,

WritableComparator, line #73 (trunk version), uses Class.newInstance(),
which cannot work for singletons like NullWritable.

  /** Construct a new {@link WritableComparable} instance. */
  public WritableComparable newKey() {
    try {
      return (WritableComparable)keyClass.newInstance();
    } catch (InstantiationException e) {
      throw new RuntimeException(e);
    } catch (IllegalAccessException e) {
      throw new RuntimeException(e);
    }
  }

Should this be changed to:

  /** Construct a new {@link WritableComparable} instance. */
  public WritableComparable newKey() {
    try {
      if (keyClass instanceof NullWritable) return NullWritable.get();    // <--- this is new line
      return (WritableComparable)keyClass.newInstance();
    } catch (InstantiationException e) {
      throw new RuntimeException(e);
    } catch (IllegalAccessException e) {
      throw new RuntimeException(e);
    }
  }

I didn't have a chance to check whether all WritableComparable classes have
a public (or implicit) constructor, but I am facing an issue with
NullWritable used as a key in the reducer, and I believe it is caused by
newInstance() being called on a singleton.

Regards,
Lukas

On Feb 20, 2008 2:56 PM, Lukas Vlcek <lu...@gmail.com> wrote:

> Owen,
>
> This is still not clear to me. I see the following code in
> TextOutputFormat class:
>
> ...
> public synchronized void write(K key, V value)
>       throws IOException {
>
>       boolean nullKey = key == null || key instanceof NullWritable;
>       boolean nullValue = value == null || value instanceof NullWritable;
>       if (nullKey && nullValue) {
>         return;
>       }
>       if (!nullKey) {
>         writeObject(key);
>       }
>       if (!(nullKey || nullValue)) {
>         out.write(tab);
>       }
>       if (!nullValue) {
>         writeObject(value);
>       }
>       out.write(newline);
>     }
>
> This seems to me as if the tab is always output if the value is not null
> regardless key being null or not.
> Am I missing something?
>
> Lukas
>
>
> On Feb 19, 2008 11:55 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:
>
> >
> > On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
> >
> > > Hi,
> > >
> > > I don't care about key value in the output file. Is there any way
> > > how I can
> > > suppress key in the output?
> > > Is there a way how to tell (Text)OutputFormat not to write key but
> > > value
> > > only? Or can I pass my own implementation of RecordWriter into
> > > FileOutputFormat?
> >
> > The easiest way is to put either null or a NullWritable in for the
> > key coming out of the reduce. The TextOutputFormat will drop the tab
> > character. You can also define your own OutputFormat and encode them
> > as you wish.
> >
> > -- Owen
> >
>
>
>
> --
> http://blog.lukas-vlcek.com/
>



-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Owen,

you are correct. The boolean logic can be tricky :-)

Lukas

On Wed, Feb 20, 2008 at 8:47 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:

>
> On Feb 20, 2008, at 5:56 AM, Lukas Vlcek wrote:
>
> > Owen,
> >
> > This is still not clear to me. I see the following code in
> > TextOutputFormat
> > class:
> >
> > ...
> > public synchronized void write(K key, V value)
> >       throws IOException {
> >
> >       boolean nullKey = key == null || key instanceof NullWritable;
> >       boolean nullValue = value == null || value instanceof
> > NullWritable;
> >       if (nullKey && nullValue) {
> >         return;
> >       }
> >       if (!nullKey) {
> >         writeObject(key);
> >       }
> >       if (!(nullKey || nullValue)) {
> >         out.write(tab);
> >       }
> >       if (!nullValue) {
> >         writeObject(value);
> >       }
> >       out.write(newline);
> >     }
> >
> > This seems to me as if the tab is always output if the value is not
> > null
> > regardless key being null or not.
> > Am I missing something?
>
> The condition !(nullKey || nullValue) is true only if the key AND
> value are non-null.
>
> -- Owen
>
>


-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Feb 20, 2008, at 5:56 AM, Lukas Vlcek wrote:

> Owen,
>
> This is still not clear to me. I see the following code in  
> TextOutputFormat
> class:
>
> ...
> public synchronized void write(K key, V value)
>       throws IOException {
>
>       boolean nullKey = key == null || key instanceof NullWritable;
>       boolean nullValue = value == null || value instanceof  
> NullWritable;
>       if (nullKey && nullValue) {
>         return;
>       }
>       if (!nullKey) {
>         writeObject(key);
>       }
>       if (!(nullKey || nullValue)) {
>         out.write(tab);
>       }
>       if (!nullValue) {
>         writeObject(value);
>       }
>       out.write(newline);
>     }
>
> This seems to me as if the tab is always output if the value is not  
> null
> regardless key being null or not.
> Am I missing something?

The condition !(nullKey || nullValue) is true only if the key AND  
value are non-null.
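
To make the four cases concrete, here is a small self-contained sketch that
mirrors the branch structure of the quoted write() method but builds a string
instead of writing to a stream. TextLineSketch and NULL_MARKER are
hypothetical names used only here; NULL_MARKER stands in for a NullWritable
instance, and none of this is the actual TextOutputFormat code.

```java
class TextLineSketch {
    // Hypothetical marker object standing in for a NullWritable instance.
    static final Object NULL_MARKER = new Object();

    // Returns the text that would be written for one record.
    static String format(Object key, Object value) {
        boolean nullKey = key == null || key == NULL_MARKER;
        boolean nullValue = value == null || value == NULL_MARKER;
        if (nullKey && nullValue) {
            return ""; // nothing at all is written
        }
        StringBuilder line = new StringBuilder();
        if (!nullKey) {
            line.append(key);
        }
        // Same as (!nullKey && !nullValue): tab only when both are present.
        if (!(nullKey || nullValue)) {
            line.append('\t');
        }
        if (!nullValue) {
            line.append(value);
        }
        line.append('\n');
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.print(format("k", "v"));         // key, tab, value
        System.out.print(format(null, "v"));        // value only, no tab
        System.out.print(format(NULL_MARKER, "v")); // value only, no tab
        System.out.print(format("k", null));        // key only, no tab
    }
}
```

With a null or marker key, the line consists of the value alone, so no
leading tab ends up in the output.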

-- Owen


Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Owen,

This is still not clear to me. I see the following code in TextOutputFormat
class:

...
public synchronized void write(K key, V value)
      throws IOException {

      boolean nullKey = key == null || key instanceof NullWritable;
      boolean nullValue = value == null || value instanceof NullWritable;
      if (nullKey && nullValue) {
        return;
      }
      if (!nullKey) {
        writeObject(key);
      }
      if (!(nullKey || nullValue)) {
        out.write(tab);
      }
      if (!nullValue) {
        writeObject(value);
      }
      out.write(newline);
    }

It seems to me that the tab is always written if the value is not null,
regardless of whether the key is null or not.
Am I missing something?

Lukas

On Feb 19, 2008 11:55 PM, Owen O'Malley <oo...@yahoo-inc.com> wrote:

>
> On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:
>
> > Hi,
> >
> > I don't care about key value in the output file. Is there any way
> > how I can
> > suppress key in the output?
> > Is there a way how to tell (Text)OutputFormat not to write key but
> > value
> > only? Or can I pass my own implementation of RecordWriter into
> > FileOutputFormat?
>
> The easiest way is to put either null or a NullWritable in for the
> key coming out of the reduce. The TextOutputFormat will drop the tab
> character. You can also define your own OutputFormat and encode them
> as you wish.
>
> -- Owen
>



-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Feb 19, 2008, at 1:52 PM, Lukas Vlcek wrote:

> Hi,
>
> I don't care about key value in the output file. Is there any way  
> how I can
> suppress key in the output?
> Is there a way how to tell (Text)OutputFormat not to write key but  
> value
> only? Or can I pass my own implementation of RecordWriter into
> FileOutputFormat?

The easiest way is to put either null or a NullWritable in for the  
key coming out of the reduce. The TextOutputFormat will drop the tab  
character. You can also define your own OutputFormat and encode them  
as you wish.

-- Owen

Re: FileOutputFormat which does not write key value?

Posted by Ted Dunning <td...@veoh.com>.
Actually, I DID mean for you to pass a null.

And you have provided me a warning about what might break in 16.* when I get
there.


On 2/19/08 2:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:

> I think you didn't mean that I should directly pass a null into a key (this
> is what I did in my example code). I have just found that there is
> NullWritable class in hadoop.io package but still I can not make it work
> correctly.


Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Ted,

I think you didn't mean that I should directly pass a null as the key (this
is what I did in my example code). I have just found that there is a
NullWritable class in the hadoop.io package, but I still cannot make it work
correctly. I am getting the following exception:

java.lang.RuntimeException: java.lang.IllegalAccessException: Class org.apache.hadoop.io.WritableComparator can not access a member of class org.apache.hadoop.io.NullWritable with modifiers "private"
    at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:77)
    at org.apache.hadoop.io.WritableComparator.<init>(WritableComparator.java:63)
    at org.apache.hadoop.io.WritableComparator.get(WritableComparator.java:42)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:642)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:313)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:174)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)
Caused by: java.lang.IllegalAccessException: Class org.apache.hadoop.io.WritableComparator can not access a member of class org.apache.hadoop.io.NullWritable with modifiers "private"
    at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:65)
    at java.lang.Class.newInstance0(Class.java:349)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.hadoop.io.WritableComparator.newKey(WritableComparator.java:73)
    ... 6 more

Is there any test of NullWritable in the Hadoop unit test suite?

Lukas

On Feb 19, 2008 11:35 PM, Ted Dunning <td...@veoh.com> wrote:

>
>
> I use 15.1 and it does work there.  Pity if we lost that capability.
>  Having
> to take a structure apart and put together a new one just to move one
> field
> out is a real pain and significantly increases garbage allocations.
>
>
> On 2/19/08 2:08 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
>
> > Hi,
> >
> > Either I am doing something wrong or this does not work (I am using
> 0.16.0):
> >
> > My class:
> >
> > public class PermutationReduce extends MapReduceBase implements
> > Reducer<Text, Text, Text, Text> {
> >
> >     public void reduce(Text key, Iterator<Text> values,
> > OutputCollector<Text, Text> output, Reporter reporter) throws
> IOException {
> >         while (values.hasNext()) {
> >             output.collect(null, values.next());
> >         }
> >     }
> > }
> >
> > the Exception:
> >
> > java.lang.NullPointerException
> >     at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java
> > :948)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$2.collect(
> > MapTask.java:489)
> >     at org.permutation.PermutationReduce.reduce(PermutationReduce.java
> :16)
> >     at org.permutation.PermutationReduce.reduce(PermutationReduce.java
> :1)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(
> > MapTask.java:522)
> >     at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(
> > MapTask.java:493)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(
> MapTask.java
> > :713)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(
> LocalJobRunner.java
> > :132)
> > Exception in thread "main" java.io.IOException: Job failed!
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
> >     at org.permutation.Starter.main(Starter.java:37)
> >
> > Since all I need is just to output all mapper emits (every value which
> > enters output collector in Mapper) I thought I could use
> IdentityReducer.
> > But it seems that this will not give me any option to suppress key in
> > output.
> >
> > Regards,
> > Lukas
> >
> > On Feb 19, 2008 11:00 PM, Ted Dunning <td...@veoh.com> wrote:
> >
> >>
> >> Give a key of null to the reducer's output collector.
> >>
> >>
> >> On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I don't care about key value in the output file. Is there any way how
> I
> >> can
> >>> suppress key in the output?
> >>> Is there a way how to tell (Text)OutputFormat not to write key but
> value
> >>> only? Or can I pass my own implementation of RecordWriter into
> >>> FileOutputFormat?
> >>>
> >>> Regards,
> >>> Lukas
> >>
> >>
> >
>
>


-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Ted Dunning <td...@veoh.com>.

I use 15.1 and it does work there.  Pity if we lost that capability.  Having
to take a structure apart and put together a new one just to move one field
out is a real pain and significantly increases garbage allocations.


On 2/19/08 2:08 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:

> Hi,
> 
> Either I am doing something wrong or this does not work (I am using 0.16.0):
> 
> My class:
> 
> public class PermutationReduce extends MapReduceBase implements
> Reducer<Text, Text, Text, Text> {
> 
>     public void reduce(Text key, Iterator<Text> values,
> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>         while (values.hasNext()) {
>             output.collect(null, values.next());
>         }
>     }
> }
> 
> the Exception:
> 
> java.lang.NullPointerException
>     at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java
> :948)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$2.collect(
> MapTask.java:489)
>     at org.permutation.PermutationReduce.reduce(PermutationReduce.java:16)
>     at org.permutation.PermutationReduce.reduce(PermutationReduce.java:1)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(
> MapTask.java:522)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(
> MapTask.java:493)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java
> :713)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java
> :132)
> Exception in thread "main" java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
>     at org.permutation.Starter.main(Starter.java:37)
> 
> Since all I need is just to output all mapper emits (every value which
> enters output collector in Mapper) I thought I could use IdentityReducer.
> But it seems that this will not give me any option to suppress key in
> output.
> 
> Regards,
> Lukas
> 
> On Feb 19, 2008 11:00 PM, Ted Dunning <td...@veoh.com> wrote:
> 
>> 
>> Give a key of null to the reducer's output collector.
>> 
>> 
>> On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> I don't care about key value in the output file. Is there any way how I
>> can
>>> suppress key in the output?
>>> Is there a way how to tell (Text)OutputFormat not to write key but value
>>> only? Or can I pass my own implementation of RecordWriter into
>>> FileOutputFormat?
>>> 
>>> Regards,
>>> Lukas
>> 
>> 
> 


Re: FileOutputFormat which does not write key value?

Posted by Lukas Vlcek <lu...@gmail.com>.
Hi,

Either I am doing something wrong or this does not work (I am using 0.16.0):

My class:

public class PermutationReduce extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        while (values.hasNext()) {
            output.collect(null, values.next());
        }
    }
}

the Exception:

java.lang.NullPointerException
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:948)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$2.collect(MapTask.java:489)
    at org.permutation.PermutationReduce.reduce(PermutationReduce.java:16)
    at org.permutation.PermutationReduce.reduce(PermutationReduce.java:1)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:493)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:713)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
    at org.permutation.Starter.main(Starter.java:37)

Since all I need is to output everything the mapper emits (every value that
enters the output collector in the Mapper), I thought I could use
IdentityReducer. But it seems that this would not give me any option to
suppress the key in the output.

Regards,
Lukas

On Feb 19, 2008 11:00 PM, Ted Dunning <td...@veoh.com> wrote:

>
> Give a key of null to the reducer's output collector.
>
>
> On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:
>
> > Hi,
> >
> > I don't care about key value in the output file. Is there any way how I
> can
> > suppress key in the output?
> > Is there a way how to tell (Text)OutputFormat not to write key but value
> > only? Or can I pass my own implementation of RecordWriter into
> > FileOutputFormat?
> >
> > Regards,
> > Lukas
>
>


-- 
http://blog.lukas-vlcek.com/

Re: FileOutputFormat which does not write key value?

Posted by Ted Dunning <td...@veoh.com>.
Give a key of null to the reducer's output collector.


On 2/19/08 1:52 PM, "Lukas Vlcek" <lu...@gmail.com> wrote:

> Hi,
> 
> I don't care about key value in the output file. Is there any way how I can
> suppress key in the output?
> Is there a way how to tell (Text)OutputFormat not to write key but value
> only? Or can I pass my own implementation of RecordWriter into
> FileOutputFormat?
> 
> Regards,
> Lukas