Posted to general@hadoop.apache.org by himanshu chandola <hi...@yahoo.com> on 2009/09/06 00:53:17 UTC

printing only values and not keys in hadoop output files

Hi everyone,
I'm a newbie to Hadoop and have a question.
Normally Hadoop writes its output data to files in the form:
key value

If I would like it to write only the value and not the key, what do I have to do?

Thanks a lot in advance,

Himanshu
Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.



      

Re: printing only values and not keys in hadoop output files

Posted by Monika Moser <mo...@googlemail.com>.
You can use the NullWritable class as the output key or value type in order
to suppress output for either the key or the value.
In that case no separator is written either.
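
To make the behaviour described above concrete, here is a minimal stand-alone sketch in plain Java. It is not Hadoop code: NullMarker is a hypothetical stand-in for NullWritable, and formatLine is a hypothetical helper that imitates how a tab-separated line writer could skip a missing key or value together with the separator. The real writer may differ in detail, so treat this as an illustration only.

```java
/**
 * Stand-alone model of a tab-separated line writer that skips a missing
 * key or value. This is NOT Hadoop code; all names here are hypothetical.
 */
class ValueOnlyLineSketch {

    /** Hypothetical stand-in for Hadoop's NullWritable singleton. */
    static final class NullMarker {
        static final NullMarker INSTANCE = new NullMarker();
        private NullMarker() {}
    }

    static final String SEPARATOR = "\t";

    /** Returns the line for one record, or null when there is nothing to write. */
    static String formatLine(Object key, Object value) {
        boolean nullKey = key == null || key instanceof NullMarker;
        boolean nullValue = value == null || value instanceof NullMarker;
        if (nullKey && nullValue) {
            return null; // nothing to emit for this record
        }
        StringBuilder line = new StringBuilder();
        if (!nullKey) {
            line.append(key);
        }
        if (!nullKey && !nullValue) {
            line.append(SEPARATOR); // separator only when both parts are present
        }
        if (!nullValue) {
            line.append(value);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatLine("apple", 3));                    // key, tab, value
        System.out.println(formatLine(null, 3));                       // value only
        System.out.println(formatLine(NullMarker.INSTANCE, 3));        // value only
        System.out.println(formatLine("apple", NullMarker.INSTANCE));  // key only
    }
}
```

In an actual job the equivalent would presumably be to emit NullWritable (or null) as the key from the reducer; that part is not shown here because it depends on the Hadoop version in use.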

Monika



Re: printing only values and not keys in hadoop output files

Posted by himanshu chandola <hi...@yahoo.com>.
I actually fixed that by overriding generateActualKey in MultipleOutputFormat so that it returns null.

Thanks
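
A rough illustration of why this works, as a stand-alone Java sketch rather than real Hadoop code. It assumes the output file name is chosen from the original key before the key is replaced, which appears to be how MultipleOutputFormat orders those steps. The names fileNameFor, actualKey, formatLine and the file names themselves are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone model: route records by key, then write only the value. */
class KeyRoutedValueOnlySketch {

    /** Hypothetical routing rule: keys starting with "err" go to their own file. */
    static String fileNameFor(String key) {
        return key.startsWith("err") ? "errors.txt" : "other.txt";
    }

    /** Mirrors the idea of overriding generateActualKey to return null. */
    static String actualKey(String key, String value) {
        return null;
    }

    /** A null key means no key and no separator are written. */
    static String formatLine(String key, String value) {
        return key == null ? value : key + "\t" + value;
    }

    /** Returns the lines each (simulated) output file would receive. */
    static Map<String, List<String>> write(String[][] records) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        for (String[] record : records) {
            String key = record[0];
            String value = record[1];
            String file = fileNameFor(key);             // routing uses the original key
            String keyToWrite = actualKey(key, value);  // key is dropped only afterwards
            files.computeIfAbsent(file, f -> new ArrayList<>())
                 .add(formatLine(keyToWrite, value));
        }
        return files;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"err42", "disk full"},
            {"info7", "job started"},
            {"err43", "timeout"}
        };
        System.out.println(write(records));
    }
}
```

The point of the sketch is the ordering: because the file is picked before the key is replaced with null, the routing still works while the written lines contain only values.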



      

Re: printing only values and not keys in hadoop output files

Posted by himanshu chandola <hi...@yahoo.com>.
Thanks for the suggestion.

I am using TextOutputFormat and I think that should work. But I use MultipleOutputFormat to write key/value pairs with certain keys to a file with a given name and the others to a file with a different name. So if I output null for my keys in the reducer, I guess it would create problems.

Any other way you can think of?



      

Re: printing only values and not keys in hadoop output files

Posted by Stuart White <st...@gmail.com>.
Assuming you're writing plain-text output files using TextOutputFormat,
which normally outputs key/value pairs delimited with a Tab, you can emit a
null for your key, and neither a key nor the Tab delimiter will get written
to the output file.  (At least, I'm pretty sure that's right... I'm not
where I can run a quick test to verify...)

