Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2011/07/15 14:14:39 UTC

Giving filename as key to mapper ?

Hi,
How can I give the filename as the key to a mapper?
I want to count the occurrences of each word in a set of documents, so I want
to use the filename as the key. Is it possible to pass the filename as the input key to the map function?
Thanks,
Praveenesh

RE: Giving filename as key to mapper ?

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.
If you have the Hadoop source downloaded (and if you don't, I would suggest you get it), you can search for *InputFormat.java and you will have all the references you need. You might also want to check out http://codedemigod.com/blog/?p=120 or take a look at the books "Hadoop in Action" or "Hadoop: The Definitive Guide".

Matt

-----Original Message-----
From: praveenesh kumar [mailto:praveenesh@gmail.com] 
Sent: Friday, July 15, 2011 9:42 AM
To: common-user@hadoop.apache.org
Subject: Re: Giving filename as key to mapper ?

I am new to the Hadoop API. Can anyone point me to a tutorial or code snippet
on how to write your own input format to do this kind of thing?
Thanks.


Re: Giving filename as key to mapper ?

Posted by praveenesh kumar <pr...@gmail.com>.
I am new to the Hadoop API. Can anyone point me to a tutorial or code snippet
on how to write your own input format to do this kind of thing?
Thanks.


Re: Giving filename as key to mapper ?

Posted by Robert Evans <ev...@yahoo-inc.com>.
To add to that: if you really want the file name to be the key, instead of just calling a different API in your map to get it, you will probably need to write your own input format. It should be fairly simple, and you can base it on an existing input format.

--Bobby
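
As a rough illustration of the suggestion above: in the new API, a custom input format would typically be a FileInputFormat subclass whose createRecordReader() returns a RecordReader that delegates line reading to an existing reader (for example LineRecordReader) and hands back the file name from getCurrentKey(). The sketch below is Hadoop-free plain Java that only mirrors that method shape (nextKeyValue, getCurrentKey, getCurrentValue). The class names FileNameReaderDemo and FileNameLineReader are hypothetical, and the iterator of strings stands in for the real line reader.

```java
import java.util.Arrays;
import java.util.Iterator;

class FileNameReaderDemo {
    // Hadoop-free stand-in for a record reader whose key is the file name
    // and whose value is the current line of that file.
    static class FileNameLineReader {
        private final String fileName;
        private final Iterator<String> lines; // stands in for a real line reader
        private String currentLine;

        FileNameLineReader(String fileName, Iterator<String> lines) {
            this.fileName = fileName;
            this.lines = lines;
        }

        // Advance to the next line; returns false once the input is exhausted.
        boolean nextKeyValue() {
            if (lines.hasNext()) {
                currentLine = lines.next();
                return true;
            }
            currentLine = null;
            return false;
        }

        // The key is the same file name for every record of this file.
        String getCurrentKey() { return fileName; }

        String getCurrentValue() { return currentLine; }
    }

    public static void main(String[] args) {
        FileNameLineReader reader = new FileNameLineReader(
                "a.txt", Arrays.asList("first line", "second line").iterator());
        // A mapper fed by such a reader would see (fileName, line) pairs.
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
    }
}
```

In a real job the mapper's input key type would then be Text (the file name) rather than the byte offset that the default text input format supplies.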


Re: Giving filename as key to mapper ?

Posted by Harsh J <ha...@cloudera.com>.
You can retrieve the filename in the new API as described here:

http://search-hadoop.com/m/ZOmmJ1PZJqt1/map+input+filename&subj=Retrieving+Filename

In the old API, it is available in the configuration instance of the
mapper under the key "map.input.file". See the table below this section
http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Task+JVM+Reuse
for more such goodies.
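
A minimal sketch of how this might be used, under some assumptions: in the old API the mapper's configure(JobConf job) can call job.get("map.input.file"), and in the new API the usual approach is to cast context.getInputSplit() to FileSplit and call getPath().getName(). The plain-Java code below has no Hadoop dependency. It stubs the property value with a made-up path and only shows how to reduce that value to a bare file name and combine it with each word, so that counts can later be summed per file and word. The names FileNameKeyDemo, baseName and mapLine are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FileNameKeyDemo {
    // Reduce a full path (such as the value of "map.input.file") to its last segment.
    static String baseName(String inputFile) {
        int slash = inputFile.lastIndexOf('/');
        return slash >= 0 ? inputFile.substring(slash + 1) : inputFile;
    }

    // Stand-in for map(): builds one "fileName<TAB>word" key per word in the line.
    static List<String> mapLine(String inputFile, String line) {
        String name = baseName(inputFile);
        List<String> keys = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                keys.add(name + "\t" + word);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Hypothetical value; in an old-API job it would come from the job configuration.
        String inputFile = "hdfs://namenode:8020/user/demo/docs/a.txt";
        for (String key : mapLine(inputFile, "to be or not to be")) {
            // Emit a count of 1 per occurrence, as in the classic word count.
            System.out.println(key + "\t1");
        }
    }
}
```

With this approach the input key stays whatever the input format provides (the byte offset for plain text input), and the file name is carried in the output key instead.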

-- 
Harsh J