Posted to common-user@hadoop.apache.org by Boyu Zhang <bo...@gmail.com> on 2009/10/26 01:47:08 UTC

How To Pass Parameters To Mapper Through Main Method

Dear All,

I am implementing a clustering algorithm in which I need to compare each
line to two specific lines (they all have the same format) and output two
scores denoting the similarity between each line and the two specific lines.

Can I define two global variables (the two specific lines) in the main()
method and pass those two variables to the mapper class?
Or can I store the two lines in a separate file (say, Centric) and have the
mapper class read that file and compare each line (from the other files, say
Data, which holds the data to be processed) with the two lines from the
Centric file?

Thanks a lot for reading my email, really appreciate any help!

Boyu Zhang(Emma)
University of Delaware

Re: How To Pass Parameters To Mapper Through Main Method

Posted by Boyu Zhang <bo...@gmail.com>.
Dear Amogh,

Thank you for the tip. I tried it with JobConf and configure(), and it
worked! Thanks a lot!

Boyu

On Mon, Oct 26, 2009 at 12:09 AM, Amogh Vasekar <am...@yahoo-inc.com> wrote:

> Hi,
> Many options are available here. You can use JobConf (0.18) or
> context.getConfiguration() (0.20) to pass these lines to all tasks
> (assuming they are not too large), and use configure() / setup() to
> retrieve them. Or use the distributed cache to read a file containing these
> lines (possibly with JVM reuse, if you want that extra bit as well).
>
> Thanks,
> Amogh

Re: How To Pass Parameters To Mapper Through Main Method

Posted by Amogh Vasekar <am...@yahoo-inc.com>.
Hi,
Many options are available here. You can use JobConf (0.18) or context.getConfiguration() (0.20) to pass these lines to all tasks (assuming they are not too large), and use configure() / setup() to retrieve them. Or use the distributed cache to read a file containing these lines (possibly with JVM reuse, if you want that extra bit as well).
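
As a rough illustration of the first option, here is a plain-Java sketch of the pattern, not code from this thread. A java.util.Map stands in for the job configuration so the example runs without Hadoop on the classpath. In a real 0.20 job the driver would call conf.set(key, value) and the mapper would read the value with context.getConfiguration().get(key) inside setup(); in 0.18 the read would go in configure(JobConf). The key names, the comma-separated line format, and the "count matching fields" score are all assumptions made up for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigPassingSketch {

    // Hypothetical configuration keys, chosen for this sketch only.
    static final String KEY_A = "cluster.reference.a";
    static final String KEY_B = "cluster.reference.b";

    /** Stand-in for the mapper: configure() runs once per task, map() once per line. */
    static class SimilarityMapper {
        private String[] refA;
        private String[] refB;

        // Plays the role of configure(JobConf) in 0.18 or setup(Context) in 0.20.
        void configure(Map<String, String> conf) {
            refA = conf.get(KEY_A).split(",");
            refB = conf.get(KEY_B).split(",");
        }

        // Returns the two scores for one input line: {score vs A, score vs B}.
        int[] map(String line) {
            String[] fields = line.split(",");
            return new int[] { matches(fields, refA), matches(fields, refB) };
        }

        // Placeholder similarity (an assumption): number of positions with equal fields.
        private static int matches(String[] x, String[] y) {
            int n = Math.min(x.length, y.length);
            int count = 0;
            for (int i = 0; i < n; i++) {
                if (x[i].equals(y[i])) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) {
        // "Driver" side: put the two reference lines into the configuration.
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY_A, "1,2,3,4");
        conf.put(KEY_B, "9,2,3,0");

        // "Task" side: read them once, then score every input line.
        SimilarityMapper mapper = new SimilarityMapper();
        mapper.configure(conf);
        for (String line : new String[] { "1,2,3,0", "9,9,3,4" }) {
            int[] scores = mapper.map(line);
            System.out.println(line + "\t" + scores[0] + "\t" + scores[1]);
        }
    }
}
```

The point of the split is that the two reference lines are parsed once per task rather than once per record, which is why the read belongs in configure / setup and not in map.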

Thanks,
Amogh
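
For the second option in the reply above, the distributed cache copies the file to each task node and the mapper opens the local copy once in configure / setup. How the local path is obtained depends on the Hadoop version and is not shown here. The sketch below covers only the read-once step, in plain Java, with a temporary file standing in for the localized Centric file; the helper name loadReferenceLines and the sample contents are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class CentricFileSketch {

    /** Reads the first two lines of the reference file; fails if it has fewer. */
    static String[] loadReferenceLines(Path centricFile) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(centricFile, StandardCharsets.UTF_8)) {
            String first = in.readLine();
            String second = in.readLine();
            if (first == null || second == null) {
                throw new IOException("expected two reference lines in " + centricFile);
            }
            return new String[] { first, second };
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary file standing in for the task-local copy of Centric.
        Path centric = Files.createTempFile("Centric", ".txt");
        try {
            Files.write(centric, Arrays.asList("1,2,3,4", "9,2,3,0"), StandardCharsets.UTF_8);

            // This call is what would run once per task, before any records are mapped.
            String[] refs = loadReferenceLines(centric);
            System.out.println("reference A: " + refs[0]);
            System.out.println("reference B: " + refs[1]);
        } finally {
            Files.deleteIfExists(centric);
        }
    }
}
```

For only two short lines the configuration route is usually simpler; a file becomes more attractive when the reference data grows beyond what is comfortable to store as configuration values.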
