Posted to common-user@hadoop.apache.org by Bai Shen <ba...@gmail.com> on 2012/10/11 16:14:19 UTC

Referencing files in job file from code

I'm trying to reference a directory inside my job file from my code.  I
have a third-party library that needs to be passed a File object pointing
to that directory in the job file.

How do I go about doing this?  If I just do new File("dir") it looks for
the directory on the client machine that I'm calling the job from instead
of the directory in the actual job file itself.

Thanks.

Re: Referencing files in job file from code

Posted by Bai Shen <ba...@gmail.com>.
I'm writing a plugin for Nutch 2.  When built, it creates a job file which
is then executed via Hadoop.

My plugin uses a third party library that requires a File object pointing
to a directory of files.

I looked at DistributedCache, but I'm not sure how to use it to get a File
object.

On Thu, Oct 11, 2012 at 10:26 AM, Harsh J <ha...@cloudera.com> wrote:

> Hi Bai,
>
> What exactly do you mean by a 'job file' and have you considered using
> DistributedCache, as detailed at
> http://hadoop.apache.org/docs/stable/mapred_tutorial.html#DistributedCache
> ?
>
> --
> Harsh J
>

Re: Referencing files in job file from code

Posted by Harsh J <ha...@cloudera.com>.
Hi Bai,

What exactly do you mean by a 'job file' and have you considered using
DistributedCache, as detailed at
http://hadoop.apache.org/docs/stable/mapred_tutorial.html#DistributedCache ?

-- 
Harsh J
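
On the File question raised in this thread: a directory packed inside an
archive is not a directory on disk, so new File("dir") cannot resolve to it.
One possible workaround, sketched here and not taken from either poster, is
to copy the directory's entries out to a local temporary directory and hand
that to the library. The sketch assumes the job file is an ordinary
jar/zip archive; the class name JobDirExtractor and the method extractDir
are hypothetical, and how the running code locates its own job file is left
to the caller. It uses only the Java standard library, not DistributedCache.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: not part of Hadoop or Nutch.
final class JobDirExtractor {

    private JobDirExtractor() {}

    /**
     * Copies every archive entry under dirName/ into a fresh temporary
     * directory and returns the extracted directory as a java.io.File.
     * Assumes the archive is a standard jar/zip file.
     */
    static File extractDir(File archive, String dirName) throws IOException {
        String prefix = dirName.endsWith("/") ? dirName : dirName + "/";
        Path root = Files.createTempDirectory("jobdir-").toAbsolutePath().normalize();

        try (ZipFile zip = new ZipFile(archive)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.getName().startsWith(prefix)) {
                    continue; // entry is outside the requested directory
                }
                Path target = root.resolve(entry.getName()).normalize();
                // Refuse entries such as "dir/../../x" that would land outside root.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes target dir: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    try (InputStream in = zip.getInputStream(entry)) {
                        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }

        File dir = root.resolve(prefix).toFile();
        if (!dir.isDirectory()) {
            throw new IOException("No directory '" + dirName + "' in " + archive);
        }
        return dir;
    }
}
```

A caller would then pass the result of JobDirExtractor.extractDir(jobFile,
"dir") to the third-party library in place of new File("dir"). The copy
lives in the temp directory of whichever machine runs the code, so it
should be deleted once the library is done with it.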