Posted to common-user@hadoop.apache.org by Taeho Kang <tk...@gmail.com> on 2008/05/22 07:45:03 UTC

Questions on how to use DistributedCache

Dear all,

I am trying to use the DistributedCache class to distribute files required
for running my jobs.

While the API documentation provides good guidelines,
are there any tips or usage examples (e.g. sample code)?

If you could share your experience with me, I would really appreciate it.

Thank you in advance,

/Taeho

Re: Questions on how to use DistributedCache

Posted by Taeho Kang <tk...@gmail.com>.
Thank you for your clarification!

One more question here,

The API doc says...
"DistributedCache is a facility provided by the Map-Reduce framework to
cache files (text, archives, jars etc.) needed by applications."

My question is...
Is it also possible to distribute binary files (to be executed on the
slave nodes in a MapReduce job)?

P.S. I have tried it, but it has not been successful. Is this normal?

/Taeho
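
No cause for the failure is given in this thread. One thing worth checking,
offered here as an assumption and not as a confirmed diagnosis, is whether the
localized copy of the binary still has its execute permission on the slave
node. Below is a minimal Java sketch, using only the standard library, of
setting that permission before launching the file. `LocalBinarySketch`,
`ensureExecutable`, and `run` are hypothetical names for illustration and are
not part of Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
class LocalBinarySketch {

    /** Returns true if the file is, or could be made, executable by the current user. */
    static boolean ensureExecutable(File binary) {
        return binary.canExecute() || binary.setExecutable(true);
    }

    /** Launches the binary with the given arguments and returns its exit code. */
    static int run(File binary, String... args) throws IOException, InterruptedException {
        if (!ensureExecutable(binary)) {
            throw new IOException("cannot make executable: " + binary);
        }
        List<String> command = new ArrayList<>();
        command.add(binary.getAbsolutePath());
        command.addAll(Arrays.asList(args));
        // inheritIO() forwards the child's stdout/stderr to this process.
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

A task could call `LocalBinarySketch.run(new File("mytool"))` on the localized
file, where `mytool` is a placeholder name.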


On Thu, May 22, 2008 at 7:15 PM, Devaraj Das <dd...@yahoo-inc.com> wrote:

>
>
> > -----Original Message-----
> > From: Taeho Kang [mailto:tkang1@gmail.com]
> > Sent: Thursday, May 22, 2008 3:41 PM
> > To: core-user@hadoop.apache.org
> > Subject: Re: Questions on how to use DistributedCache
> >
> > Thanks for your reply.
> >
> > Just one more thing to ask..
> >
> > From what I see from the source code,
> > it looks like the files/jars registered in DistributedCache
> > gets uploaded to DFS and then downloaded to slave nodes.
> >
> > Is there a way I can specify "the path in the slave nodes"
> > where files/jars get downloaded to?
>
> No, that is not possible. They get localized to specific directories (as per
> mapred.local.dir). The files are optionally symlinked in the current
> working directory of the task.

RE: Questions on how to use DistributedCache

Posted by Devaraj Das <dd...@yahoo-inc.com>.
 

> -----Original Message-----
> From: Taeho Kang [mailto:tkang1@gmail.com] 
> Sent: Thursday, May 22, 2008 3:41 PM
> To: core-user@hadoop.apache.org
> Subject: Re: Questions on how to use DistributedCache
> 
> Thanks for your reply.
> 
> Just one more thing to ask..
> 
> From what I see from the source code,
> it looks like the files/jars registered in DistributedCache 
> gets uploaded to DFS and then downloaded to slave nodes.
> 
> Is there a way I can specify "the path in the slave nodes" 
> where files/jars get downloaded to?

No, that is not possible. They get localized to specific directories (as per
mapred.local.dir). The files are optionally symlinked in the current working
directory of the task.
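
As a rough illustration of the answer above: a cache file is registered by
URI, and when symlinking is enabled the URI fragment (the part after `#`) is
used as the link name in the task's working directory. The sketch below uses
only the Java standard library to mimic that arrangement. It is not Hadoop
code, and `CacheLinkSketch`, `linkName`, `linkIntoWorkDir`, and the example
URI are hypothetical names. In a real job the registration would go through
`DistributedCache.addCacheFile(uri, conf)` and
`DistributedCache.createSymlink(conf)`.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hadoop.
class CacheLinkSketch {

    /** Returns the URI fragment (the text after '#'), or null if there is none. */
    static String linkName(URI cacheUri) {
        return cacheUri.getFragment();
    }

    /**
     * Mimics the framework's behaviour: the file already sits in a local
     * directory chosen by the framework, and the task only receives a
     * symlink to it inside its own working directory.
     */
    static Path linkIntoWorkDir(Path localizedFile, Path workDir, URI cacheUri)
            throws IOException {
        String name = linkName(cacheUri);
        if (name == null) {
            // Assumption of this sketch: no fragment means no link is requested,
            // so the caller has to use the localized path directly.
            return localizedFile;
        }
        return Files.createSymbolicLink(workDir.resolve(name), localizedFile);
    }
}
```

With a URI such as `hdfs://namenode:9000/myapp/dict-v3.txt#dict.txt`, the task
would open the relative path `dict.txt` without needing to know where the
framework placed the actual file.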



Re: Questions on how to use DistributedCache

Posted by Taeho Kang <tk...@gmail.com>.
Thanks for your reply.

Just one more thing to ask.

From what I see in the source code, it looks like the files/jars registered
in DistributedCache get uploaded to DFS and then downloaded to the slave nodes.

Is there a way I can specify "the path in the slave nodes" where files/jars
get downloaded to?

/Taeho



Re: Questions on how to use DistributedCache

Posted by "Edward J. Yoon" <ed...@udanax.org>.
Long time no see, T
If you do this on your own, please contribute it back to Hadoop! *smile*

Edward

On Thu, May 22, 2008 at 4:20 PM, Arun C Murthy <ar...@yahoo-inc.com> wrote:
>
> On May 21, 2008, at 10:45 PM, Taeho Kang wrote:
>
>> Dear all,
>>
>> I am trying to use DistributedCache class for distributing files required
>> for running my jobs.
>>
>> While API documentation provides good guidelines,
>> Is there any tips or usage examples (e.g. sample codes)?
>>
>
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache
> and
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Example%3A+WordCount+v2.0
>
> Arun
>
>> If you could share your experience with me, I would really appreciate it.
>>
>> Thank you in advance,
>>
>> /Taeho
>
>



-- 
Best regards,
Edward J. Yoon,
http://blog.udanax.org

Re: Questions on how to use DistributedCache

Posted by Arun C Murthy <ar...@yahoo-inc.com>.
On May 21, 2008, at 10:45 PM, Taeho Kang wrote:

> Dear all,
>
> I am trying to use DistributedCache class for distributing files required
> for running my jobs.
>
> While API documentation provides good guidelines,
> Is there any tips or usage examples (e.g. sample codes)?
>

http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache
and
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Example%3A+WordCount+v2.0

Arun

> If you could share your experience with me, I would really appreciate it.
>
> Thank you in advance,
>
> /Taeho