Posted to user@pig.apache.org by "yingnan.ma" <yi...@ipinyou.com> on 2012/11/13 08:46:17 UTC
distributed cache
Hi,
In plain Hadoop, I used the distributed cache through the "setup" method and a "static" field to store a HashSet in memory.
Now I am trying to use the distributed cache in Pig, but I don't know how to store a HashSet in memory; I can only cache the file.
Any advice would be appreciated. Thank you so much!
Best Regards
Malone
2012-11-13
Re: Re: Re: distributed cache
Posted by "yingnan.ma" <yi...@ipinyou.com>.
When I use the distributed cache, I find that when the file is larger than 100 MB, or has more than 10 million records, it cannot be cached in memory. I tried setting io.sort.mb to 200 MB, but it still does not work. Any suggestions would be appreciated. Thank you!
2012-11-16
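A possible factor in the problem above: as far as I know, io.sort.mb sizes the map-side sort buffer rather than the task heap, and that buffer is itself carved out of the heap. In Hadoop 1.x the task heap is normally controlled through mapred.child.java.opts (for example an -Xmx setting). A HashSet of strings also tends to occupy several times the on-disk size of the file it was built from. The sketch below is only a back-of-the-envelope estimate; the class name, the 80-byte per-entry overhead, and the 10-character average record length are assumptions for illustration, not measurements.

```java
/** Rough heap estimate for a HashSet<String>. Every constant here is an assumption. */
final class HeapEstimate {
    // Assumed overhead per entry in bytes (hash node, String object, array header, table slot).
    static final long ASSUMED_OVERHEAD_PER_ENTRY = 80;
    // Assumed bytes per character (two bytes per char for UTF-16 backed strings).
    static final long ASSUMED_BYTES_PER_CHAR = 2;

    /** Estimated bytes needed to hold the given number of string records. */
    static long estimateBytes(long records, long avgCharsPerRecord) {
        return records * (ASSUMED_OVERHEAD_PER_ENTRY + ASSUMED_BYTES_PER_CHAR * avgCharsPerRecord);
    }

    /** True if the estimate fits within the given fraction of the maximum heap. */
    static boolean fitsInHeap(long estimatedBytes, long maxHeapBytes, double usableFraction) {
        return estimatedBytes <= (long) (maxHeapBytes * usableFraction);
    }

    public static void main(String[] args) {
        long records = 10_000_000L;   // record count mentioned in the message above
        long avgChars = 10;           // hypothetical average record length
        long estimate = estimateBytes(records, avgChars);
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Estimated set size (bytes): " + estimate);
        System.out.println("Max heap of this JVM (bytes): " + maxHeap);
        System.out.println("Fits in half the heap: " + fitsInHeap(estimate, maxHeap, 0.5));
    }
}
```

If the estimate is well above the task heap, raising the heap or switching to a more compact structure is more likely to help than changing io.sort.mb.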
From: yingnan.ma
Sent: 2012-11-15 11:48:04
To: user
Cc:
Subject: Re: Re: distributed cache
Thank you so much! Both the replicated join and the UDF that uses the
distributed cache are useful for me. I have already done it. Thank you again.
2012-11-15
yingnan.ma
From: Prashant Kommireddi
Sent: 2012-11-15 03:52:09
To: user@pig.apache.org
Cc:
Subject: Re: distributed cache
If it's for purposes other than a join, you could write a UDF that uses the
distributed cache. See the section "Loading the Distributed Cache":
http://ofps.oreilly.com/titles/9781449302641/writing_udfs.html
On Wed, Nov 14, 2012 at 11:44 AM, Ruslan Al-Fakikh <me...@gmail.com> wrote:
> Maybe this is what you are looking for:
> http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html
> See "Replicated join".
>
> On Tue, Nov 13, 2012 at 11:46 AM, yingnan.ma <yi...@ipinyou.com> wrote:
>
> > Hi,
> >
> > In plain Hadoop, I used the distributed cache through the "setup" method
> > and a "static" field to store a HashSet in memory.
> >
> > Now I am trying to use the distributed cache in Pig, but I don't know how
> > to store a HashSet in memory; I can only cache the file.
> >
> > Any advice would be appreciated. Thank you so much!
> >
> > Best Regards
> >
> > Malone
> >
> > 2012-11-13
Re: Re: distributed cache
Posted by "yingnan.ma" <yi...@ipinyou.com>.
Thank you so much! Both the replicated join and the UDF that uses the
distributed cache are useful for me. I have already done it. Thank you again.
2012-11-15
yingnan.ma
Re: distributed cache
Posted by Prashant Kommireddi <pr...@gmail.com>.
If it's for purposes other than a join, you could write a UDF that uses the
distributed cache. See the section "Loading the Distributed Cache":
http://ofps.oreilly.com/titles/9781449302641/writing_udfs.html
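As a rough illustration of the UDF approach, here is a plain-Java sketch of the lazy-loading pattern: the HashSet is built once, on first use, from a local copy of the cached file and is reused for every later call. It deliberately leaves out the Pig classes so that it runs on its own. In a real UDF this logic would live in a class extending org.apache.pig.EvalFunc, with the file shipped to the tasks through the distributed cache (I believe via the getCacheFiles() hook, but check the section linked above). The class name CachedLookup and the one-key-per-line file layout are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper showing the "load once, then look up" pattern. */
final class CachedLookup {
    private final Path localFile;  // assumed: local path of the cached file
    private Set<String> keys;      // built on first use, then reused

    CachedLookup(String localFileName) {
        this.localFile = Paths.get(localFileName);
    }

    /** Returns true if the key appears as a line in the cached file. */
    boolean contains(String key) throws IOException {
        if (keys == null) {
            keys = load(localFile);  // happens only on the first call
        }
        return keys.contains(key);
    }

    /** Reads one key per line into a HashSet, skipping blank lines. */
    static Set<String> load(Path file) throws IOException {
        Set<String> result = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file that the distributed cache would place on the task node.
        Path tmp = Files.createTempFile("lookup", ".txt");
        Files.write(tmp, Arrays.asList("alice", "bob"), StandardCharsets.UTF_8);
        CachedLookup lookup = new CachedLookup(tmp.toString());
        System.out.println(lookup.contains("alice"));
        System.out.println(lookup.contains("carol"));
        Files.delete(tmp);
    }
}
```

Keeping the set in an instance field plays the same role as the "static" field in the original MapReduce setup: the file is parsed once per task rather than once per record.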
Re: distributed cache
Posted by Ruslan Al-Fakikh <me...@gmail.com>.
Maybe this is what you are looking for:
http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html
See the "Replicated join" section.
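For context, Pig requests a replicated join with USING 'replicated' on the JOIN statement, listing the small relation last; that small relation is loaded into memory on each map task. The Java sketch below only illustrates the idea (an in-memory hash join of a large input against a small one) and is not Pig's implementation. The class name, the two-column row layout, and the sample data are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch of a map-side (replicated) inner join. */
final class ReplicatedJoinSketch {

    /** Joins {key, value} rows of the large input with {key, value} rows of the small one. */
    static List<String[]> join(List<String[]> large, List<String[]> small) {
        // Build phase: index the small relation by its join key (column 0).
        Map<String, List<String>> index = new HashMap<>();
        for (String[] row : small) {
            List<String> values = index.get(row[0]);
            if (values == null) {
                values = new ArrayList<>();
                index.put(row[0], values);
            }
            values.add(row[1]);
        }
        // Probe phase: stream the large relation and emit one row per match.
        List<String[]> joined = new ArrayList<>();
        for (String[] row : large) {
            List<String> matches = index.get(row[0]);
            if (matches == null) {
                continue;  // inner join: rows without a match are dropped
            }
            for (String match : matches) {
                joined.add(new String[] {row[0], row[1], match});
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String[]> large = Arrays.asList(
                new String[] {"u1", "click"},
                new String[] {"u2", "view"},
                new String[] {"u1", "buy"});
        List<String[]> small = Arrays.asList(
                new String[] {"u1", "Alice"},
                new String[] {"u3", "Carol"});
        for (String[] row : join(large, small)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Because only the small side is held in memory, this approach works as long as that side fits in the task heap.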