Posted to user@pig.apache.org by Charles Gonçalves <ch...@gmail.com> on 2011/02/19 22:12:55 UTC

Cases of Work using pig on Industry to cite on my MSc

Guys,

I'm working on my MSc, using Pig/Hadoop to process logs.
I'm basically using it to do some characterizations in a traffic analysis
for some of the largest media groups in Brazil.
One of my dissertation chapters will cover case studies where that
environment (Pig/Hadoop) is needed because of the difficulty of handling
large amounts of data.

I wonder if someone could help me by pointing to work (academic, published,
technical reports or whatever) or even by talking (privately or not) about
their own work and how Pig/Hadoop helped with it.

I will gladly put the results of that chapter on the Pig wiki later!

Thanks in advance!

-- 
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.:  55 31 34741485
Lab.: 55 31 34095840

Re: Cases of Work using pig on Industry to cite on my MSc

Posted by Charles Gonçalves <ch...@gmail.com>.
OK guys, thank you all for your responses. I will start looking into it
and will keep you updated on this.


On Sun, Feb 20, 2011 at 2:14 AM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Twitter uses Pig for analyzing log data. The use cases are wide-ranging,
> from performing statistical analysis on the results of feature A/B tests, to
> examining usage patterns on the website and the wider platform, to building
> background models for trending topics.
> You can look online for slide decks from me (should be on the Yahoo dev blog
> Alan linked to) and Kevin Weil; those should have some additional details.
>
> D

Re: Cases of Work using pig on Industry to cite on my MSc

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Twitter uses Pig for analyzing log data. The use cases are wide-ranging,
from performing statistical analysis on the results of feature A/B tests, to
examining usage patterns on the website and the wider platform, to building
background models for trending topics.
You can look online for slide decks from me (should be on the Yahoo dev blog
Alan linked to) and Kevin Weil; those should have some additional details.

D

On Sat, Feb 19, 2011 at 7:28 PM, Alan Gates <ga...@yahoo-inc.com> wrote:

> There have been talks given at the Bay Area HUGs about how people use Pig.
>  I know for example Yahoo Mail did one on how it uses Pig for spam
> detection.  Presentations for those talks are posted to Yahoo's Hadoop blog:
>  http://developer.yahoo.com/blogs/hadoop/
>
> Alan.

Re: Cases of Work using pig on Industry to cite on my MSc

Posted by Alan Gates <ga...@yahoo-inc.com>.
There have been talks given at the Bay Area HUGs about how people use  
Pig.  I know for example Yahoo Mail did one on how it uses Pig for  
spam detection.  Presentations for those talks are posted to Yahoo's  
Hadoop blog:  http://developer.yahoo.com/blogs/hadoop/

Alan.


Re: Cases of Work using pig on Industry to cite on my MSc

Posted by Russell Jurney <ru...@gmail.com>.
Twitter uses Pig, but I'll let Dmitriy tell you about that.  Basically
everyone is using Pig.  Perl was the duct tape of the internet.  Pig/Hadoop
are the duct tape of 'big data.'

Russ

On Sat, Feb 19, 2011 at 3:26 PM, Charles Gonçalves <ch...@gmail.com> wrote:

> Thanks, Russell. I will search around and read about it.
>
> I really like your LinkedIn summary about a "career-long obsession with
> bringing analytics to the people to enrich their lives". Cool!

Re: Cases of Work using pig on Industry to cite on my MSc

Posted by Charles Gonçalves <ch...@gmail.com>.
Thanks, Russell. I will search around and read about it.

I really like your LinkedIn summary about a "career-long obsession with
bringing analytics to the people to enrich their lives". Cool!

On Sat, Feb 19, 2011 at 8:16 PM, Russell Jurney <ru...@gmail.com> wrote:

> LinkedIn uses Pig for all its data products.  If you google around, you can
> find talks about this.
>
> I wrote this, but unfortunately I never became much of a Pig contributor :(
>  http://blog.linkedin.com/2010/07/01/linkedin-apache-pig/  We presented
> these slides at a contributor's meeting:
> http://www.slideshare.net/rjurney/azkaban-pig-5057793  Chris Riccomini
> presented Pig at LinkedIn here:
> http://www.slideshare.net/hadoopusergroup/pig-at-linkedin

Re: Cases of Work using pig on Industry to cite on my MSc

Posted by Russell Jurney <ru...@gmail.com>.
LinkedIn uses Pig for all its data products.  If you google around, you can
find talks about this.

I wrote this, but unfortunately I never became much of a Pig contributor :(
http://blog.linkedin.com/2010/07/01/linkedin-apache-pig/

We presented these slides at a contributor's meeting:
http://www.slideshare.net/rjurney/azkaban-pig-5057793

Chris Riccomini presented Pig at LinkedIn here:
http://www.slideshare.net/hadoopusergroup/pig-at-linkedin
