Posted to user@pig.apache.org by Matthew Hayes <mh...@linkedin.com> on 2011/09/27 19:19:34 UTC
DataFu library
Hi Pig users,
I wanted to share with you all that we recently open-sourced a library we have been developing at LinkedIn. In it we have collected many of the useful UDFs we have developed for products such as "People You May Know" and "Skills". There are UDFs for median, quantiles, set operations, bag operations, PageRank, etc. All the UDFs are pretty well documented and unit tested (we also track code coverage). It would be great to get people's feedback on it. Also, if anyone would like to contribute, please let us know :)
Project page:
http://sna-projects.com/datafu/
Github:
https://github.com/linkedin/datafu
Thanks,
Matt Hayes
Re: DataFu library
Posted by Norbert Burger <no...@gmail.com>.
Hi Matt and the rest of the LinkedIn team:
Thanks for open-sourcing this project!
Norbert
On Tue, Sep 27, 2011 at 1:35 PM, Lakshminarayana Motamarri <narayana.gupta123@gmail.com> wrote:
> Thank you for sharing the UDFs...
Re: DataFu library
Posted by Lakshminarayana Motamarri <na...@gmail.com>.
Thank you for sharing the UDFs...