Posted to dev@pig.apache.org by Olga Natkovich <ol...@yahoo-inc.com> on 2009/06/26 02:17:15 UTC

Pig 0.3.0 is released!

The Pig Team is happy to announce the Pig 0.3.0 release!
 
Pig is a Hadoop subproject that provides a high-level data-flow language
and an execution framework for parallel computation on a Hadoop cluster.
More details about Pig can be found at http://hadoop.apache.org/pig/.
 
The highlight of this release is the multiquery performance optimization,
which allows computation to be shared across multiple queries within the
same Pig script. The details of the release can be found at
http://hadoop.apache.org/pig/releases.html
 
Olga

RE: Pig 0.3.0 is released!

Posted by Olga Natkovich <ol...@yahoo-inc.com>.

In our tests we see speedups ranging from 20% to about 8x, depending on the query.

You can see the numbers at the bottom of 

http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

Olga

> -----Original Message-----
> From: Alan Gates [mailto:gates@yahoo-inc.com] 
> Sent: Friday, June 26, 2009 12:55 PM
> To: pig-user@hadoop.apache.org
> Subject: Re: Pig 0.3.0 is released!
> 
> On the PigMix page (http://wiki.apache.org/pig/PigMix) the 
> numbers published for 5/27 are from the code base that became 
> 0.3.0 (I've updated the page to say that). As soon as I 
> have some time I'll rerun those tests on the released 0.3 
> code base and publish them.
> 
> Alan.

Re: Pig 0.3.0 is released!

Posted by Alan Gates <ga...@yahoo-inc.com>.

On the PigMix page (http://wiki.apache.org/pig/PigMix) the numbers  
published for 5/27 are from the code base that became 0.3.0 (I've
updated the page to say that). As soon as I have some time I'll rerun
those tests on the released 0.3 code base and publish them.

Alan.

On Jun 26, 2009, at 12:32 PM, Dmitriy Ryaboy wrote:

> George --
> It will be vastly faster if you have scripts that can share some of
> the computation.
> Some of the internals have been improved, as well. The PigMix page
> hasn't been updated yet, but perhaps you could take a look at the
> information on http://wiki.apache.org/pig/PigMix and run some tests on
> your cluster?  I, for one, would be interested in hearing your
> results.  You might have to adjust some of the scripts in PIG-200 to
> match your environment -- if you feel like making them more general,
> that would be a welcome contribution.
>
> -Dmitriy

Re: Pig 0.3.0 is released!

Posted by Dmitriy Ryaboy <dv...@gmail.com>.

George --
It will be vastly faster if you have scripts that can share some of
the computation.
Some of the internals have been improved, as well. The PigMix page
hasn't been updated yet, but perhaps you could take a look at the
information on http://wiki.apache.org/pig/PigMix and run some tests on
your cluster?  I, for one, would be interested in hearing your
results.  You might have to adjust some of the scripts in PIG-200 to
match your environment -- if you feel like making them more general,
that would be a welcome contribution.

-Dmitriy

On Fri, Jun 26, 2009 at 12:12 PM, George Pang<p0...@gmail.com> wrote:
> Hi Pig Team,
> Is Pig 0.3.0 faster than Pig 0.2.0?
>
> Thanks,
>
> George

Re: Pig 0.3.0 is released!

Posted by George Pang <p0...@gmail.com>.

Hi Pig Team,
Is Pig 0.3.0 faster than Pig 0.2.0?

Thanks,

George

2009/6/25 Olga Natkovich <ol...@yahoo-inc.com>

> The Pig Team is happy to announce the Pig 0.3.0 release!
>
> Pig is a Hadoop subproject that provides a high-level data-flow language
> and an execution framework for parallel computation on a Hadoop cluster.
> More details about Pig can be found at http://hadoop.apache.org/pig/.
>
> The highlight of this release is the multiquery performance optimization,
> which allows computation to be shared across multiple queries within the
> same Pig script. The details of the release can be found at
> http://hadoop.apache.org/pig/releases.html
>
> Olga
>