Posted to user@pig.apache.org by Jonathan Coveney <jc...@gmail.com> on 2011/10/22 00:37:04 UTC

Large pig script takes forever to parse?

I'm running Pig 0.9. I have a script that starts running quickly under
version 0.8, but under version 0.9 it takes 10 minutes to parse (I ran with
-debug ALL, and it builds a big old AST). Is there anything that could cause
this difference? It is a very long script (almost 2k lines!), but I'm
wondering why the two versions differ so much.

I'm going to try to replicate this with a script I can share, and then run
it against trunk. Just curious if anyone has seen this before.

Re: Large pig script takes forever to parse?

Posted by Jonathan Coveney <jc...@gmail.com>.
I'm going to do some work to try and get a version that can be shared.
Something odd has to be going on.

2011/10/25 Daniel Dai <da...@hortonworks.com>

> I just tested a > 4k script. Pig 0.8.1 takes 3m59.592s and Pig 0.9.1 takes
> 0m56.553s to compile (use -c option in command line). Seems Pig 0.9 compiles
> much faster in large queries. I also did some tests on small queries before,
> Pig 0.9 is about 15% slower than Pig 0.8. The slow down is less significant
> since compilation time only takes a fraction of total runtime for those
> queries. If there are some cases Pig 0.9 lag behind a lot, please share with
> me.
>
> Thanks,
> Daniel
>

Re: Large pig script takes forever to parse?

Posted by Daniel Dai <da...@hortonworks.com>.
I just tested a > 4k script. Pig 0.8.1 takes 3m59.592s and Pig 0.9.1 takes
0m56.553s to compile (using the -c option on the command line). It seems Pig
0.9 compiles large queries much faster. I also ran some tests on small
queries earlier; there, Pig 0.9 is about 15% slower than Pig 0.8. That
slowdown is less significant, since compilation time is only a fraction of
total runtime for those queries. If there are cases where Pig 0.9 lags far
behind, please share them with me.

Thanks,
Daniel
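
For anyone who wants to repeat this kind of comparison with a script that
can be shared, here is a rough Python sketch. It generates a synthetic Pig
Latin script and times a `-c` check of it. The helper names
(make_pig_script, time_syntax_check), the relation names, and the 'input'
and 'output' paths are made up for illustration. A plain chain of FILTER
statements is only an assumption about what a large script looks like, and
it may well not trigger the slow parse described at the top of this thread.

```python
import subprocess
import time


def make_pig_script(num_relations):
    """Build a synthetic Pig Latin script: one LOAD, a chain of FILTERs, one STORE.

    The relation names and the 'input'/'output' paths are placeholders.
    """
    lines = ["r0 = LOAD 'input' AS (id:int, val:int);"]
    for i in range(1, num_relations + 1):
        # Each relation filters the previous one, so the plan grows with the script.
        lines.append(f"r{i} = FILTER r{i - 1} BY val > {i};")
    lines.append(f"STORE r{num_relations} INTO 'output';")
    return "\n".join(lines) + "\n"


def time_syntax_check(pig_cmd, script_path):
    """Run `<pig_cmd> -c <script_path>` and return (wall-clock seconds, exit code).

    pig_cmd is the path to a Pig launcher; passing a different launcher per
    installed version is an assumption about how the versions are laid out.
    """
    start = time.perf_counter()
    result = subprocess.run(
        [pig_cmd, "-c", script_path], capture_output=True, text=True
    )
    return time.perf_counter() - start, result.returncode
```

To use it, write make_pig_script(2000) to a file and call
time_syntax_check once per Pig installation with that file's path. The
elapsed time includes JVM startup, so it only makes sense as a comparison
between versions on the same machine.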

On Tue, Oct 25, 2011 at 9:20 AM, Thejas Nair <th...@hortonworks.com> wrote:

> This is not expected. Daniel had some experiments done with large queries
> (> 2k lines) and the parsing was actually faster with them.
> There was some small slow down for smaller queries.
> If you can replicate with a script that you can share, that would be great.
>
> -Thejas

Re: Large pig script takes forever to parse?

Posted by Thejas Nair <th...@hortonworks.com>.
This is not expected. Daniel ran some experiments with large queries
(> 2k lines), and parsing was actually faster with them.
There was a small slowdown for smaller queries.
If you can replicate this with a script that you can share, that would be
great.

-Thejas

