Posted to user@pig.apache.org by Tamir Kamara <ta...@gmail.com> on 2009/04/01 10:14:08 UTC

Order & Limit

Hi,

I tried to use LIMIT right after ORDER to get the top x lines in the file:
a = LOAD 'file' AS (domain: chararray, score: double);
b = ORDER a BY score DESC;
c = LIMIT b 2500;
STORE c into 'file-top2500';

But this doesn't produce what I intended (I seem to get unordered domains).
I saw that the reference manual has a similar example of using ORDER BY to
control the result of a LIMIT command, but it doesn't seem to work here.

Can you help?

thanks,
tamir

Re: Order & Limit

Posted by Tamir Kamara <ta...@gmail.com>.
Hi,

I'm using Hadoop 0.18.3 with Pig trunk as of March 29. Map output
compression is turned on in Hadoop, using the native LZO library.
The Pig script is launched from a desktop machine that is not in the cluster.

Tamir.


On Wed, Apr 1, 2009 at 5:59 PM, zjffdu <zj...@gmail.com> wrote:

> Can you describe your environment in detail? I ran your example, and it
> works on my machine.

RE: Order & Limit

Posted by zjffdu <zj...@gmail.com>.
Can you describe your environment in detail? I ran your example, and it
works on my machine.




Re: Order & Limit

Posted by Thejas Nair <te...@yahoo-inc.com>.
Did you mean to do "b = ORDER a BY domain DESC;"?
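
If the goal is the top 2500 rows by score with the domains in a predictable
order among equal scores, one possible sketch is to add domain as a secondary
sort key. This is only a guess at the intent and is not confirmed anywhere in
the thread; the aliases and paths are copied from the original script, and the
choice of domain ASC as the tie-breaker is an assumption.

```pig
-- Sketch only: same input and schema as the original script.
a = LOAD 'file' AS (domain: chararray, score: double);
-- Primary key: score, highest first.
-- Secondary key (assumed): domain, so rows with equal scores come out in a stable order.
b = ORDER a BY score DESC, domain ASC;
-- LIMIT directly after ORDER is meant to keep the first 2500 rows of the sorted relation.
c = LIMIT b 2500;
STORE c INTO 'file-top2500';
```

If the output still looks unordered with this form, the sort key is probably
not the cause, and the environment details are the more likely lead.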

