Posted to user@pig.apache.org by elein <el...@varlena.com> on 2010/07/28 23:45:38 UTC
UNION -- Ordered
I've got
A = FOREACH ...
B = FOREACH ...
C = FOREACH ...
...
X = UNION A, B, C,...
Each of A, B, C, ... is a single tuple. I want X ordered
by the order specified in the UNION. The data in A, B, C, ... is not
necessarily in explicit sort order, so ORDER X BY field does not work. I've tried
unioning just two pieces, then unioning that result with another piece, and so on.
That does not work either.
Does anyone have any ideas how to do this?
elein
elein@varlena.com
Re: UNION -- Ordered
Posted by elein <el...@varlena.com>.
Yes, thank you. I was trying to avoid adding a sort column.
On Jul 28, 2010, at 6:05 PM, Thejas M Nair wrote:
> As you observed, UNION does not guarantee ordering. You will need to project an additional column indicating the order you want, so that you can do an ORDER BY on it.
>
> -Thejas
elein
elein@varlena.com
Re: parallelism level
Posted by Thejas M Nair <te...@yahoo-inc.com>.
Please see http://hadoop.apache.org/pig/docs/r0.7.0/piglatin_ref2.html .
You can use 'set default_parallel 10' to make the query use 10 reducers for
all MR jobs, or specify 'parallel x' in a Pig statement to make Pig use
x reducers for that operation (for operations such as group, order-by,
and join, which usually result in a separate MR job).
-Thejas
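
For illustration, the two options described above might look like the following in a script. This is only a sketch: the relation names, file paths, and schema are hypothetical, and the availability of default_parallel may depend on the Pig version in use.

```
-- Script-wide default: reduce-side jobs use 10 reducers unless overridden.
set default_parallel 10;

-- Hypothetical input and schema, for illustration only.
logs = LOAD 'logs.txt' AS (user:chararray, bytes:long);

-- Per-operator override: this GROUP asks for 20 reducers.
by_user = GROUP logs BY user PARALLEL 20;
totals  = FOREACH by_user GENERATE group AS user, SUM(logs.bytes) AS total;

-- No PARALLEL clause here, so the ORDER falls back to default_parallel (10).
sorted = ORDER totals BY total DESC;
STORE sorted INTO 'out';
```

The PARALLEL clause only affects operators that trigger a reduce phase; the number of map tasks is determined by the input.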
On 7/28/10 7:00 PM, "Gang Luo" <lg...@yahoo.com.cn> wrote:
> Hi all,
> By default the parallelism (number of reducers) of a Pig query is 1. How do I
> change this value? If I set the value to 10, does that mean all the MR jobs for this
> query will run with 10 reducers?
>
>
> Thanks,
> -Gang
parallelism level
Posted by Gang Luo <lg...@yahoo.com.cn>.
Hi all,
By default the parallelism (number of reducers) of a Pig query is 1. How do I change
this value? If I set the value to 10, does that mean all the MR jobs for this
query will run with 10 reducers?
Thanks,
-Gang
Re: UNION -- Ordered
Posted by Thejas M Nair <te...@yahoo-inc.com>.
As you observed, UNION does not guarantee ordering. You will need to project an additional column indicating the order you want, so that you can do an ORDER BY on it.
-Thejas
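
A minimal sketch of the extra-column approach described above, assuming three single-column inputs. The LOAD statements, file names, and the field names pos and val are hypothetical stand-ins for the elided FOREACH statements in the question.

```
-- Hypothetical relations standing in for the elided FOREACH statements.
A = LOAD 'a.txt' AS (val:chararray);
B = LOAD 'b.txt' AS (val:chararray);
C = LOAD 'c.txt' AS (val:chararray);

-- Tag each relation with its intended position in the final output.
A1 = FOREACH A GENERATE 1 AS pos, val;
B1 = FOREACH B GENERATE 2 AS pos, val;
C1 = FOREACH C GENERATE 3 AS pos, val;

-- UNION itself gives no ordering guarantee; the ORDER on pos restores it.
X  = UNION A1, B1, C1;
Xs = ORDER X BY pos;

DUMP Xs;
```

The pos column is left in the output on purpose: Pig only guarantees the sorted order when the ordered relation is dumped or stored directly, so projecting the column away in a later FOREACH could lose that guarantee.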
On 7/28/10 2:45 PM, "elein" <el...@varlena.com> wrote:
> I've got
> A = FOREACH ...
> B = FOREACH ...
> C = FOREACH ...
> ...
> X = UNION A, B, C,...
> Each of the A, B, C data is a single tuple. I want X ordered
> by the order specified in the UNION. The data in A, B, C, ... is not
> necessarily in explicit sort order so ORDER X by field does not work. I've tried breaking
> the union into only unioning two pieces then that union plus another piece, etc.
> That does not work either.
> Anyone have any ideas how to do this
> elein
> elein@varlena.com