Posted to user@pig.apache.org by Patcharee Thongtra <Pa...@uni.no> on 2014/06/05 10:36:12 UTC

extract tuple from bag in an order

Hi,

I have the following data

(2009-09-09,2,1,{(70)},{(80)},{(90)})
(2010-10-10,2,12,{(71),(75)},{(81),(85)},{(91),(95)})
(2012-12-12,2,9,{(76),(77),(78)},{(86),(87),(88)},{(96),(97),(98)})

which is in the format

{date: chararray, zone: int, z: int, uTmp: {(varvalue: int)}, vTmp: {(varvalue: int)}, thTmp: {(varvalue: int)}}

How can I get:

(2009-09-09,2,1,70,80,90)
(2010-10-10,2,12,71,81,91)
(2010-10-10,2,12,75,85,95)
(2012-12-12,2,9,76,86,96)
(2012-12-12,2,9,77,87,97)
(2012-12-12,2,9,78,88,98)

Any suggestion is appreciated.

Patcharee





Re: extract tuple from bag in an order

Posted by Mehmet Tepedelenlioglu <me...@yahoo.com.INVALID>.
You could separate the inner bags, flatten, rank, and join them, but that would be ugly and inefficient. It is better to write a UDF that does what Python's zip function does.
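
As a rough illustration of the UDF idea, here is a minimal sketch written as a Jython UDF. The file name zip_bags.py, the function name zip_bags, and the output field names u, v, th are all made up for this example. It assumes Pig hands each bag to the function as a list of one-field tuples, and that the bags iterate in the order you want to pair them; Pig does not guarantee bag order in general, so treat that as an assumption to verify on your data.

```python
# zip_bags.py: hypothetical file name, not from the original thread.

# When a script is registered with "USING jython", Pig is expected to supply
# the outputSchema decorator. This fallback is only an assumption-friendly
# shim so the file can also be run and tested with a plain Python interpreter.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("zipped:bag{t:tuple(u:int,v:int,th:int)}")
def zip_bags(u_bag, v_bag, th_bag):
    """Pair up the i-th tuples of three bags, like Python's zip.

    Each bag is assumed to arrive as a list of one-field tuples,
    e.g. [(71,), (75,)]. The result is a list of three-field tuples.
    """
    # A null bag yields an empty result instead of raising an error.
    if u_bag is None or v_bag is None or th_bag is None:
        return []
    # zip stops at the shortest bag, so unmatched trailing values are dropped.
    return [(u[0], v[0], th[0]) for u, v, th in zip(u_bag, v_bag, th_bag)]
```

On the Pig side it could be used roughly like this, where A is a placeholder for the relation holding the data above:

REGISTER 'zip_bags.py' USING jython AS myudfs;
B = FOREACH A GENERATE date, zone, z, FLATTEN(myudfs.zip_bags(uTmp, vTmp, thTmp));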
