Posted to user@pig.apache.org by Pete Warden <pe...@petewarden.com> on 2011/10/15 07:53:53 UTC
Converting an inner bag to an outer bag/relation?
Newbie question - I have an inner bag of tuples that I'd like to convert
into an outer bag/relation, and I'm struggling to figure out how.
For example, if I have
({(1,2),(3,4),(5,6)})
({(7,8),(9,10)})
I'd like it to become
(1,2)
(3,4)
(5,6)
(7,8)
(9,10)
The motivation behind that is a Cassandra field that contains a packed,
variable-length data structure, a bit like a CSV string encoding multiple
rows of data.
I can convert the raw chararray into an inner bag of tuples, but I need to
'explode' it to work properly with it.
I'm open to "don't do that, here's why it's a dumb idea", but it feels like
I'm missing an operator that could be used to implement this. I have a
partially-working solution using streaming, but the presence of newlines in
the chararray makes that approach tough. Any advice much appreciated.
cheers,
Pete
Re: Converting an inner bag to an outer bag/relation?
Posted by Jeremy Hanna <je...@gmail.com>.
One of the reasons we wrote pygmalion was to make it easier to work with tabular data: it extracts values by specified column names (with FromCassandraBag). I'm not sure whether it fits your use case, and it doesn't work as easily with dynamic column names, but I thought I'd mention it.
https://github.com/jeromatron/pygmalion/
On Oct 15, 2011, at 12:58 AM, Pete Warden wrote:
> Never mind, it looks like the FLATTEN operator should do the trick. I'd only
> seen it with tuples, didn't realize it did what I needed with inner bags
> until I RTFM-ed again.
Re: Converting an inner bag to an outer bag/relation?
Posted by Pete Warden <pe...@petewarden.com>.
Never mind, it looks like the FLATTEN operator should do the trick. I'd only
seen it used with tuples, and didn't realize it did what I needed with inner
bags until I RTFM-ed again.
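For anyone finding this thread later, here is a minimal sketch of what that
might look like. The input path 'packed_rows', the relation names, and the
field names (rows, a, b) are made up for illustration, and it assumes each
record already holds a single inner bag of two-field integer tuples.

```pig
-- Hypothetical input: each record is one bag of (int, int) tuples.
-- The path 'packed_rows' and all field names are assumptions.
packed = LOAD 'packed_rows' AS (rows: bag{t: tuple(a: int, b: int)});

-- FLATTEN applied to a bag emits one output record per tuple in that bag,
-- so the inner tuples become top-level records of the new relation.
exploded = FOREACH packed GENERATE FLATTEN(rows) AS (a, b);

DUMP exploded;
```

With input shaped like the two bags in the original question, this should
produce one record per inner tuple, which is the "exploded" form asked for.
If the record carries other fields alongside the bag, listing them in the same
GENERATE repeats them on every row that comes out of the flattened bag.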