Posted to user@pig.apache.org by Raviv M-G <ra...@post.harvard.edu> on 2010/07/06 22:31:05 UTC
GROUP_CONCAT function
Hi all,
Is there a way to use the built-in functions of Pig (or has someone
already written a UDF) to create a similar result to SQL's
GROUP_CONCAT?
The idea is that I have a long list of book ISBN numbers and author names:
123 John Doe
123 Jane Doe
and I would like to be able to group by the ISBN number and then
concatenate them for export to the format:
123 John Doe; Jane Doe
Thanks,
Raviv
Re: GROUP_CONCAT function
Posted by hc busy <hc...@gmail.com>.
Yeah, you can definitely accomplish that in a UDF. It would take one
parameter, a bag, and perform string concatenation on the members of
the bag. The UDF would act like a reducer applied to a bag of outputs
from a mapper. (The mapper could do other things too, like putting quotes
around each name:
123 "John Doe", "Doe, John"
)
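A minimal sketch of that idea, written in plain Python rather than against the Pig API: it only models the logic (group by ISBN, then join each group's bag with a delimiter). The names group_by_key, group_concat, and quote are hypothetical, chosen for illustration. A real Pig UDF would receive the bag produced by GROUP, and since Pig bags are unordered, the order of names in the output should not be relied on unless the bag is sorted first.

```python
from collections import OrderedDict


def group_by_key(rows):
    """Group (key, value) pairs by key, keeping keys in first-seen order."""
    groups = OrderedDict()
    for key, value in rows:
        groups.setdefault(key, []).append(value)
    return groups


def group_concat(bag, delimiter="; "):
    """Join the members of one bag into a single delimited string."""
    return delimiter.join(str(member) for member in bag)


def quote(name):
    """Optional 'mapper' step: wrap a name in double quotes."""
    return '"' + name + '"'


# Sample input taken from the original question.
rows = [("123", "John Doe"), ("123", "Jane Doe")]
grouped = group_by_key(rows)

# One output line per ISBN, names joined with "; ".
lines = [key + " " + group_concat(bag) for key, bag in grouped.items()]

# Variant with the quoting step applied before concatenation.
quoted = [
    key + " " + group_concat([quote(name) for name in bag], ", ")
    for key, bag in grouped.items()
]

for line in lines:
    print(line)
```

The delimiter is a parameter so the same function covers both the "; " format from the question and the quoted, comma-separated variant.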
Re: GROUP_CONCAT function
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
There isn't one that I am aware of, but it'd be trivial to write. Take a
look at the StringConcat builtin, which does something similar (but for
tuples, and without delimiters).
-D
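To illustrate the difference described above, here is a rough Python sketch, not Pig code and not the StringConcat source. It contrasts concatenating the fields of a tuple with no separator against a variant that takes a delimiter. Both function names are hypothetical.

```python
def concat_fields(fields):
    """Concatenate the fields of one tuple with no separator."""
    return "".join(str(field) for field in fields)


def concat_fields_with(fields, delimiter):
    """Same idea, but place a delimiter between consecutive fields."""
    return delimiter.join(str(field) for field in fields)


print(concat_fields(("John", "Doe")))
print(concat_fields_with(("John Doe", "Jane Doe"), "; "))
```

A GROUP_CONCAT-style UDF would apply the second form to the members of a bag instead of the fields of a tuple.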