You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Ike Walker <ik...@flite.com> on 2013/05/28 16:56:12 UTC
Combining 2 JSON objects in Hive
Hello,
I have two JSON objects stored as strings in a Hive table.
I would like to combine them into a single JSON object in Hive.
I'm running Hive 0.7, but am planning to upgrade soon so a solution that works in Hive 0.8 could be acceptable as well.
For example, here's the data now:
+-------------------------------------+-------------------------------+
| col1 | col2 |
+-------------------------------------+-------------------------------+
| {"age":"Over 30","gender":"female"} | {"counter":"0","version":"1" |
| {"age":"Over 30","gender":"female"} | {"counter":"4","version":"1" |
| {"age":"Over 30","gender":"male"} | {"counter":"10","version":"1" |
+-------------------------------------+-------------------------------+
And here's what I want to select:
{"age":"Over 30","gender":"female","counter":"0","version":"1"}
{"age":"Over 30","gender":"female","counter":"4","version":"1"}
{"age":"Over 30","gender":"male","counter":"10","version":"1"}
Thanks,
Ike Walker
Re: Combining 2 JSON objects in Hive
Posted by Stephen Sprague <sp...@gmail.com>.
I know of no way to do this purely natively within hive, however, don't let
that stop you. Enter the transform() function. Write your JSON merge
using python, perl, ruby or whatever floats your boat.
Don't let the gnarly syntax on this page scare you:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform
basically its just this
SELECT TRANSFORM(stuff) USING 'script' AS (thing1 INT, thing2 INT)
as quoted from the last example on that page.
On Tue, May 28, 2013 at 7:56 AM, Ike Walker <ik...@flite.com> wrote:
> Hello,
>
> I have two JSON objects stored as strings in a Hive table.
>
> I would like to combine them into a single JSON object in Hive.
>
> I'm running Hive 0.7, but am planning to upgrade soon so a solution that
> works in Hive 0.8 could be acceptable as well.
>
> For example, here's the data now:
> +-------------------------------------+-------------------------------+
> | col1 | col2 |
> +-------------------------------------+-------------------------------+
> | {"age":"Over 30","gender":"female"} | {"counter":"0","version":"1" |
> | {"age":"Over 30","gender":"female"} | {"counter":"4","version":"1" |
> | {"age":"Over 30","gender":"male"} | {"counter":"10","version":"1" |
> +-------------------------------------+-------------------------------+
>
> And here's what I want to select:
>
> {"age":"Over 30","gender":"female","counter":"0","version":"1"}
> {"age":"Over 30","gender":"female","counter":"4","version":"1"}
> {"age":"Over 30","gender":"male","counter":"10","version":"1"}
>
> Thanks,
> Ike Walker
>