Posted to user@hive.apache.org by hbaseuser hbaseuser <hb...@gmail.com> on 2014/05/22 00:48:13 UTC
JSON Data with "\n" character..
Hi,
I'm trying to process JSON data in Hive (0.12) with "\n" inside some of the
keys and values. The data is messed up and I have no control over changing
the input.
What is the best way to process this data in HDFS?
Thanks!
Re: JSON Data with "\n" character..
Posted by Andrew Mains <an...@kontagent.com>.
Hi,
We've run into this issue as well, and it is indeed annoying. As I
recall, the problem shows up not when the records are read off disk but
when Hive handles the records further down the line (I forget exactly
where).
I believe this issue is relevant:
https://issues.apache.org/jira/browse/HIVE-1898
If you can't preprocess the input to clean it up, the suggestion there of
using
regexp_replace(<my_column>, "\n", "")
might be useful.
Our (rather clunky) workaround was to do the replacement in our SerDe
(we were already using a custom SerDe, so this wasn't a huge burden for
us).
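Whether the replacement happens in a preprocessing pass or inside a SerDe, the idea is the same: walk each parsed record and strip the newline characters from its keys and values. Here is a minimal Python sketch of that idea, assuming one JSON object per input line with the newlines appearing as escaped "\n" sequences inside strings. The names strip_newlines and clean_line are made up for illustration, and this is not the actual SerDe code (a Hive SerDe would be written in Java).

```python
import json


def strip_newlines(value, replacement=""):
    """Recursively replace newline characters in strings, dict keys and values."""
    if isinstance(value, str):
        return value.replace("\n", replacement)
    if isinstance(value, dict):
        # JSON object keys are always strings, so they get cleaned too.
        return {
            strip_newlines(k, replacement): strip_newlines(v, replacement)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [strip_newlines(item, replacement) for item in value]
    # Numbers, booleans and null pass through unchanged.
    return value


def clean_line(line):
    """Parse one JSON record, strip newlines, and re-serialize it on one line."""
    record = json.loads(line)
    return json.dumps(strip_newlines(record))


if __name__ == "__main__":
    # Hypothetical sample record with escaped newlines in a key and in values.
    raw = '{"na\\nme": "first\\nsecond", "tags": ["x\\ny"], "n": 1}'
    print(clean_line(raw))
```

Passing a space as the replacement instead of the empty string keeps words from running together when the newline was acting as a separator.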
What does your CREATE TABLE statement look like?
Andrew