Posted to user@hive.apache.org by Tom Brown <to...@gmail.com> on 2012/08/08 17:04:06 UTC
Question about querying JSON data
I have a large amount of JSON data that was generated using
periods in the key names, e.g., {"category.field": "value"}. I know
that's not the best way to do JSON but for better or worse, it's the
data I have to deal with.
I have tried using get_json_object, but I am concerned that its JSON
path expressions interpret "." as a special character. I am also
concerned about the overhead of repeatedly parsing each record (each
record is about 2K, so not tiny, but not huge either).
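To illustrate the concern about ".", here is a minimal Python sketch. It
is not Hive's implementation of get_json_object; naive_path_lookup is a
hypothetical helper that treats every "." as a path separator, which is
the behavior I am worried about.

```python
import json

def naive_path_lookup(record, path):
    """Hypothetical resolver: treats every '.' in the path as a nesting separator."""
    node = record
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

record = json.loads('{"category.field": "value"}')

# Splitting on "." looks for record["category"]["field"], which does not exist.
dotted = naive_path_lookup(record, "category.field")

# A literal key lookup finds the value.
literal = record.get("category.field")
```

With a resolver like this, the dotted lookup misses the key while the
literal lookup finds it.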
I have tried using Hive-JSON-Serde but it seems to require that my
column names be named the same as my JSON field names.
I had heard that there was a serde somewhere that will allow me to
specify a JSON path to map to each specific field name, but other than
vague references on the mailing list, I haven't found any concrete
info about it.
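What I have in mind is roughly the following, sketched in Python rather
than as a real Hive serde. The column names, the mapping, and
extract_columns are all hypothetical; the point is only that each record
is parsed once and each column is mapped to a literal JSON key.

```python
import json

# Hypothetical mapping from a Hive-style column name to the literal JSON key.
COLUMN_TO_KEY = {
    "category_field": "category.field",
    "category_other": "category.other",
}

def extract_columns(line, mapping=COLUMN_TO_KEY):
    """Parse the record once, then pull out every mapped column.

    Keys missing from the record come back as None.
    """
    record = json.loads(line)
    return {column: record.get(key) for column, key in mapping.items()}

row = extract_columns('{"category.field": "value"}')
```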
I would prefer to use existing code, but I can write my own serde if I have to.
What do you recommend?
Thanks in advance!
--Tom