Posted to user@pig.apache.org by jamal sasha <ja...@gmail.com> on 2013/10/07 19:59:03 UTC

Parsing flexible JSON in Pig

Hi,
  I have a semi-structured JSON file, with one record per line. For example:
{"id":1,"name":"foo"}
{"id":1,"name":"foo","address":"foobar"}
{"id":1,"name":"foo","address":"foobar","phone":[123,133]}
{"id":2,"name":"foobar","address":"foobar"}

And so on.


What I want to do is read this file, select "id", and count the records that have an "address" field for each id.
If the "address" field is missing from a record, that record counts as 0.
So the output for the sample above would be:
id, count_address
1,2
2,1

Also, can I use a Python UDF to parse this JSON?
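
A minimal sketch of what such a Python UDF might look like is below. It is not a confirmed solution from this thread. The names parse_record and count_addresses are hypothetical. It assumes one JSON object per line and a runtime where the json module can be imported (older Jython releases bundled with Pig may not ship it). The outputSchema decorator is normally supplied by Pig's Jython runtime; the no-op fallback here is an assumption that only exists so the file also runs in plain Python. count_addresses is a local stand-in for what GROUP BY id followed by SUM would do on the Pig side.

```python
import json
from collections import defaultdict

# Pig's Jython runtime normally provides outputSchema. This no-op
# fallback is an assumption for local testing outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("rec:tuple(id:long, has_address:int)")
def parse_record(line):
    """Return (id, 1) if the record has an "address" key, else (id, 0).

    Returns None for lines that are not valid JSON objects with an "id".
    """
    try:
        rec = json.loads(line)
    except ValueError:
        return None  # malformed line
    if not isinstance(rec, dict) or "id" not in rec:
        return None
    return (rec["id"], 1 if "address" in rec else 0)


def count_addresses(lines):
    """Local stand-in for GROUP BY id + SUM(has_address) in Pig."""
    counts = defaultdict(int)
    for line in lines:
        parsed = parse_record(line)
        if parsed is None:
            continue  # skip unparseable records
        rec_id, has_address = parsed
        counts[rec_id] += has_address
    return dict(counts)


if __name__ == "__main__":
    sample = [
        '{"id":1,"name":"foo"}',
        '{"id":1,"name":"foo","address":"foobar"}',
        '{"id":1,"name":"foo","address":"foobar","phone":[123,133]}',
        '{"id":2,"name":"foobar","address":"foobar"}',
    ]
    print(count_addresses(sample))  # {1: 2, 2: 1}
```

In Pig, one possible way to use this would be to load each line as a single chararray (for example with TextLoader), register the script with REGISTER ... USING jython, apply parse_record to every line, then GROUP BY id and SUM the has_address column. An id whose records never contain "address" ends up with a sum of 0, which matches the requirement above.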