Posted to user@avro.apache.org by Saravanan Nagarajan <sa...@gmail.com> on 2014/03/14 09:30:13 UTC

Problem while Converting from JSON=>Avro=>JSON

Hi,

I successfully converted a JSON file to Avro format, and I was able to
view it as JSON again using the Avro tools.

Now I am trying to output only selected fields using a Java program. I am
able to select specific columns from a SIMPLE JSON file.

With a complex (nested) JSON file, however, I am not able to select columns.

For example:

Assume the Employee records contain a complex column with department details.
I need to generate JSON from the Avro file with a few columns from employee
and a few columns from department.

My program prints the selected columns from the employee table, but it is not
able to select from the department columns. I use GenericDatumReader to read
the Avro file.

Please let me know if you have any suggestions.

If you need the program, I can share it in a separate mail.

Thanks,
Saravanan

Re: Problem while Converting from JSON=>Avro=>JSON

Posted by Saravanan Nagarajan <sa...@gmail.com>.
Hi Doug,

Thank you very much for the useful information. I used this approach in
our project to get the selected fields from the .avro file, both with and
without a supplied schema.

Can I add this to the Avro tool?

Thanks,
Saravanan




On Fri, Mar 14, 2014 at 10:48 PM, Doug Cutting <cu...@apache.org> wrote:

> To generate a file with a subset of fields you can specify a 'reader'
> schema that contains only the desired fields.  For example, if you
> have a schema like:
>
> {"type":"record","name":"Event","fields":[
>   {"name":"id","type":"int"},
>   {"name":"url","type":"string"},
>   {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
>       {"name":"key","type":"int"},
>       {"name":"value","type":"string"}
>   ]}}}
> ]}
>
> And you only want the ids and property values, then you can specify
> the following when you create your GenericDatumReader:
>
> {"type":"record","name":"Event","fields":[
>   {"name":"id","type":"int"},
>   {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
>       {"name":"value","type":"string"}
>   ]}}}
> ]}
>
> Perhaps we should add a --schema parameter to the tojson command-line
> tool that does this?
>
> Doug

Re: Problem while Converting from JSON=>Avro=>JSON

Posted by Doug Cutting <cu...@apache.org>.
To generate a file with a subset of fields you can specify a 'reader'
schema that contains only the desired fields.  For example, if you
have a schema like:

{"type":"record","name":"Event","fields":[
  {"name":"id","type":"int"},
  {"name":"url","type":"string"},
  {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
      {"name":"key","type":"int"},
      {"name":"value","type":"string"}
  ]}}}
]}

And you only want the ids and property values, then you can specify
the following when you create your GenericDatumReader:

{"type":"record","name":"Event","fields":[
  {"name":"id","type":"int"},
  {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
      {"name":"value","type":"string"}
  ]}}}
]}

Perhaps we should add a --schema parameter to the tojson command-line
tool that does this?

Doug
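
For readers who want to try the reader-schema approach described above, here is a minimal sketch in Java. It assumes the Avro Java library is on the classpath. The class name SubsetReader, the sample values, and the in-memory container file are illustrative assumptions rather than anything from this thread; a real program would read an existing .avro file instead. The writer schema is taken from the file header, and only the fields named in the reader schema are materialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

// Hypothetical class name, used only for this sketch.
class SubsetReader {
    // Full (writer) schema: the Event record with id, url and props.
    static final String WRITER_JSON =
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"int\"},"
      + "{\"name\":\"url\",\"type\":\"string\"},"
      + "{\"name\":\"props\",\"type\":{\"type\":\"array\",\"items\":"
      + "{\"type\":\"record\",\"name\":\"Property\",\"fields\":["
      + "{\"name\":\"key\",\"type\":\"int\"},"
      + "{\"name\":\"value\",\"type\":\"string\"}]}}}]}";

    // Reader schema: only the ids and the property values.
    static final String READER_JSON =
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"int\"},"
      + "{\"name\":\"props\",\"type\":{\"type\":\"array\",\"items\":"
      + "{\"type\":\"record\",\"name\":\"Property\",\"fields\":["
      + "{\"name\":\"value\",\"type\":\"string\"}]}}}]}";

    // Writes one made-up Event into an in-memory Avro container file.
    static byte[] writeSample() {
        Schema writer = new Schema.Parser().parse(WRITER_JSON);
        Schema propSchema = writer.getField("props").schema().getElementType();

        GenericRecord prop = new GenericData.Record(propSchema);
        prop.put("key", 1);
        prop.put("value", "referrer");

        GenericRecord event = new GenericData.Record(writer);
        event.put("id", 42);
        event.put("url", "http://example.org/");
        event.put("props", Collections.singletonList(prop));

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataFileWriter<GenericRecord> fileWriter =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(writer))) {
            fileWriter.create(writer, out);
            fileWriter.append(event);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the first record, resolved against the reader schema.
    static GenericRecord readFirstProjected(byte[] avroFile) {
        // A separate parser is used because both schemas define the same names.
        Schema reader = new Schema.Parser().parse(READER_JSON);
        GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        // The expected (reader) schema; the writer schema comes from the file header.
        datumReader.setExpected(reader);
        try (DataFileStream<GenericRecord> in =
                 new DataFileStream<>(new ByteArrayInputStream(avroFile), datumReader)) {
            return in.next();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // GenericRecord.toString() renders the record as JSON text.
        System.out.println(readFirstProjected(writeSample()));
    }
}
```

The same idea should carry over to the Employee/department case from the original question: the nested record in the reader schema keeps the same record name as in the writer schema and lists only the wanted fields.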
