Posted to user@avro.apache.org by sourabh chaki <ch...@gmail.com> on 2013/03/20 12:11:14 UTC

How to convert .avpr/.avdl file to .avsc file.

Hi All,

In my application I am receiving events in Avro serialized format. The data
is serialized in Java using a schema defined in an .avdl file.

I have to parse those events in Hive. According to the web tutorial, Hive
understands the .avsc format:

https://cwiki.apache.org/Hive/avroserde-working-with-avro-from-hive.html

Is there any way to convert .avpr to .avsc?

Alternatively, can I use .avpr/.avdl directly in Hive? Please provide an example.

Thanks in advance.

Sourabh

Re: How to convert .avpr/.avdl file to .avsc file.

Posted by Martin Kleppmann <ma...@rapportive.com>.
Hi Sourabh,

As far as I know, there is no automatic conversion between .avpr/.avdl
and .avsc, because they are different things: avpr/avdl describes a
protocol, i.e. various messages that can be exchanged over an API,
which can include any number of schemas. avsc only contains a single
record schema (with possible nested schemas).

Your best bet is probably to use the IDL command-line tool to convert
.avdl to .avpr (http://avro.apache.org/docs/current/idl.html), then to
manually edit the .avpr file and extract the schema you want. It's
just JSON, so it's not too hard to edit. Find the schema for the event
type you have in hive, which should be of the form:

{"type": "record", "name": "MyEventType", "fields": [...]}

and discard the rest of the .avpr file. That record definition is your
.avsc file.
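
The manual extraction described above can also be scripted. The following is only a minimal sketch, not an official Avro tool: the function name extract_record_schema, the sample protocol, and the namespace com.example.events are hypothetical and chosen for illustration. It assumes the .avpr JSON keeps its named types in a top-level "types" array, and it copies the protocol's namespace onto the record so the extracted schema keeps the same full name. If the record refers to other named types defined elsewhere in the protocol, those definitions would still have to be inlined by hand.

```python
import json


def extract_record_schema(protocol: dict, record_name: str) -> dict:
    """Return the named record from a parsed .avpr protocol as a standalone schema."""
    for named_type in protocol.get("types", []):
        if named_type.get("type") == "record" and named_type.get("name") == record_name:
            schema = dict(named_type)  # shallow copy, leave the protocol untouched
            # Types without their own namespace inherit the protocol's namespace,
            # so carry it over to keep the record's full name unchanged.
            if "namespace" not in schema and "namespace" in protocol:
                schema["namespace"] = protocol["namespace"]
            return schema
    raise KeyError(f"record {record_name!r} not found in protocol")


# Hypothetical protocol, standing in for the contents of a real .avpr file.
SAMPLE_PROTOCOL = {
    "protocol": "EventProtocol",
    "namespace": "com.example.events",
    "types": [
        {
            "type": "record",
            "name": "MyEventType",
            "fields": [{"name": "id", "type": "long"}],
        }
    ],
    "messages": {},
}

schema = extract_record_schema(SAMPLE_PROTOCOL, "MyEventType")
print(json.dumps(schema, indent=2))
```

With real files, the protocol would be read with json.load from the .avpr file and the result written with json.dump to a file such as MyEventType.avsc.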

Martin

(dev@avro to BCC)

On 20 March 2013 11:11, sourabh chaki <ch...@gmail.com> wrote:
> Hi All,
>
> In my application I am receiving events in Avro serialized format. The data
> is serialized in Java using a schema defined in an .avdl file.
>
> I have to parse those events in Hive. According to the web tutorial, Hive
> understands the .avsc format:
>
> https://cwiki.apache.org/Hive/avroserde-working-with-avro-from-hive.html
>
> Is there any way to convert .avpr to .avsc?
>
> Alternatively, can I use .avpr/.avdl directly in Hive? Please provide an example.
>
> Thanks in advance.
>
> Sourabh

Re: How to convert .avpr/.avdl file to .avsc file.

Posted by Martin Kleppmann <ma...@rapportive.com>.
Sounds like you have a problem with Hive, not with Avro — please ask
on the Hive mailing list.

You can check whether a file in HDFS is a valid Avro data file by
copying it to your local disk (hadoop fs -copyToLocal ...) and
displaying it as JSON (java -jar avro-tools-$VERSION.jar tojson
file.avro). If that doesn't work, you'll have to adjust whatever
you're doing with Hive to load the data.
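
A quicker first check is possible before running avro-tools, sketched here under stated assumptions. Avro object container files begin with the four bytes 'O', 'b', 'j', 0x01, so a file that does not start with them cannot be an Avro data file. A plausible, though unconfirmed, cause of the "Not a data file" error is that the events were written as raw Avro binary without the container-file header. The function name looks_like_avro_data_file is hypothetical, and the check only inspects the header, not the rest of the file.

```python
AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def looks_like_avro_data_file(path: str) -> bool:
    """Return True if the local file starts with the Avro container magic bytes."""
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC
```

The function would be called on the local copy made with hadoop fs -copyToLocal, for example looks_like_avro_data_file("xyz.avro"). A True result does not prove the file is intact; the tojson command above remains the fuller test.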

Martin

On 20 March 2013 14:29, sourabh chaki <ch...@gmail.com> wrote:
> Hi All,
>
> In my application I am receiving Avro events, which I have to process in
> Hive. I have created a Hive table using the Avro schema, but I am not able to
> load the events into the Hive table (created from the same Avro schema).
>
> I am using: load data inpath '/user/test/xyz.avro' into table xyz;
>
> When I execute: select * from xyz;
> Failed with exception java.io.IOException:java.io.IOException: Not a data
> file.
>
> Please advise what I should do.
>
> Thanks in advance.
>
> Sourabh

Re: How to convert .avpr/.avdl file to .avsc file.

Posted by sourabh chaki <ch...@gmail.com>.
Hi All,

In my application I am receiving Avro events, which I have to process in
Hive. I have created a Hive table using the Avro schema, but I am not able to
load the events into the Hive table (created from the same Avro schema).

I am using: load data inpath '/user/test/xyz.avro' into table xyz;

When I execute: select * from xyz;
Failed with exception java.io.IOException:java.io.IOException: Not a data
file.

Please advise what I should do.

Thanks in advance.

Sourabh
