Posted to user@avro.apache.org by julianpeeters <ju...@gmail.com> on 2014/07/14 07:22:38 UTC

Error when trying to convert a local datafile to plain text with Avro Tools

Hi, 

I'm exploring the human-readable avro options in the avro-tools jar, namely
`tojson` and `totext`.

`tojson` works fine, but when I try `totext` with:

`$ java -jar avro-tools-1.7.6.jar totext twitter.avro twitter.txt`,

then twitter.txt is empty and I get this error:

    Jul 13, 2014 8:41:19 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
    WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Avro file is not generic text schema


What am I doing wrong?

Thanks for looking,
-Julian

PS (Looking into the source, it looks like this error is thrown when the
schema in the datafile is not equal to the string "\"bytes\"", but I have a
hard time understanding why the datafile's schema would ever be that.)





--
View this message in context: http://apache-avro.679487.n3.nabble.com/Error-when-trying-to-convert-a-local-datafile-to-plain-text-with-Avro-Tools-tp4030458.html
Sent from the Avro - Users mailing list archive at Nabble.com.

Re: Error when trying to convert a local datafile to plain text with Avro Tools

Posted by Doug Cutting <cu...@apache.org>.
On Sun, Jul 13, 2014 at 10:22 PM, julianpeeters <ju...@gmail.com> wrote:
> I'm exploring the human-readable avro options in the avro-tools jar, namely
> `tojson` and `totext`.

Have you tried 'tojson' instead?  That's human-readable and works with
any schema.  It also supports a --pretty option that writes complex
JSON structures in a multi-line, indented format.

Doug

Re: Error when trying to convert a local datafile to plain text with Avro Tools

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Have you looked at the Java source for the totext tool?
It compares the file's schema against a parsed constant as follows:

    if (!fileReader.getSchema().equals(new Schema.Parser().parse(TEXT_FILE_SCHEMA))) {

TEXT_FILE_SCHEMA is a constant defined as

    private static final String TEXT_FILE_SCHEMA = "\"bytes\"";

We can therefore conclude that your Avro data file has a schema which does
not match this, so the behaviour is correct.
Are you able to call fileReader.getSchema() to see what the schema looks
like?
hth
Lewis
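
One way to look at the schema without writing Java: every Avro object
container file stores its schema as JSON under the `avro.schema` key in the
file header. The sketch below, in Python with only the standard library,
reads that header and applies a rough equivalent of the check quoted above.
The function names (`read_long`, `read_bytes`, `read_writer_schema`,
`is_text_schema`) are made up for this illustration and are not part of any
Avro API, and the header layout follows my reading of the Avro
specification, so treat it as a sketch rather than a reference reader.

```python
import io
import json

# Magic bytes that open an Avro object container file.
MAGIC = b"Obj\x01"


def read_long(stream):
    """Decode one Avro long (base-128 varint, then zig-zag) from a stream."""
    shift = 0
    accum = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of data while reading a long")
        byte = chunk[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Read a length-prefixed byte string (Avro `bytes` or `string`)."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data while reading bytes")
    return data


def read_writer_schema(stream):
    """Return the schema stored in a container file header, as parsed JSON."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break  # a zero count ends the metadata map
        if count < 0:
            read_long(stream)  # negative count: a byte size follows, skip it
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"].decode("utf-8"))


def is_text_schema(schema):
    """Approximate totext's check: is the schema the primitive "bytes"?"""
    return schema == "bytes" or schema == {"type": "bytes"}
```

Assuming the file from the original question, calling
`read_writer_schema(open("twitter.avro", "rb"))` would return whatever
schema the file was written with; anything other than "bytes" makes totext
refuse the file. avro-tools also has a `getschema` command that prints the
schema of a data file.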





Re: Error when trying to convert a local datafile to plain text with Avro Tools

Posted by julianpeeters <ju...@gmail.com>.
Ah, I see now. Thanks very much for the clarifications.




Re: Error when trying to convert a local datafile to plain text with Avro Tools

Posted by Harsh J <ha...@cloudera.com>.
It's useful to have plaintext data in compressed Avro files in HDFS for
MR/etc. processing, since the container format allows splitting. The
'totext'/'fromtext' feature was originally added via AVRO-567.
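
To make the "bytes" schema less mysterious: as far as I understand the
tools, 'fromtext' stores each line of the input as one Avro `bytes` datum,
and 'totext' writes each datum back out followed by a newline. The Python
sketch below shows only that per-line encoding, which per the Avro
specification is a zig-zag varint length followed by the raw bytes. The
function names are hypothetical, and the container header, block framing,
sync markers and compression are all left out, so this does not produce a
real .avro file.

```python
import io


def encode_long(n):
    """Encode an int as an Avro long (zig-zag, then base-128 varint)."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(stream):
    """Decode one Avro long from a binary stream."""
    shift = 0
    accum = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of data while reading a long")
        byte = chunk[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return (accum >> 1) ^ -(accum & 1)
        shift += 7


def text_to_records(text):
    """Encode each line of `text` (a bytes object) as one `bytes` datum."""
    return b"".join(encode_long(len(line)) + line for line in text.splitlines())


def records_to_text(data, count):
    """Decode `count` `bytes` datums and join them back into lines."""
    stream = io.BytesIO(data)
    lines = [stream.read(decode_long(stream)) for _ in range(count)]
    return b"".join(line + b"\n" for line in lines)
```

A file written this way has no record structure at all, which is why
'totext' only accepts files whose schema is exactly "bytes".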

You may instead be looking for the avro (avrocat) tool? You can obtain
it by installing the Python 'avro' package (easy_install avro, or pip
install avro) and then running the 'avro' command. It allows
configurable forms of text transformation from regular Avro schema
files.




-- 
Harsh J