Posted to hcatalog-commits@incubator.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2011/06/20 05:17:50 UTC

[jira] [Updated] (HCATALOG-49) Support Avro Data File Format in HCatalog

     [ https://issues.apache.org/jira/browse/HCATALOG-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HCATALOG-49:
------------------------------

    Attachment: HCATALOG-49.patch

Here is an initial attempt to support Avro in HCatalog.

Some notes:

* For output, an Avro schema is computed for the HCatalog schema by the Avro output storage driver. The current patch does not allow you to specify a custom Avro schema - this would be a natural extension.
* Avro map keys must be strings, whereas they can be any type in HCatalog. The current implementation assumes that HCatalog maps have string keys, and fails if this is not true. It might be possible to relax this restriction in the future by doing type conversion.
* In HCatalog, values can be null, whereas this is not true for simple schemas in Avro. It would be possible to generate null unions in Avro, but this isn't done here. This could be a future enhancement.
* For the Avro input storage driver, the Avro schema in the Avro Data File is checked for compatibility with the HCatalog schema, and an exception is thrown if there's a mismatch.
* Byte arrays cannot be represented in HCatalog, so there is no way to read byte arrays from Avro files. (Pig has the same limitation.)
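
To make the output-side notes concrete, here is a rough sketch of how an Avro schema might be derived from a table schema. It is not the code in the patch (which is Java); the type mapping, the tuple-based representation of HCatalog types, and the names to_avro_type and to_avro_record are all assumptions made up for illustration. It rejects non-string map keys, as described above, and shows what the optional null unions could look like.

```python
import json

# Assumed mapping from HCatalog primitive type names to Avro primitives.
# Illustrative only; the patch may map types differently.
PRIMITIVES = {
    "boolean": "boolean",
    "int": "int",
    "bigint": "long",
    "float": "float",
    "double": "double",
    "string": "string",
}


def to_avro_type(hcat_type, nullable=False):
    """Convert a simplified HCatalog type description to an Avro type.

    Primitives are plain strings; complex types are tuples such as
    ("array", element_type) or ("map", key_type, value_type).
    This representation is hypothetical, not HCatalog's own API.
    """
    if isinstance(hcat_type, str):
        if hcat_type not in PRIMITIVES:
            raise ValueError("unsupported type: %s" % hcat_type)
        avro = PRIMITIVES[hcat_type]
    elif hcat_type[0] == "array":
        avro = {"type": "array", "items": to_avro_type(hcat_type[1], nullable)}
    elif hcat_type[0] == "map":
        _, key_type, value_type = hcat_type
        # Avro map keys are always strings, so anything else is rejected.
        if key_type != "string":
            raise ValueError("Avro map keys must be strings, got: %s" % key_type)
        avro = {"type": "map", "values": to_avro_type(value_type, nullable)}
    else:
        raise ValueError("unsupported type: %r" % (hcat_type,))
    # A union with "null" is how Avro expresses an optional value.
    return ["null", avro] if nullable else avro


def to_avro_record(name, columns, nullable=False):
    """Build an Avro record schema from (column_name, hcat_type) pairs."""
    return {
        "type": "record",
        "name": name,
        "fields": [
            {"name": col, "type": to_avro_type(col_type, nullable)}
            for col, col_type in columns
        ],
    }


if __name__ == "__main__":
    columns = [("id", "int"), ("tags", ("map", "string", "bigint"))]
    print(json.dumps(to_avro_record("Row", columns), indent=2))
```

Passing nullable=True wraps every type in a ["null", ...] union, which is one possible shape for the future enhancement mentioned above. A map declared as ("map", "int", "string") raises ValueError rather than being converted.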



> Support Avro Data File Format in HCatalog
> -----------------------------------------
>
>                 Key: HCATALOG-49
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-49
>             Project: HCatalog
>          Issue Type: New Feature
>            Reporter: Tom White
>         Attachments: HCATALOG-49.patch
>
>
> Add input and output drivers for Avro.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira