Posted to dev@pig.apache.org by "Russell Jurney (JIRA)" <ji...@apache.org> on 2012/12/31 21:48:12 UTC

[jira] [Created] (PIG-3111) ToAvro to convert any Pig record to an Avro bytearray

Russell Jurney created PIG-3111:
-----------------------------------

             Summary: ToAvro to convert any Pig record to an Avro bytearray
                 Key: PIG-3111
                 URL: https://issues.apache.org/jira/browse/PIG-3111
             Project: Pig
          Issue Type: New Feature
          Components: data, internal-udfs
    Affects Versions: 0.12
            Reporter: Russell Jurney
            Assignee: Russell Jurney
             Fix For: 0.12


I want to create a ToAvro() builtin that converts arbitrary Pig fields, including complex types (bags, tuples, maps), to Avro format as bytearrays.

This would enable storing Avro records in arbitrary data stores, for example HBaseAvroStorage in PIG-2889.
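
To make "Avro format as bytearrays" concrete, here is a minimal sketch of Avro's binary encoding for a flat record, written by hand with only the JDK. It is not the proposed ToAvro() implementation: the class name AvroSketch, its methods, and the record layout (name: string, age: long) are hypothetical, and a real builtin would presumably derive an Avro schema from the Pig schema and use the Avro library rather than hand-rolled encoding. It only shows the kind of bytes such a function would hand to a store function.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Pig or Avro.
final class AvroSketch {

    // Avro binary encoding writes int/long values as zigzag-encoded,
    // variable-length base-128 integers (low 7 bits first).
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);            // zigzag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // more bytes follow
            z >>>= 7;
        }
        out.write((int) z);                       // final byte, high bit clear
    }

    // A string is its UTF-8 byte length (as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // A record is its fields encoded in schema order, with no field names
    // or delimiters. Assumed schema: (name: string, age: long).
    static byte[] encodeRecord(String name, long age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeLong(out, age);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodeRecord("pig", 27L)) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "06 70 69 67 36"
    }
}
```

Because the encoded bytes carry no field names, whoever reads the value back out of the data store needs the schema it was written with; that is one reason schema handling would matter for a builtin like this.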

See PIG-2641 for ToJson.

This points to a greater need for customizable, pluggable serialization that plugs into storefuncs and handles serialization independently of storage. For example, we might do these operations:

a = load 'my_data' as (some_schema);
b = foreach a generate ToJson(*);
c = foreach a generate ToAvro(*);
store b into 'hbase://JsonValueTable' using HBaseStorage(...);
store c into 'hbase://AvroValueTable' using HBaseStorage(...);

I'll make a ticket for pluggable serialization separately.

