Posted to dev@pig.apache.org by "Ratandeep Ratti (JIRA)" <ji...@apache.org> on 2015/03/05 17:58:38 UTC

[jira] [Created] (PIG-4447) Pig Cannot handle nullable values (arrays and records) in avro records

Ratandeep Ratti created PIG-4447:
------------------------------------

             Summary: Pig Cannot handle nullable values (arrays and records) in avro records
                 Key: PIG-4447
                 URL: https://issues.apache.org/jira/browse/PIG-4447
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.13.0
            Reporter: Ratandeep Ratti
            Assignee: Ratandeep Ratti
             Fix For: 0.15.0


Here's an example of an Avro schema containing nullable values in a map:
{noformat}
{
    "name" : "nullableRecordInMap",
    "namespace" : "org.apache.pig.test.builtin",
    "type" : "record",
    "fields" : [
        {"name" : "key", "type" : "string"},
        {"name" : "value", "type" : "int"},
        {
            "name" : "parameters",
            "type": [
                "null",
                {
                    "type": "map",
                    "values": [
                        "null",
                        {
                            "type": "record",
                            "name": "nullable_record",
                            "fields": [
                                {
                                    "name": "id",
                                    "type": [
                                        "null",
                                        "string"
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
{noformat}

Here's the corresponding Pig resource schema produced by running it through org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities:
{noformat}
key:chararray,value:int,parameters:[nullable_record:(union:(id:chararray))]
{noformat}

Note the spurious "union" wrapper around the record. Pig should unpack the underlying schema from the nullable union, so the Pig schema should be:
{noformat}
key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
{noformat}
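
As a rough illustration of the unwrapping described above, here is a minimal Python sketch. It is not the Java code in AvroStorageSchemaConversionUtilities and not the attached patch; the names unwrap_nullable, to_pig and pig_schema are hypothetical, and only the types used in this example (primitives, records, maps) are handled. It treats a two-branch union containing "null" as its non-null branch before converting.

```python
import json

# Avro primitive -> Pig type name (small subset, enough for this example).
PRIMITIVES = {"string": "chararray", "int": "int", "long": "long",
              "float": "float", "double": "double", "boolean": "boolean"}

SCHEMA = json.loads("""
{"name": "nullableRecordInMap", "namespace": "org.apache.pig.test.builtin",
 "type": "record",
 "fields": [
   {"name": "key", "type": "string"},
   {"name": "value", "type": "int"},
   {"name": "parameters",
    "type": ["null", {"type": "map", "values": ["null",
      {"type": "record", "name": "nullable_record",
       "fields": [{"name": "id", "type": ["null", "string"]}]}]}]}
 ]}
""")


def unwrap_nullable(avro_type):
    """Return T for a ["null", T] (or [T, "null"]) union; otherwise return the input."""
    if isinstance(avro_type, list) and len(avro_type) == 2:
        non_null = [t for t in avro_type if t != "null"]
        if len(non_null) == 1:
            return non_null[0]
    return avro_type


def to_pig(avro_type, name=None):
    """Render one Avro type as a Pig-style schema string (hypothetical helper)."""
    t = unwrap_nullable(avro_type)
    prefix = f"{name}:" if name else ""
    if isinstance(t, str):
        return prefix + PRIMITIVES[t]
    if isinstance(t, list):
        raise NotImplementedError("general unions are out of scope for this sketch")
    kind = t["type"]
    if kind in PRIMITIVES:
        return prefix + PRIMITIVES[kind]
    if kind == "record":
        fields = ",".join(to_pig(f["type"], f["name"]) for f in t["fields"])
        return f"{prefix}({fields})"
    if kind == "map":
        # Unwrap the nullable map value first, so the record is not
        # wrapped in an extra "union" tuple.
        value = unwrap_nullable(t["values"])
        value_name = value.get("name") if isinstance(value, dict) else None
        return f"{prefix}[{to_pig(value, value_name)}]"
    raise NotImplementedError(f"type {kind!r} is out of scope for this sketch")


def pig_schema(record):
    """Render the top-level record's fields as a comma-separated Pig schema."""
    return ",".join(to_pig(f["type"], f["name"]) for f in record["fields"])


print(pig_schema(SCHEMA))
# prints: key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
```

The key step is calling unwrap_nullable on the map's "values" type before converting it; skipping that step is what would leave the union visible in the resulting schema.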

There's similar behavior if the nullable map value is an Avro array.

I've created a patch that includes a few test cases.


