Posted to user@pig.apache.org by John Smith <le...@gmail.com> on 2016/01/07 12:37:11 UTC

AvroStorage.java

Hi,

Can someone shed some light on the AvroStorage class? I'm passing a JSON Avro
schema as a parameter to AvroStorage, and the schema contains an additional
attribute, *"extraAttribute"*. The problem is that the Avro file produced by Pig
doesn't contain that attribute in its Avro schema.

When I pass the same schema to the Avro API's Schema.parse()
<https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/Schema.Parser.html>,
the resulting Avro file does contain extraAttribute in its Avro schema.


e.g.

STORE A INTO 'testOutput'
USING org.apache.pig.piggybank.storage.avro.AvroStorage(
'schema',
' {"type":"record","name":"X",
"fields":[
{"name":"b1"},
{"name":"b2"},
{"name":"b3", "extraAttribute" : "value"}
]}');
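
For reference, the Avro-side check mentioned above can be sketched roughly as follows. This is only a minimal sketch, assuming Avro 1.7.x on the classpath: the class name ExtraAttributeCheck and the helper readFieldProp are made up for illustration, and the "string" field types are an assumption, since the schema in this message omits them and Avro requires a type on every field.

```java
import org.apache.avro.Schema;

public class ExtraAttributeCheck {
    // Same record as in the STORE statement, with assumed "string" types
    // added so that the schema is parseable by Avro.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"X\",\"fields\":["
        + "{\"name\":\"b1\",\"type\":\"string\"},"
        + "{\"name\":\"b2\",\"type\":\"string\"},"
        + "{\"name\":\"b3\",\"type\":\"string\",\"extraAttribute\":\"value\"}]}";

    // Hypothetical helper: parses the schema and returns the custom string
    // property on the named field, or null if the field or property is absent.
    static String readFieldProp(String schemaJson, String fieldName, String propName) {
        Schema schema = new Schema.Parser().parse(schemaJson);
        Schema.Field field = schema.getField(fieldName);
        return (field == null) ? null : field.getProp(propName);
    }

    public static void main(String[] args) {
        // Prints whatever Avro kept for "extraAttribute" on field b3.
        System.out.println(readFieldProp(SCHEMA_JSON, "b3", "extraAttribute"));
    }
}
```

Running the same lookup against the schema read back from the file written by Pig would show whether the attribute survives the STORE.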


Where in AvroStorage.java is extraAttribute removed? I can't
see any code that strips additional attributes. outputAvroSchema is built with
Schema.parse().

AvroStorage.java

    @SuppressWarnings("rawtypes")
    @Override
    public OutputFormat getOutputFormat() throws IOException {
        AvroStorageLog.funcCall("getOutputFormat");

        Properties property = getUDFProperties();
        String allSchemaStr = property.getProperty(AVRO_OUTPUT_SCHEMA_PROPERTY);
        Map<String, String> map = (allSchemaStr != null) ? parseSchemaMap(allSchemaStr) : null;

        String key = getSchemaKey();
        Schema schema = (map == null || !map.containsKey(key))
                ? outputAvroSchema
                : Schema.parse(map.get(key));

        if (schema == null)
            throw new IOException("Output schema is null!");
        AvroStorageLog.details("Output schema=" + schema);

        return new PigAvroOutputFormat(schema);
    }



Any idea?


Thank you!