Posted to dev@avro.apache.org by "Mateusz Kobos (JIRA)" <ji...@apache.org> on 2013/11/10 23:36:17 UTC

[jira] [Created] (AVRO-1395) Order of fields returned by DataFileReader should match the order defined in Avro schema

Mateusz Kobos created AVRO-1395:
-----------------------------------

             Summary: Order of fields returned by DataFileReader should match the order defined in Avro schema
                 Key: AVRO-1395
                 URL: https://issues.apache.org/jira/browse/AVRO-1395
             Project: Avro
          Issue Type: Improvement
          Components: python
    Affects Versions: 1.7.5
            Reporter: Mateusz Kobos
            Priority: Minor


The Python `DataFileReader` class lets you iterate over the records of an Avro file. Each record is represented by a standard Python dictionary in which each element corresponds to a single field of the record. Note that this dictionary does not define any particular order of its elements. I propose that the order of elements follow the order of fields as defined in the file's Avro schema. The record would then be more human-friendly to read, which matters, for example, to a user who wants to view the record as a JSON string. Consider that the order of fields in an Avro schema usually has some significance: the most important fields (such as an ID) tend to come first.

Making the record follow the order of fields defined in the Avro schema seems to be fairly easy. As of Avro 1.7.5, in the `read_record` method of the `io.DatumReader` class, you would only have to change the second line of code from `read_record = {}` to `read_record = OrderedDict()` (with `OrderedDict` imported from the standard `collections` module).
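
To illustrate the effect of the proposed change, here is a minimal, self-contained sketch. It does not use the Avro library itself: `SCHEMA_FIELD_NAMES`, `build_record` and the sample values are hypothetical stand-ins for the schema's field list and for the per-field decoding that `read_record` performs. It only shows that filling an `OrderedDict` in schema order yields a record, and a JSON string, with the fields in that same order.

```python
import json
from collections import OrderedDict

# Hypothetical stand-in for the field names of an Avro record schema,
# listed in the order in which the schema declares them.
SCHEMA_FIELD_NAMES = ["id", "name", "email"]


def build_record(field_names, decoded_values):
    """Assemble a record whose keys follow the given field order.

    This mimics the proposed change: the record is an OrderedDict that is
    filled field by field, so insertion order equals schema order.
    """
    record = OrderedDict()
    for name in field_names:
        record[name] = decoded_values[name]
    return record


# Sample decoded values, deliberately written in a different order
# than the (hypothetical) schema declares them.
decoded = {"email": "ann@example.org", "id": 1, "name": "Ann"}

record = build_record(SCHEMA_FIELD_NAMES, decoded)

# json.dumps serializes keys in the mapping's iteration order,
# so the JSON text follows the schema order as well.
as_json = json.dumps(record)
print(as_json)
```

On Python 3.7 and later a plain `dict` also preserves insertion order, so the difference between `{}` and `OrderedDict()` matters mainly on older interpreters such as those in use when this issue was filed.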



--
This message was sent by Atlassian JIRA
(v6.1#6144)