Posted to dev@avro.apache.org by "Uri Laserson (JIRA)" <ji...@apache.org> on 2013/04/23 22:15:16 UTC

[jira] [Updated] (AVRO-1304) Python Avro match_schemas called redundantly

     [ https://issues.apache.org/jira/browse/AVRO-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uri Laserson updated AVRO-1304:
-------------------------------

    Status: Patch Available  (was: Open)

DatumReader is now immutable, which changes the interface slightly: DatumReader() can no longer be instantiated without arguments and must be given a schema.  (It doesn't make sense for it to have no schema.)  DataFileReader now instantiates a DatumReader internally when it should pick up the schema from the file, so it is not necessary to provide one.
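
To make the interface change concrete, here is a minimal sketch of the shape described above. It is not taken from the attached patch: the classes are hypothetical stand-ins for avro.io.DatumReader and avro.datafile.DataFileReader, the file_schema argument stands in for the schema parsed from a file header, and the toy match_schemas rule is an assumption for illustration only.

```python
# Hypothetical stand-ins; these are NOT the real avro.io / avro.datafile classes.


def match_schemas(writers_schema, readers_schema):
    # Toy rule (assumption): two schemas match when their "type" agrees.
    return writers_schema["type"] == readers_schema["type"]


class DatumReader:
    """Schemas are fixed at construction and checked exactly once."""

    def __init__(self, writers_schema, readers_schema=None):
        if writers_schema is None:
            raise ValueError("DatumReader must be given a schema")
        if readers_schema is None:
            readers_schema = writers_schema
        if not match_schemas(writers_schema, readers_schema):
            raise ValueError("writer's and reader's schemas do not match")
        self._writers_schema = writers_schema
        self._readers_schema = readers_schema

    # Read-only properties: assigning to them raises AttributeError.
    @property
    def writers_schema(self):
        return self._writers_schema

    @property
    def readers_schema(self):
        return self._readers_schema


class DataFileReader:
    """Builds its own DatumReader when the caller does not supply one."""

    def __init__(self, file_schema, datum_reader=None):
        # file_schema stands in for the schema read from the file header.
        if datum_reader is None:
            datum_reader = DatumReader(file_schema)
        self.datum_reader = datum_reader


# The caller no longer has to construct a DatumReader up front.
reader = DataFileReader({"type": "string"})
print(reader.datum_reader.readers_schema)
```

Because the schemas cannot change after construction in this sketch, the compatibility check in the constructor never has to be repeated on a read.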
                
> Python Avro match_schemas called redundantly
> --------------------------------------------
>
>                 Key: AVRO-1304
>                 URL: https://issues.apache.org/jira/browse/AVRO-1304
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.7.4
>            Reporter: Uri Laserson
>         Attachments: AVRO-1304.patch
>
>
> DatumReader.match_schemas(writers_schema, readers_schema) is called on every single read from the DatumReader.  However, for almost every read, the schemas used are the object members self.writers_schema and self.readers_schema.  In this case the schemas should be matched only once, and again only when the object members are modified.  In my profiling, the redundant calls take up 20% of parse time.
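
The redundancy described in the issue can be illustrated with a small, self-contained sketch. Nothing here is the real Avro code: PerReadReader and CachedMatchReader are hypothetical classes, match_schemas is a toy equality check, and the CALLS counter exists only to show how often the check runs. The second class caches the result and invalidates it when a schema member is reassigned, which is one possible reading of "only when the object members are modified".

```python
# Hypothetical illustration; not the real avro.io.DatumReader.

CALLS = {"match_schemas": 0}


def match_schemas(writers_schema, readers_schema):
    # Toy rule (assumption): schemas match when they are equal.
    CALLS["match_schemas"] += 1
    return writers_schema == readers_schema


class PerReadReader:
    """Checks schema compatibility on every read, as the issue describes."""

    def __init__(self, writers_schema, readers_schema):
        self.writers_schema = writers_schema
        self.readers_schema = readers_schema

    def read(self, datum):
        if not match_schemas(self.writers_schema, self.readers_schema):
            raise ValueError("schemas do not match")
        return datum


class CachedMatchReader:
    """Checks once, then again only after a schema member is modified."""

    def __init__(self, writers_schema, readers_schema):
        self._writers_schema = writers_schema
        self._readers_schema = readers_schema
        self._checked = False

    @property
    def readers_schema(self):
        return self._readers_schema

    @readers_schema.setter
    def readers_schema(self, schema):
        self._readers_schema = schema
        self._checked = False  # invalidate the cached result

    def read(self, datum):
        if not self._checked:
            if not match_schemas(self._writers_schema, self._readers_schema):
                raise ValueError("schemas do not match")
            self._checked = True
        return datum


def count_match_calls(reader, datums):
    """Return how many times match_schemas ran while reading datums."""
    before = CALLS["match_schemas"]
    for datum in datums:
        reader.read(datum)
    return CALLS["match_schemas"] - before


print(count_match_calls(PerReadReader("string", "string"), range(1000)))      # 1000
print(count_match_calls(CachedMatchReader("string", "string"), range(1000)))  # 1
```

An immutable reader removes the need for the invalidation step entirely, since the check can then run once at construction time.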

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira