Posted to dev@avro.apache.org by "Ryan Skraba (Jira)" <ji...@apache.org> on 2020/11/03 16:13:00 UTC

[jira] [Resolved] (AVRO-2906) Replace recursive validation with traversal-based solution for Python avro

     [ https://issues.apache.org/jira/browse/AVRO-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Skraba resolved AVRO-2906.
-------------------------------
    Resolution: Fixed

> Replace recursive validation with traversal-based solution for Python avro
> --------------------------------------------------------------------------
>
>                 Key: AVRO-2906
>                 URL: https://issues.apache.org/jira/browse/AVRO-2906
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: python
>         Environment: [GitHub PR|https://github.com/apache/avro/pull/936]
>            Reporter: Cristopher Ewing
>            Priority: Major
>             Fix For: 1.11.0
>
>
> The existing validation scheme in the Python implementation of Avro is recursive.  This is problematic because Python handles deep recursion poorly (the interpreter enforces a recursion limit, and every call adds overhead), so validation becomes inefficient for deeply nested schemas.  Another issue with the current scheme is that error reporting for validation problems is generic. Unless a global variable in the code is changed to allow errors for sub-schemas to be reported directly, the only report one gets is an exception saying that the entire schema is invalid, which is not particularly useful when hunting bugs in serializing very large schemas.
> My proposal is to replace the existing validation approach with one that uses a breadth-first traversal of the schema.  This avoids the inefficiencies of recursion and, at the same time, allows errors to be reported for the exact spot in the overall schema where they occurred.
> My implementation, in [this PR on GitHub|https://github.com/apache/avro/pull/936], also moves validation from a mapping of type/logical_type to lambda functions into a validate method on each schema type, ensuring that each schema is responsible for validating itself.
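
The approach described above can be illustrated with a minimal sketch. This is not the code from the PR: the class names (IntSchema, StringSchema, ArraySchema, RecordSchema), the children() helper, the ValidationError exception, and the "$"-rooted path notation are all hypothetical and chosen only for illustration. Each schema class checks its own datum in a validate method, and a queue-driven breadth-first loop replaces recursion and tracks the path to each datum so that a failure can name the exact location.

```python
from collections import deque


class ValidationError(Exception):
    """Raised with the path to the first datum that fails validation."""


class IntSchema:
    def validate(self, datum):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(datum, int) and not isinstance(datum, bool)

    def children(self, datum):
        return []


class StringSchema:
    def validate(self, datum):
        return isinstance(datum, str)

    def children(self, datum):
        return []


class ArraySchema:
    def __init__(self, items):
        self.items = items  # schema shared by every element

    def validate(self, datum):
        # Only the container is checked here; elements are queued as children.
        return isinstance(datum, list)

    def children(self, datum):
        return [(f"[{i}]", self.items, item) for i, item in enumerate(datum)]


class RecordSchema:
    def __init__(self, name, fields):
        self.name = name
        self.fields = fields  # dict mapping field name -> schema

    def validate(self, datum):
        return isinstance(datum, dict) and set(datum) == set(self.fields)

    def children(self, datum):
        return [(f".{n}", s, datum[n]) for n, s in self.fields.items()]


def validate(schema, datum):
    """Breadth-first validation: no recursion, errors carry a path."""
    queue = deque([("$", schema, datum)])
    while queue:
        path, node, value = queue.popleft()
        if not node.validate(value):
            raise ValidationError(
                f"{path}: expected {type(node).__name__}, "
                f"got {type(value).__name__}"
            )
        for suffix, child, child_value in node.children(value):
            queue.append((path + suffix, child, child_value))
    return True
```

With a record schema that has a "scores" field holding an array of ints, a datum such as {"name": "a", "scores": [1, "x"]} fails with a message that begins with "$.scores[1]" rather than a generic report that the whole datum is invalid. Because the loop keeps its pending work in a deque instead of on the call stack, nesting deeper than the interpreter's recursion limit does not raise RecursionError.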



--
This message was sent by Atlassian Jira
(v8.3.4#803005)