Posted to dev@parquet.apache.org by "Igor Bernstein (JIRA)" <ji...@apache.org> on 2016/03/05 07:08:40 UTC

[jira] [Created] (PARQUET-557) Enums are incorrectly handled by parquet-avro when using GenericRecords

Igor Bernstein created PARQUET-557:
--------------------------------------

             Summary: Enums are incorrectly handled by parquet-avro when using GenericRecords
                 Key: PARQUET-557
                 URL: https://issues.apache.org/jira/browse/PARQUET-557
             Project: Parquet
          Issue Type: Bug
            Reporter: Igor Bernstein
            Priority: Minor


It appears that enums are handled incorrectly when reading parquet as generic records.

Looking at the code:
https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroIndexedRecordConverter.java#L236-L238

FieldEnumConverter falls back to a string representation when it can't find the corresponding enum class. This is a problem when reading parquet files generically, without the specific record classes on the classpath: enum fields come back as plain strings, so the resulting records no longer match their schema. I believe a more correct approach would be to wrap the enum values in GenericData.EnumSymbol:
https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java#L397
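
To make the difference concrete, here is a minimal, self-contained sketch of the two fallback strategies. It deliberately does not use the real Avro or parquet-avro classes: EnumSchema, EnumSymbol, matchesSchema and both convert methods are hypothetical stand-ins, and the rule that an enum field only matches its schema when it holds a schema-aware symbol is an assumption taken from the description above, not the actual validation logic of either library.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in types only: nothing here is the real Avro or parquet-avro API.
final class EnumFallbackSketch {

    /** Hypothetical stand-in for an Avro enum schema: a name plus its allowed symbols. */
    static final class EnumSchema {
        final String name;
        final List<String> symbols;

        EnumSchema(String name, List<String> symbols) {
            this.name = name;
            this.symbols = symbols;
        }
    }

    /** Hypothetical stand-in for GenericData.EnumSymbol: a symbol that remembers its schema. */
    static final class EnumSymbol {
        final EnumSchema schema;
        final String symbol;

        EnumSymbol(EnumSchema schema, String symbol) {
            this.schema = schema;
            this.symbol = symbol;
        }

        @Override
        public String toString() {
            return symbol;
        }
    }

    /** Assumed strict check: an enum field must hold a symbol tied to the same schema. */
    static boolean matchesSchema(EnumSchema schema, Object datum) {
        if (!(datum instanceof EnumSymbol)) {
            return false;
        }
        EnumSymbol s = (EnumSymbol) datum;
        return s.schema == schema && schema.symbols.contains(s.symbol);
    }

    /** Behavior described in the report: without an enum class, return a plain String. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Object convertWithStringFallback(Class<?> enumClass, String value) {
        if (enumClass != null) {
            return Enum.valueOf((Class) enumClass, value);
        }
        return value;
    }

    /** Proposed behavior: without an enum class, wrap the value in a schema-aware symbol. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Object convertWithSymbolFallback(EnumSchema schema, Class<?> enumClass, String value) {
        if (enumClass != null) {
            return Enum.valueOf((Class) enumClass, value);
        }
        return new EnumSymbol(schema, value);
    }

    public static void main(String[] args) {
        EnumSchema suit = new EnumSchema("Suit", Arrays.asList("SPADES", "HEARTS"));
        // null models "the generated enum class is not on the classpath".
        Object asString = convertWithStringFallback(null, "SPADES");
        Object asSymbol = convertWithSymbolFallback(suit, null, "SPADES");
        System.out.println(matchesSchema(suit, asString)); // false
        System.out.println(matchesSchema(suit, asSymbol)); // true
    }
}
```

In the real converter the wrapping would presumably be done with GenericData.EnumSymbol and the field's Avro schema; the sketch only illustrates why a bare string and a schema-aware symbol behave differently for a generic reader.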


