Posted to dev@parquet.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/22 20:08:00 UTC

[jira] [Commented] (PARQUET-1778) Do Not Consider Class for Avro Generic Record Reader

    [ https://issues.apache.org/jira/browse/PARQUET-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163045#comment-17163045 ] 

ASF GitHub Bot commented on PARQUET-1778:
-----------------------------------------

Fokko merged pull request #751:
URL: https://github.com/apache/parquet-mr/pull/751


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Do Not Consider Class for Avro Generic Record Reader
> ----------------------------------------------------
>
>                 Key: PARQUET-1778
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1778
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>
>  
> {code:java|title=Example Code}
> final ParquetReader<GenericRecord> reader = AvroParquetReader.<GenericRecord>builder(path).build();
> final GenericRecord genericRecord = reader.read();
> {code}
> It fails with...
> {code:none}
> java.lang.NoSuchMethodException: io.github.belugabehr.app.Record.<init>()
> 	at java.lang.Class.getConstructor0(Class.java:3082) ~[na:1.8.0_232]
> 	at java.lang.Class.getDeclaredConstructor(Class.java:2178) ~[na:1.8.0_232]
> 	at org.apache.avro.specific.SpecificData$1.computeValue(SpecificData.java:63) ~[avro-1.9.1.jar:1.9.1]
> 	at org.apache.avro.specific.SpecificData$1.computeValue(SpecificData.java:58) ~[avro-1.9.1.jar:1.9.1]
> 	at java.lang.ClassValue.getFromHashMap(ClassValue.java:227) ~[na:1.8.0_232]
> 	at java.lang.ClassValue.getFromBackup(ClassValue.java:209) ~[na:1.8.0_232]
> 	at java.lang.ClassValue.get(ClassValue.java:115) ~[na:1.8.0_232]
> 	at org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:470) ~[avro-1.9.1.jar:1.9.1]
> 	at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:491) ~[avro-1.9.1.jar:1.9.1]
> 	at org.apache.parquet.avro.AvroRecordConverter.start(AvroRecordConverter.java:404) ~[parquet-avro-1.11.0.jar:1.11.0]
> 	at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:392) ~[parquet-column-1.11.0.jar:1.11.0]
> 	at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226) ~[parquet-hadoop-1.11.0.jar:1.11.0]
> 	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132) ~[parquet-hadoop-1.11.0.jar:1.11.0]
> 	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136) ~[parquet-hadoop-1.11.0.jar:1.11.0]
> {code}
> I was surprised, because this should simply load a {{GenericRecord}} view of the data. However, the {{namespace}} and {{name}} fields of my Avro Schema resolve to {{io.github.belugabehr.app.Record}}, which happens to be a real class on the class path. The reader therefore tries to call that class's no-argument constructor, which does not exist. Regardless, the {{GenericRecordReader}} should just ignore this Avro Schema namespace information.
> I am putting {{GenericRecords}} into the Parquet file, so I expect to get {{GenericRecords}} back out when I read it.
> If I hack the Schema and change the {{namespace}} or {{name}} field to something bogus, it works as I would expect: the read succeeds and returns a {{GenericRecord}}.
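
For readers who want to see why the name collision matters, here is a minimal sketch of the mechanism the stack trace suggests: a class is looked up by the schema's full name and, if one is found, its no-argument constructor is requested reflectively. It is a simplified model that uses only the JDK, not the actual Avro or Parquet code. The names SpecificLookupSketch, AppRecord, and instantiateByName are hypothetical, and the "generic-fallback" branch is an assumption drawn from the report that a bogus name reads fine.

```java
import java.lang.reflect.Constructor;

public class SpecificLookupSketch {

    // Hypothetical stand-in for a class that shares the schema's full name.
    // Like the class in the report, it has no no-argument constructor.
    static final class AppRecord {
        private final String id;

        AppRecord(String id) {
            this.id = id;
        }
    }

    // Simplified model of the lookup: resolve the class by name, then try to
    // instantiate it through its no-argument constructor.
    static String instantiateByName(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            Constructor<?> ctor = clazz.getDeclaredConstructor();
            ctor.newInstance();
            return "instantiated";
        } catch (ClassNotFoundException e) {
            // Assumption: with no matching class, a generic record is used instead.
            return "generic-fallback";
        } catch (NoSuchMethodException e) {
            // The class exists but lacks a no-argument constructor (the reported failure).
            return "NoSuchMethodException";
        } catch (ReflectiveOperationException e) {
            return "other-failure";
        }
    }

    public static void main(String[] args) {
        // A name that matches a real class without a no-argument constructor.
        System.out.println(instantiateByName(AppRecord.class.getName()));
        // A bogus name that matches nothing on the class path.
        System.out.println(instantiateByName("bogus.namespace.Bogus"));
    }
}
```

Running main prints NoSuchMethodException for the colliding name and generic-fallback for the bogus one, which mirrors the two behaviours described in the issue. Separately, and untested in this thread, parquet-avro's AvroParquetReader builder has a withDataModel method, and passing GenericData.get() to it may be a way to request the generic model explicitly on versions that predate this change.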



--
This message was sent by Atlassian Jira
(v8.3.4#803005)