Posted to dev@parquet.apache.org by Tongjie Chen <to...@gmail.com> on 2014/11/22 21:22:03 UTC
ArrayIndexOutOfBoundsException occurs when reading parquet files
Hi,
Does the following error look familiar to anyone? It looks like a data
corruption issue, but the Parquet file was written without any errors.
We are using Parquet version 1.6.0rc3.
Thanks,
Tongjie
2014-11-22 18:55:28,970 WARN [main]
org.apache.hadoop.mapred.YarnChild: Exception running child :
java.io.IOException: java.io.IOException:
parquet.io.ParquetDecodingException: Can not read value at 511538 in
block 0 in file
s3n://..../dateint=20141122/hour=16/batchid=merged_20141122T171928_1/542f393b-57f8-441b-8591-2c0169f15d14_000072
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: parquet.io.ParquetDecodingException:
Can not read value at 511538 in block 0 in file
s3n://..../dateint=20141122/hour=16/batchid=merged_20141122T171928_1/542f393b-57f8-441b-8591-2c0169f15d14_000072
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
... 11 more
Caused by: parquet.io.ParquetDecodingException: Can not read value at
511538 in block 0 in file
s3n://..../dateint=20141122/hour=16/batchid=merged_20141122T171928_1/542f393b-57f8-441b-8591-2c0169f15d14_000072
at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:213)
at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:204)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next(ParquetRecordReaderWrapper.java:157)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next(ParquetRecordReaderWrapper.java:45)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
... 15 more
Caused by: parquet.io.ParquetDecodingException: Can't read value in
column [other_properties, map, value] BINARY at value 20433392 out of
27896945, 19072 out of 36318 in currentPage. repetition level: 1,
definition level: 3
at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:450)
at parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:352)
at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:402)
at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)
... 19 more
Caused by: parquet.io.ParquetDecodingException: could not read bytes
at offset 1599090621
at parquet.column.values.plain.BinaryPlainValuesReader.readBytes(BinaryPlainValuesReader.java:43)
at parquet.column.impl.ColumnReaderImpl$2$6.read(ColumnReaderImpl.java:295)
at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:446)
... 22 more
Caused by: *java.lang.ArrayIndexOutOfBoundsException*: 1599090621
at parquet.bytes.BytesUtils.readIntLittleEndian(BytesUtils.java:54)
at parquet.column.values.plain.BinaryPlainValuesReader.readBytes(BinaryPlainValuesReader.java:36)
... 24 more
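For context on the innermost frames: in Parquet's PLAIN encoding, each BINARY value is stored as a 4-byte little-endian length followed by that many bytes. The sketch below is a hypothetical illustration, not the parquet-mr source; the class name, the helper `nextValueOffset`, and the sample bytes are all assumptions. It shows one plausible way a single garbage length prefix can push the read offset far past the end of the page, so that the next length read fails with an ArrayIndexOutOfBoundsException.

```java
// Hypothetical sketch, not parquet-mr code. Assumes the PLAIN layout for
// BINARY values: [4-byte little-endian length][payload bytes], repeated.
final class PlainBinaryLayoutSketch {

    // Reads a 4-byte little-endian int starting at the given offset.
    static int readIntLittleEndian(byte[] in, int offset) {
        return (in[offset] & 0xff)
                | ((in[offset + 1] & 0xff) << 8)
                | ((in[offset + 2] & 0xff) << 16)
                | ((in[offset + 3] & 0xff) << 24);
    }

    // Returns the offset of the value that follows the one starting at
    // the given offset. It trusts the length prefix without any check.
    static int nextValueOffset(byte[] page, int offset) {
        int length = readIntLittleEndian(page, offset);
        return offset + 4 + length;
    }

    public static void main(String[] args) {
        // First value: length 2, payload "hi".
        // Second value: a corrupt length prefix (0x3fffffff), then 2 bytes.
        byte[] page = {
            2, 0, 0, 0, 'h', 'i',
            (byte) 0xff, (byte) 0xff, (byte) 0xff, 0x3f, 'x', 'y'
        };

        int second = nextValueOffset(page, 0);      // offset of second value
        int third = nextValueOffset(page, second);  // far beyond page.length

        try {
            readIntLittleEndian(page, third);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("could not read bytes at offset " + third);
        }
    }
}
```

If this is what happened here, the value that actually holds the bad length would be the one just before the value reported in the error, since the failure only surfaces on the following read.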
Re: ArrayIndexOutOfBoundsException occurs when reading parquet files
Posted by Tongjie Chen <to...@gmail.com>.
Is there a tool I can use to read a specific row group, column, or page?
Thanks,
Tongjie
Re: ArrayIndexOutOfBoundsException occurs when reading parquet files
Posted by Tongjie Chen <to...@gmail.com>.
Actually, the stack trace looks different.
In my case, there seems to be a bad entry in the Parquet file (although it
was written successfully): in some row group, in some page, the entry at
19072 out of 36318 in that currentPage cannot be read.
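One way to narrow down such an entry, sketched under strong assumptions: given the already decompressed values section of a single page of a PLAIN-encoded BINARY column (without the repetition and definition levels that precede it in a real page), walk the length prefixes and report the first value whose length cannot fit in the remaining bytes. The class and method names are hypothetical, and obtaining those page bytes from the file is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: scans length-prefixed values and reports the index
// of the first value whose prefix is missing or implausible.
final class PageScanSketch {

    // Returns the index of the first bad value, or -1 if all valueCount
    // values fit inside the page bytes.
    static int firstBadValue(byte[] page, int valueCount) {
        ByteBuffer buf = ByteBuffer.wrap(page).order(ByteOrder.LITTLE_ENDIAN);
        int offset = 0;
        for (int i = 0; i < valueCount; i++) {
            if (page.length - offset < 4) {
                return i; // no room left for a 4-byte length prefix
            }
            int length = buf.getInt(offset);
            int start = offset + 4;
            if (length < 0 || length > page.length - start) {
                return i; // length points outside the page
            }
            offset = start + length;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Value 0 is "hi"; value 1 has a corrupt length prefix.
        byte[] page = {
            2, 0, 0, 0, 'h', 'i',
            (byte) 0xff, (byte) 0xff, (byte) 0xff, 0x3f, 'x', 'y'
        };
        System.out.println("first bad value index: " + firstBadValue(page, 2));
    }
}
```

A scan like this only catches lengths that overrun the page. A corrupt length that still fits would shift every later value without being flagged, so a clean result does not prove the page is intact.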
Re: ArrayIndexOutOfBoundsException occurs when reading parquet files
Posted by Cheng Lian <li...@gmail.com>.
The problem mentioned in [this thread] [1] looks similar to yours.
[1]: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-exception-on-cached-parquet-table-tt18978.html#a19020