Posted to dev@parquet.apache.org by Alexander Filipchik <af...@gmail.com> on 2020/03/18 06:57:33 UTC

IndexOutOfBoundsException in MessageColumnIORecordConsumer.addBinary

Hi,

I've been chasing a weird bug in Apache Hudi where some writes fail with
java.lang.IndexOutOfBoundsException: Invalid array range: X to X inside
the MessageColumnIORecordConsumer.addBinary call. Specifically:

getColumnWriter().write(value, r[currentLevel],
    currentColumnIO.getDefinitionLevel());

fails because the size of r is equal to currentLevel. What could be
causing this? The call is reached via:

ParquetWriter.write(IndexedRecord)
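
For context, I believe the writer goes through the standard
AvroParquetWriter path; simplified, the setup looks roughly like this
(a minimal sketch, not our actual code; the output path is a placeholder):

  import org.apache.avro.Schema;
  import org.apache.avro.generic.IndexedRecord;
  import org.apache.hadoop.fs.Path;
  import org.apache.parquet.avro.AvroParquetWriter;
  import org.apache.parquet.hadoop.ParquetWriter;

  static void writeRecord(Schema schema, IndexedRecord record)
      throws java.io.IOException {
    ParquetWriter<IndexedRecord> writer = AvroParquetWriter
        .<IndexedRecord>builder(new Path("/tmp/out.parquet")) // placeholder path
        .withSchema(schema) // the ~2.5k column Avro schema
        .build();
    try {
      writer.write(record); // this is the call that eventually fails
    } finally {
      writer.close();
    }
  }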

Library version: 1.10.1

The Avro record is very complex (~2.5k columns, highly nested). But what is
surprising is that it fails to write a top-level field:

PrimitiveColumnIO _hoodie_commit_time r:0 d:1 [_hoodie_commit_time]

which is the first top-level field in the Avro record:

{"_hoodie_commit_time": "20200317215711", "_hoodie_commit_seqno":
"20200317215711_0_650",

Could it be that currentLevel is not being reset between writes? What else
could it be?
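
To make the hypothesis concrete: my current guess is that if an earlier
write throws mid-record (say, after startGroup() but before the matching
endGroup()), the MessageColumnIORecordConsumer state is never reset, so the
next write on the same writer indexes past the end of r. A minimal sketch
of the defensive pattern I am considering (newWriter() is a hypothetical
helper that rebuilds the ParquetWriter; this is a guess, not a confirmed
fix):

  try {
    writer.write(record);
  } catch (RuntimeException e) {
    // Assumption: the interrupted record may have left currentLevel
    // elevated inside MessageColumnIORecordConsumer, corrupting the
    // writer for all subsequent records.
    writer.close();       // discard the possibly-corrupted writer
    writer = newWriter(); // hypothetical helper: build a fresh writer
    throw e;              // or handle/retry the failed record
  }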

Thank you,
Alex