Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2021/11/03 02:50:00 UTC

[jira] [Updated] (HUDI-2655) Non partitioned dataset with metadata fails

     [ https://issues.apache.org/jira/browse/HUDI-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-2655:
---------------------------------
    Story Points: 10

> Non partitioned dataset with metadata fails
> -------------------------------------------
>
>                 Key: HUDI-2655
>                 URL: https://issues.apache.org/jira/browse/HUDI-2655
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Assignee: Vinoth Chandar
>            Priority: Blocker
>             Fix For: 0.10.0
>
>
> Likely cause: when compaction kicks in within the metadata table, the record key is empty. When I tried this with a deltastreamer job, I hit the exception below after 10+ commits on a non-partitioned data table.
>  
> {code:java}
> Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key  from old file file:/tmp/hudi-deltastreamer-op/impressions_cow/.hoodie/metadata/files/files-0000_0-782-654_20211029073337001.hfile to new file file:/tmp/hudi-deltastreamer-op/impressions_cow/.hoodie/metadata/files/files-0000_0-1086-913_20211029073358001.hfile with writerSchema {
>   "type" : "record",
>   "name" : "HoodieMetadataRecord",
>   "namespace" : "org.apache.hudi.avro.model",
>   "doc" : "A record saved within the Metadata Table",
>   "fields" : [ {
>     "name" : "_hoodie_commit_time",
>     "type" : [ "null", "string" ],
>     "doc" : "",
>     "default" : null
>   }, {
>     "name" : "_hoodie_commit_seqno",
>     "type" : [ "null", "string" ],
>     "doc" : "",
>     "default" : null
>   }, {
>     "name" : "_hoodie_record_key",
>     "type" : [ "null", "string" ],
>     "doc" : "",
>     "default" : null
>   }, {
>     "name" : "_hoodie_partition_path",
>     "type" : [ "null", "string" ],
>     "doc" : "",
>     "default" : null
>   }, {
>     "name" : "_hoodie_file_name",
>     "type" : [ "null", "string" ],
>     "doc" : "",
>     "default" : null
>   }, {
>     "name" : "key",
>     "type" : {
>       "type" : "string",
>       "avro.java.string" : "String"
>     }
>   }, {
>     "name" : "type",
>     "type" : "int",
>     "doc" : "Type of the metadata record"
>   }, {
>     "name" : "filesystemMetadata",
>     "type" : [ "null", {
>       "type" : "map",
>       "values" : {
>         "type" : "record",
>         "name" : "HoodieMetadataFileInfo",
>         "fields" : [ {
>           "name" : "size",
>           "type" : "long",
>           "doc" : "Size of the file"
>         }, {
>           "name" : "isDeleted",
>           "type" : "boolean",
>           "doc" : "True if this file has been deleted"
>         } ]
>       },
>       "avro.java.string" : "String"
>     } ],
>     "doc" : "Contains information about partitions and files within the dataset"
>   } ]
> }
> 	at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:349)
> 	at org.apache.hudi.io.HoodieSortedMergeHandle.write(HoodieSortedMergeHandle.java:104)
> 	at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:122)
> 	at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:112)
> 	at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
> 	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:121)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	... 3 more
> Caused by: java.lang.IllegalArgumentException: key length must be > 0
> 	at org.apache.hadoop.util.bloom.HashFunction.hash(HashFunction.java:114)
> 	at org.apache.hadoop.util.bloom.BloomFilter.add(BloomFilter.java:122)
> 	at org.apache.hudi.common.bloom.InternalDynamicBloomFilter.add(InternalDynamicBloomFilter.java:94)
> 	at org.apache.hudi.common.bloom.HoodieDynamicBoundedBloomFilter.add(HoodieDynamicBoundedBloomFilter.java:81)
> 	at org.apache.hudi.io.storage.HoodieHFileWriter.writeAvro(HoodieHFileWriter.java:119)
> 	at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:344)
> 	... 9 more
> {code}
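
The innermost cause in the trace above is a bloom filter rejecting a zero-length key. A minimal sketch of that failure mode follows, assuming the metadata record key for a non-partitioned table is the empty partition path (the report says only that the key is "likely" empty). `TinyBloomFilter`, `toRecordKey` and `NON_PARTITIONED_PLACEHOLDER` are hypothetical names made up for illustration. They are not Hudi or Hadoop APIs, and the placeholder mapping is one possible mitigation, not necessarily the fix applied for this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class EmptyKeyBloomSketch {

    // Hypothetical placeholder standing in for the empty partition path
    // of a non-partitioned table (an assumption, not a Hudi constant).
    static final String NON_PARTITIONED_PLACEHOLDER = ".";

    /** Toy bloom filter that mimics the empty-key check seen in the trace. */
    static final class TinyBloomFilter {
        private final BitSet bits;
        private final int size;

        TinyBloomFilter(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        void add(String key) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            if (bytes.length == 0) {
                // Same message as the innermost exception in the report.
                throw new IllegalArgumentException("key length must be > 0");
            }
            bits.set(index(bytes, 17));
            bits.set(index(bytes, 31));
        }

        boolean mightContain(String key) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            if (bytes.length == 0) {
                return false; // an empty key can never have been added
            }
            return bits.get(index(bytes, 17)) && bits.get(index(bytes, 31));
        }

        // Simple seeded polynomial hash reduced to a valid bit index.
        private int index(byte[] bytes, int seed) {
            int h = seed;
            for (byte b : bytes) {
                h = 31 * h + b;
            }
            return Math.floorMod(h, size);
        }
    }

    /** Maps an empty partition path to a non-empty key before it is written. */
    static String toRecordKey(String partitionPath) {
        return partitionPath.isEmpty() ? NON_PARTITIONED_PLACEHOLDER : partitionPath;
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(1024);
        try {
            filter.add(""); // what a non-partitioned table would pass through unguarded
        } catch (IllegalArgumentException e) {
            System.out.println("empty key rejected: " + e.getMessage());
        }
        filter.add(toRecordKey("")); // guarded: the key is now non-empty
        System.out.println("placeholder present: "
                + filter.mightContain(NON_PARTITIONED_PLACEHOLDER));
    }
}
```

The sketch normalizes the key at the point where the record key is derived, not inside the filter, so every downstream consumer sees the same non-empty key.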


