Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2022/04/23 07:32:00 UTC

[jira] [Created] (HUDI-3952) Presto query throws UnsupportedOperationException when reading metadata table by disabling full scan

Ethan Guo created HUDI-3952:
-------------------------------

             Summary: Presto query throws UnsupportedOperationException when reading metadata table by disabling full scan
                 Key: HUDI-3952
                 URL: https://issues.apache.org/jira/browse/HUDI-3952
             Project: Apache Hudi
          Issue Type: Task
            Reporter: Ethan Guo
             Fix For: 0.12.0


With DEFAULT_METADATA_ENABLE_FOR_READERS switched to true and "hoodie.metadata.enable.full.scan.log.files" set to false, Presto queries using hudi-presto-bundle against HDFS in the Docker demo throw UnsupportedOperationException while merging HFile log blocks of the metadata table, because Presto's HadoopExtendedFileSystem does not implement FileSystem.getScheme().
{code:java}
2022-04-23T07:26:13.085Z INFO hive-hive-0 org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader Reading a data block from file hdfs://namenode:8020/user/hive/warehouse/stock_ticks_cow/.hoodie/metadata/files/.files-0000_00000000000000.log.1_0-10-10 at instant 20220423072304019
2022-04-23T07:26:13.086Z INFO hive-hive-0 org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader Merging the final data blocks
2022-04-23T07:26:13.086Z INFO hive-hive-0 org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader Number of remaining logblocks to merge 3
2022-04-23T07:26:13.185Z INFO hive-hive-0 org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader Number of remaining logblocks to merge 2
2022-04-23T07:26:13.190Z ERROR hive-hive-0 org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader Got exception when reading log file
java.lang.UnsupportedOperationException: Not implemented by the HadoopExtendedFileSystem FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:219)
    at org.apache.hudi.common.table.log.block.HoodieHFileDataBlock.lookupRecords(HoodieHFileDataBlock.java:205)
    at org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecordIterator(HoodieDataBlock.java:168)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.getRecordsIterator(AbstractHoodieLogRecordReader.java:488)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:378)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:466)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:342)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:195)
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.getRecordsByKeys(HoodieMetadataMergedLogRecordReader.java:124)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.readLogRecords(HoodieBackedTableMetadata.java:257)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$0(HoodieBackedTableMetadata.java:213)
    at java.util.HashMap.forEach(HashMap.java:1289)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:200)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:140)
    at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:312)
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:135)
    at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:65)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$9(AbstractTableFileSystemView.java:304)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:295)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFiles(AbstractTableFileSystemView.java:478)
    at org.apache.hudi.hadoop.HoodieROTablePathFilter.accept(HoodieROTablePathFilter.java:189)
    at com.facebook.presto.hive.util.HiveFileIterator.lambda$getLocatedFileStatusRemoteIterator$0(HiveFileIterator.java:103)
    at com.google.common.collect.Iterators$5.computeNext(Iterators.java:639)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:69)
    at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:40)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
    at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:195)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:40)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:121)
    at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
    at com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
    at com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
    at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2022-04-23T07:26:13.215Z ERROR hive-hive-0 org.apache.hudi.hadoop.HoodieROTablePathFilter Error checking path :hdfs://namenode:8020/user/hive/warehouse/stock_ticks_cow/2018/08/31/.hoodie_partition_metadata, under folder: hdfs://namenode:8020/user/hive/warehouse/stock_ticks_cow/2018/08/31
org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve files in partition hdfs://namenode:8020/user/hive/warehouse/stock_ticks_cow/2018/08/31 from metadata
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:137)
    at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:65)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$9(AbstractTableFileSystemView.java:304)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:295)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFiles(AbstractTableFileSystemView.java:478)
    at org.apache.hudi.hadoop.HoodieROTablePathFilter.accept(HoodieROTablePathFilter.java:189)
    at com.facebook.presto.hive.util.HiveFileIterator.lambda$getLocatedFileStatusRemoteIterator$0(HiveFileIterator.java:103)
    at com.google.common.collect.Iterators$5.computeNext(Iterators.java:639)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:69)
    at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:40)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
    at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:195)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:40)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:121)
    at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
    at com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
    at com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
    at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hudi.exception.HoodieException: Exception when reading log file
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:351)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:195)
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.getRecordsByKeys(HoodieMetadataMergedLogRecordReader.java:124)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.readLogRecords(HoodieBackedTableMetadata.java:257)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$0(HoodieBackedTableMetadata.java:213)
    at java.util.HashMap.forEach(HashMap.java:1289)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:200)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:140)
    at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:312)
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:135)
    ... 30 more
Caused by: java.lang.UnsupportedOperationException: Not implemented by the HadoopExtendedFileSystem FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:219)
    at org.apache.hudi.common.table.log.block.HoodieHFileDataBlock.lookupRecords(HoodieHFileDataBlock.java:205)
    at org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecordIterator(HoodieDataBlock.java:168)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.getRecordsIterator(AbstractHoodieLogRecordReader.java:488)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:378)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:466)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:342)
    ... 39 more
{code}
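For context on the failure mode: the base Hadoop FileSystem.getScheme() throws UnsupportedOperationException unless a subclass overrides it, which Presto's HadoopExtendedFileSystem wrapper does not. The sketch below is a hypothetical illustration (the Fs class and schemeOf helper are inventions for this example, not Hudi or Presto code) of how a caller could fall back to the filesystem URI, which every FileSystem must expose, instead of relying on getScheme():

{code:java}
import java.net.URI;

public class SchemeFallback {

    // Stand-in for a wrapper that, like HadoopExtendedFileSystem, inherits
    // the default getScheme(), which throws UnsupportedOperationException.
    abstract static class Fs {
        public String getScheme() {
            throw new UnsupportedOperationException(
                "Not implemented by the FileSystem implementation");
        }
        public abstract URI getUri();
    }

    // Resolve the scheme defensively: try getScheme(), and on
    // UnsupportedOperationException derive it from the filesystem URI.
    static String schemeOf(Fs fs) {
        try {
            return fs.getScheme();
        } catch (UnsupportedOperationException e) {
            return fs.getUri().getScheme();
        }
    }

    public static void main(String[] args) {
        Fs hdfsLike = new Fs() {
            @Override public URI getUri() {
                return URI.create("hdfs://namenode:8020");
            }
        };
        // Falls back to the URI scheme and prints "hdfs"
        System.out.println(schemeOf(hdfsLike));
    }
}
{code}

Whether the eventual fix lands in HoodieHFileDataBlock.lookupRecords or in the bundle's filesystem wrapper is an open question; the point is only that the scheme is recoverable without getScheme().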



--
This message was sent by Atlassian Jira
(v8.20.7#820007)