Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/04/01 15:29:00 UTC
[jira] [Closed] (HUDI-3763) Enabling point look ups fails to recognize InlineFileSystem
[ https://issues.apache.org/jira/browse/HUDI-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan closed HUDI-3763.
-------------------------------------
Resolution: Fixed
> Enabling point look ups fails to recognize InlineFileSystem
> -----------------------------------------------------------
>
> Key: HUDI-3763
> URL: https://issues.apache.org/jira/browse/HUDI-3763
> Project: Apache Hudi
> Issue Type: Bug
> Components: metadata, writer-core
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.11.0
>
>
> Tried the quick start guide after enabling `hoodie.enable.full.scan.log.files`. The query succeeds, but the writer path fails.
>
> {code:java}
>
> scala> val tripsSnapshotDF = spark.
> | read.
> | format("hudi").
> | load(basePath)
> tripsSnapshotDF: org.apache.spark.sql.DataFrame = [_hoodie_commit_time: string, _hoodie_commit_seqno: string ... 13 more fields]
> scala> //load(basePath) use "/partitionKey=partitionValue" folder structure for Spark auto partition discovery
> scala> tripsSnapshotDF.createOrReplaceTempView("hudi_trips_snapshot")
> scala>
> scala> spark.sql("select fare, begin_lon, begin_lat, ts from hudi_trips_snapshot where fare > 20.0").show()
> 22/03/31 13:08:54 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
> +------------------+-------------------+-------------------+-------------+
> |              fare|          begin_lon|          begin_lat|           ts|
> +------------------+-------------------+-------------------+-------------+
> | 41.06290929046368| 0.8192868687714224|  0.651058505660742|1648539162291|
> | 66.62084366450246|0.03844104444445928| 0.0750588760043035|1648637987188|
> |34.158284716382845|0.46157858450465483| 0.4726905879569653|1648386611902|
> |  43.4923811219014| 0.8779402295427752| 0.6100070562136587|1648376814117|
> | 93.56018115236618|0.14285051259466197|0.21624150367601136|1648563178430|
> | 33.92216483948643| 0.9694586417848392| 0.1856488085068272|1648311253513|
> | 64.27696295884016| 0.4923479652912024| 0.5731835407930634|1648508390250|
> | 27.79478688582596| 0.6273212202489661|0.11488393157088261|1648542869649|
> +------------------+-------------------+-------------------+-------------+
> {code}
>
> Upsert fails:
>
> {code:java}
> scala> df.write.format("hudi").
> | options(getQuickstartWriteConfigs).
> | option(PRECOMBINE_FIELD_OPT_KEY, "ts").
> | option(RECORDKEY_FIELD_OPT_KEY, "uuid").
> | option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
> | option(TABLE_NAME, tableName).
> | mode(Append).
> | save(basePath)
> warning: there was one deprecation warning; re-run with -deprecation for details
> 22/03/31 13:08:59 ERROR AbstractHoodieLogRecordReader: Got exception when reading log file
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hudi.common.fs.inline.InLineFileSystem not found
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
>     at org.apache.hudi.common.table.log.block.HoodieHFileDataBlock.lookupRecords(HoodieHFileDataBlock.java:210)
>     at org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecordItr(HoodieDataBlock.java:146)
>     at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processDataBlock(AbstractHoodieLogRecordReader.java:363)
>     at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:430)
>     at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:242)
>     at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.getRecordsByKeys(HoodieMetadataMergedLogRecordReader.java:128)
>     at org.apache.hudi.metadata.HoodieBackedTableMetadata.readLogRecords(HoodieBackedTableMetadata.java:185)
>     at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$0(HoodieBackedTableMetadata.java:151)
>     at java.util.HashMap.forEach(HashMap.java:1289)
>     at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:138)
>     at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:128)
>     at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:312)
>     at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:135)
>     at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:65)
>     at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$9(AbstractTableFileSystemView.java:304)
>     at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
>     at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:295)
>     at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFilesBeforeOrOn(AbstractTableFileSystemView.java:502)
>     at org.apache.hudi.timeline.service.handlers.BaseFileHandler.getLatestDataFilesBeforeOrOn(BaseFileHandler.java:60)
>     at org.apache.hudi.timeline.service.RequestHandler.lambda$registerDataFilesAPI$6(RequestHandler.java:272)
>     at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:501)
>     at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
>     at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
>     at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
>     at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
>     at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
>     at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
>     at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
>     at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
>     at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>     at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
>     at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
>     at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>     at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
>     at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>     at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
>     at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
>     at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>     at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502)
>     at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370)
>     at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
>     at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>     at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103)
>     at org.apache.hudi.org.apache.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
>     at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
>     at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hudi.common.fs.inline.InLineFileSystem not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
>     ... 58 more
> 22/03/31 13:08:59 ERROR RequestHandler: Got runtime exception servicing request partition=asia%2Findia%2Fchennai&maxinstant=20220331130836362&basepath=file%3A%2Ftmp%2Fhudi_trips_cow&lastinstantts=20220331130836362&timelinehash=6037efcf3987981a82bf3e803fe57df94206a98fdaf87d5975a15af131467fce
> org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve files in partition file:/tmp/hudi_trips_cow/asia/india/chennai from metadata
>     at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:137)
>     at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:65)
>     at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$9(AbstractTableFileSystemView.java:304)
>     at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
>     at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:295)
>     at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFilesBeforeOrOn(AbstractTableFileSystemView.java:502)
>     at org.apache.hudi.timeline.service.handlers.BaseFileHandler.getLatestDataFilesBeforeOrOn(BaseFileHandler.java:60)
>     at org.apache.hudi.timeline.service.RequestHandler.lambda$registerDataFilesAPI$6(RequestHandler.java:272)
>     at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:501)
>     at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
>     at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
>     at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
>     at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
>     at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
>     at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
>     at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
>     at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
>     at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>     at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
>     at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
>     at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>     at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
>     at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>     at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
>     at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
>     at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>     at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502)
>     at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370)
>     at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
>     at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>     at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103)
>     at org.apache.hudi.org.apache.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>     at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
>     at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
>     at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hudi.exception.HoodieException: Exception when reading log file
>     at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:336)
>     at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.getRecordsByKeys(HoodieMetadataMergedLogRecordReader.java:128)
> {code}
>
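The trace in the description shows Hadoop's Configuration.getClass failing to load org.apache.hudi.common.fs.inline.InLineFileSystem by name, on a Jetty worker thread of the timeline service. One way to narrow down a failure like this is to check which class loaders can actually see the class. The sketch below is only a diagnostic aid and is not taken from the fix for this issue; the object name ClassVisibilityCheck and the helper isLoadable are hypothetical, and it relies only on standard JVM and Scala library calls.

```scala
import scala.util.Try

// Hypothetical diagnostic helper, not part of Hudi or Hadoop.
object ClassVisibilityCheck {

  // Returns true if `loader` can resolve `className`.
  // `initialize = false` avoids running the class's static initializers.
  def isLoadable(className: String, loader: ClassLoader): Boolean =
    Try(Class.forName(className, false, loader)).isSuccess

  def main(args: Array[String]): Unit = {
    // Class name copied from the ClassNotFoundException in the trace.
    val name = "org.apache.hudi.common.fs.inline.InLineFileSystem"

    // Compare the current thread's context loader with the system loader.
    val contextLoader = Thread.currentThread().getContextClassLoader
    val systemLoader  = ClassLoader.getSystemClassLoader

    println(s"context class loader sees $name: ${isLoadable(name, contextLoader)}")
    println(s"system class loader sees $name: ${isLoadable(name, systemLoader)}")
  }
}
```

If this is pasted into the same spark-shell session, a result that differs between the two loaders would suggest a class-loader visibility problem rather than a jar that is missing altogether. That reading is an assumption to verify against the actual deployment, not a conclusion drawn from this issue.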
--
This message was sent by Atlassian Jira
(v8.20.1#820001)