Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/09/18 13:06:26 UTC

[GitHub] [incubator-hudi] HariprasadAllaka1612 opened a new issue #905: S3 folder paths messed up when running from Windows

URL: https://github.com/apache/incubator-hudi/issues/905
 
 
   I am running my code from a Windows machine to push data to S3. When I try to write the data, I get an error because the write stats cannot be found: `null` is passed for the stats in
   
   public SizeAwareFSDataOutputStream(FSDataOutputStream out, Runnable closeCallback)
       throws IOException {
     super(out, null);  // stats passed as null here
     this.closeCallback = closeCallback;
   }
   
   I worked around this by commenting out the metrics-collection part of HoodieWriteClient.finalizemetrics.
   
   The remaining problem is that, after this failure, the cleanFailedWrites method also fails: it expects the marker path to use Linux-style (forward-slash) separators, but on Windows the path is built with backslashes, so the listing under the expected S3 prefix finds nothing.
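   To illustrate the suspected cause (this is a hypothetical sketch, not Hudi code — the class and method names below are made up): if a path like `.hoodie/.temp/<commit>` is assembled with the platform separator (`\` on a Windows JVM), the resulting string is not a valid S3 key prefix, so a later `listFiles` against the forward-slash form reports "No such file or directory".

   ```java
   public class SeparatorDemo {
     // Naive join using the platform separator — on Windows this yields
     // backslashes, which S3 treats as literal characters in the object key.
     static String platformJoin(String sep, String... parts) {
       return String.join(sep, parts);
     }

     // Portable alternative: always use "/" for S3/HDFS-style keys,
     // regardless of the operating system the driver runs on.
     static String s3Key(String... parts) {
       return String.join("/", parts);
     }

     public static void main(String[] args) {
       // What a Windows JVM would produce with File.separator ("\\"):
       System.out.println(platformJoin("\\", ".hoodie", ".temp", "20190918145332"));
       // What the S3 listing in cleanFailedWrites actually expects:
       System.out.println(s3Key(".hoodie", ".temp", "20190918145332"));
     }
   }
   ```

   If this is indeed the cause, constructing paths with `org.apache.hadoop.fs.Path` (which always uses `/` as its separator) rather than `java.io.File` / `File.separator` should make the marker paths OS-independent.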
   
   SLF4J: Class path contains multiple SLF4J bindings.
   SLF4J: Found binding in [jar:file:/C:/Users/HariprasadAllaka/.m2/repository/org/slf4j/slf4j-log4j12/1.7.16/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
   SLF4J: Found binding in [jar:file:/C:/Users/HariprasadAllaka/.m2/repository/com/github/HariprasadAllaka1612/incubator-hudi/hudi-timeline-server-bundle/playngoplatform-hoodie-0.4.7-gcde16ad-114/hudi-timeline-server-bundle-playngoplatform-hoodie-0.4.7-gcde16ad-114.jar!/org/slf4j/impl/StaticLoggerBinder.class]
   SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
   SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
   Exception in thread "main" java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at com.intellij.rt.execution.CommandLineWrapper.main(CommandLineWrapper.java:66)
   Caused by: org.apache.hudi.exception.HoodieCommitException: Failed to complete commit 20190918145332 due to finalize errors.
   	at org.apache.hudi.HoodieWriteClient.finalizeWrite(HoodieWriteClient.java:1312)
   	at org.apache.hudi.HoodieWriteClient.commit(HoodieWriteClient.java:529)
   	at org.apache.hudi.HoodieWriteClient.commit(HoodieWriteClient.java:510)
   	at org.apache.hudi.HoodieWriteClient.commit(HoodieWriteClient.java:501)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:152)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:91)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
   	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
   	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
   	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
   	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
   	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
   	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
   	at com.playngodataengg.scala.dao.DataAccessS3.writeDataToRefinedHudiS3(DataAccessS3.scala:38)
   	at com.playngodataengg.scala.controller.GameAndProviderDataTransform.processData(GameAndProviderDataTransform.scala:48)
   	at com.playngodataengg.scala.action.GameAndProviderData$.main(GameAndProviderData.scala:10)
   	at com.playngodataengg.scala.action.GameAndProviderData.main(GameAndProviderData.scala)
   	... 5 more
   Caused by: org.apache.hudi.exception.HoodieIOException: No such file or directory: s3a://gat-datalake-raw-dev/Games2/.hoodie/.temp/20190918145332/asp
   	at org.apache.hudi.table.HoodieTable.cleanFailedWrites(HoodieTable.java:391)
   	at org.apache.hudi.table.HoodieTable.finalizeWrite(HoodieTable.java:295)
   	at org.apache.hudi.table.HoodieMergeOnReadTable.finalizeWrite(HoodieMergeOnReadTable.java:331)
   	at org.apache.hudi.HoodieWriteClient.finalizeWrite(HoodieWriteClient.java:1303)
   	... 35 more
   Caused by: java.io.FileNotFoundException: No such file or directory: s3a://gat-datalake-raw-dev/Games2/.hoodie/.temp/20190918145332/asp
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2269)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2163)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2102)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListFiles(S3AFileSystem.java:3101)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.listFiles(S3AFileSystem.java:3082)
   	at org.apache.hudi.common.io.storage.HoodieWrapperFileSystem.listFiles(HoodieWrapperFileSystem.java:531)
   	at org.apache.hudi.common.util.FSUtils.processFiles(FSUtils.java:245)
   	at org.apache.hudi.common.util.FSUtils.getAllDataFilesForMarkers(FSUtils.java:213)
   	at org.apache.hudi.table.HoodieTable.cleanFailedWrites(HoodieTable.java:340)
   	... 38 more

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services