Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/23 17:30:00 UTC

[GitHub] [iceberg] rajarshisarkar commented on issue #2991: No such file or directory when using multiple Spark executors + Iceberg in EMR

rajarshisarkar commented on issue #2991:
URL: https://github.com/apache/iceberg/issues/2991#issuecomment-926014364


   @alex-shchetkov @fcvr1010 @jackye1995 I was able to reproduce the issue. When `--conf spark.driver.extraJavaOptions=-Djava.io.tmpdir=/tmp/driver` is not passed, the container's `tmp` folder looks something like this: 
   
   ```
   /mnt1/yarn/usercache/hadoop/appcache/application_1632368478936_0025/container_1632368478936_0025_02_000021/tmp
   /mnt1/yarn/usercache/hadoop/appcache/application_1632368478936_0025/container_1632368478936_0025_02_000021/tmp/liblz4-java-4965470198489888535.so.lck
   /mnt1/yarn/usercache/hadoop/appcache/application_1632368478936_0025/container_1632368478936_0025_02_000021/tmp/liblz4-java-4965470198489888535.so
   ```
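   
   For context, the flag above sets the temp dir only for the driver JVM; the executor-side analogue uses `spark.executor.extraJavaOptions` (a sketch, the `/tmp/driver` and `/tmp/executor` paths are illustrative):
   
   ```shell
   spark-submit \
     --conf "spark.driver.extraJavaOptions=-Djava.io.tmpdir=/tmp/driver" \
     --conf "spark.executor.extraJavaOptions=-Djava.io.tmpdir=/tmp/executor" \
     # plus the usual application jar and arguments
   ```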
   
   However, I still hit the exception even though the folder and the temp files were present:
   ```
   System.getProperty("java.io.tmpdir"): /mnt1/yarn/usercache/hadoop/appcache/application_1632368478936_0025/container_1632368478936_0025_02_000021/tmp
   	at org.apache.iceberg.aws.s3.S3OutputFile.createOrOverwrite(S3OutputFile.java:61)
   	at org.apache.iceberg.parquet.ParquetIO$ParquetOutputFile.createOrOverwrite(ParquetIO.java:153)
   	at org.apache.iceberg.shaded.org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:293)
   	at org.apache.iceberg.shaded.org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:259)
   	at org.apache.iceberg.parquet.ParquetWriter.<init>(ParquetWriter.java:101)
   	at org.apache.iceberg.parquet.Parquet$WriteBuilder.build(Parquet.java:250)
   	at org.apache.iceberg.spark.source.SparkAppenderFactory.newAppender(SparkAppenderFactory.java:110)
   	at org.apache.iceberg.spark.source.SparkAppenderFactory.newDataWriter(SparkAppenderFactory.java:139)
   	at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.newWriter(BaseTaskWriter.java:310)
   	at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.newWriter(BaseTaskWriter.java:303)
   	at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.openCurrent(BaseTaskWriter.java:271)
   	at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.<init>(BaseTaskWriter.java:233)
   	at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.<init>(BaseTaskWriter.java:223)
   	at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.<init>(BaseTaskWriter.java:305)
   	at org.apache.iceberg.io.PartitionedWriter.write(PartitionedWriter.java:73)
   	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$7(WriteToDataSourceV2Exec.scala:441)
   	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
   	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:477)
   	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:385)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:127)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.io.IOException: No such file or directory
   	at java.io.UnixFileSystem.createFileExclusively(Native Method)
   	at java.io.File.createTempFile(File.java:2026)
   	at org.apache.iceberg.aws.s3.S3OutputStream.newStream(S3OutputStream.java:181)
   	at org.apache.iceberg.aws.s3.S3OutputStream.<init>(S3OutputStream.java:115)
   	at org.apache.iceberg.aws.s3.S3OutputFile.createOrOverwrite(S3OutputFile.java:58)
   	... 26 more
   ```
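   
   Per the trace, `S3OutputStream.newStream` stages data in a local temp file via `File.createTempFile`, which fails with `IOException: No such file or directory` when the target directory does not exist (e.g. a container `tmp` dir that was cleaned up). A minimal sketch of that failure mode, class and helper names are mine; I pass the directory explicitly because the JVM caches `java.io.tmpdir` after first use:
   
   ```java
   import java.io.File;
   import java.io.IOException;
   
   public class TmpDirRepro {
       // Try to create a temp file in the given directory; return the
       // IOException message on failure, or null on success.
       static String tryCreateTempFile(String dir) {
           try {
               File f = File.createTempFile("s3fileio-", ".tmp", new File(dir));
               f.deleteOnExit();
               return null;
           } catch (IOException e) {
               // On Unix this is "No such file or directory", thrown from
               // UnixFileSystem.createFileExclusively -- same as the trace.
               return e.getMessage();
           }
       }
   
       public static void main(String[] args) {
           // A directory that does not exist, mimicking a removed container tmp dir.
           String err = tryCreateTempFile("/tmp/does-not-exist-" + System.nanoTime());
           System.out.println("missing dir -> " + err);
       }
   }
   ```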
   
   Also, executor tasks running on the same node as the driver do not fail. I will continue the analysis.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


