Posted to issues@spark.apache.org by "pin_zhang (JIRA)" <ji...@apache.org> on 2018/08/21 07:26:00 UTC

[jira] [Created] (SPARK-25169) Multiple DataFrames cannot write to the same folder concurrently

pin_zhang created SPARK-25169:
---------------------------------

             Summary: Multiple DataFrames cannot write to the same folder concurrently
                 Key: SPARK-25169
                 URL: https://issues.apache.org/jira/browse/SPARK-25169
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.1
             Reporter: pin_zhang


It appears the DataFrame writer does not support concurrent writes to the same folder: multiple jobs appending to one target directory at the same time fail while committing their output.

Steps to reproduce:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext

val conf = new SparkConf().setAppName("concurrent-write-repro").setMaster("local[*]")
val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)
val source = "file:///G:/home/json"
val target = "file:///G:/home/oad"

// Start three threads that each append the same JSON input to the same target folder.
for (_ <- 1 to 3) {
  new Thread(new Runnable {
    override def run(): Unit = {
      // read.json is the Spark 2.x replacement for the removed jsonFile API
      hiveContext.read.json(source).write.mode(SaveMode.Append).json(target)
      Thread.sleep(1000L)
    }
  }).start()
}
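
For contrast, the same three writes should complete without error when they are serialized through a driver-side lock (a minimal sketch, assuming all writers share one driver JVM; the writeLock name is illustrative):

val writeLock = new Object
for (_ <- 1 to 3) {
  new Thread(new Runnable {
    override def run(): Unit = writeLock.synchronized {
      // Identical workload, but only one job commits to target at a time.
      hiveContext.read.json(source).write.mode(SaveMode.Append).json(target)
    }
  }).start()
}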

 

The concurrent version, by contrast, fails with exceptions such as:

java.io.FileNotFoundException: File file:/G:/home/oad/_temporary/0/task_20180821151921_0004_m_000001/.part-00001-463ee671-0ef0-42ff-8968-1d960bc87996-c000.json.crc does not exist
 at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
 at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
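
The failure pattern is consistent with how Hadoop's FileOutputCommitter stages output: every job writing to the target appears to use the same <target>/_temporary/0 staging directory, so whichever job commits first deletes _temporary out from under the still-running jobs. Until that is addressed, one possible workaround is to give each job a private staging directory and move the finished part files into the shared folder afterwards. A minimal sketch (the appendViaStaging helper and the staging-path naming are illustrative, not a Spark API):

import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical workaround: write each batch to a unique sibling directory,
// then move only the data files into the shared target folder.
def appendViaStaging(df: DataFrame, target: String): Unit = {
  val staging = target.stripSuffix("/") + "_staging_" + UUID.randomUUID()
  df.write.mode(SaveMode.Overwrite).json(staging)

  val stagingPath = new Path(staging)
  val targetPath = new Path(target)
  val fs = stagingPath.getFileSystem(new Configuration())
  fs.mkdirs(targetPath)
  // Part file names embed a UUID, so moving them into the target is unlikely to collide.
  fs.listStatus(stagingPath)
    .filter(_.getPath.getName.startsWith("part-"))
    .foreach(s => fs.rename(s.getPath, new Path(targetPath, s.getPath.getName)))
  fs.delete(stagingPath, true)
}

Note the moves are not atomic as a group, so a concurrent reader can still observe a partially appended batch; this narrows the race between writers but does not remove the need for external coordination.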

 

 



