Posted to user@spark.apache.org by Rishi Shah <ri...@gmail.com> on 2019/07/01 11:17:58 UTC

[pyspark 2.4.0] write with partitionBy fails due to file already exists

Hi All,

I have a simple partition write like below:

df = spark.read.parquet('read-location')
df.write.partitionBy('col1').mode('overwrite').parquet('write-location')

This fails after an hour with a "file already exists (in .staging directory)"
error. Not sure what I am doing wrong here.
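A hedged sketch of two common workarounds, not a confirmed diagnosis: "file already exists" errors under the .staging directory are often caused by speculative or retried task attempts trying to write the same output file. Disabling speculation and repartitioning on the partition column are frequently suggested mitigations. This assumes an existing SparkSession named spark; 'read-location', 'write-location', and 'col1' are the placeholders from the post above.

```python
# Assumption: duplicate task attempts (speculation or retries) are colliding
# in .staging. Turning speculation off removes one source of duplicates.
spark.conf.set('spark.speculation', 'false')

df = spark.read.parquet('read-location')

# Repartitioning on the partition column routes each output partition's rows
# to a single task, so far fewer files are open per attempt in .staging.
(df.repartition('col1')
   .write
   .partitionBy('col1')
   .mode('overwrite')
   .parquet('write-location'))
```

If the job still fails after an hour, checking the executor logs for the first failed task attempt (rather than the final "file already exists" symptom) usually reveals the underlying cause.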

-- 
Regards,

Rishi Shah