Posted to user@spark.apache.org by Dan Bikle <bi...@gmail.com> on 2016/09/23 22:45:24 UTC

With spark DataFrame, how to write to existing folder?

spark-world,

I am walking through the example here:

https://github.com/databricks/spark-csv#scala-api

The example complains if I try to write a DataFrame to an existing folder:

val selectedData = df.select("year", "model")
selectedData.write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .save("newcars.csv")

I used Google to look for the DataFrame.write() API documentation.

It sent me here:
http://spark.apache.org/docs/latest/sql-programming-guide.html#dataframe-data-readerwriter-interface

There I found this link:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrame@write:DataFrameWriter

That link returns a 404 error.

Question:
How can I change this call so that it overwrites the existing folder instead of failing?

selectedData.write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .save("newcars.csv")

Re: With spark DataFrame, how to write to existing folder?

Posted by Yong Zhang <ja...@hotmail.com>.
df.write.format(source).mode("overwrite").save(path)
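
Applied to the call from the question, that one-liner might look like the sketch below. It assumes a Spark 2.x DataFrame named df with "year" and "model" columns and the spark-csv package on the classpath. The helper name writeCarsOverwrite is hypothetical and only wraps the call for illustration. SaveMode.Overwrite is the typed equivalent of the string "overwrite".

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper, not part of the thread: writes two columns as CSV and
// replaces whatever already exists at `path` instead of failing.
def writeCarsOverwrite(df: DataFrame, path: String): Unit = {
  val selectedData = df.select("year", "model")
  selectedData.write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .mode(SaveMode.Overwrite) // same effect as .mode("overwrite")
    .save(path)
}
```

Calling writeCarsOverwrite(df, "newcars.csv") a second time should then replace the folder rather than raise an error. The default save mode is "error", which is why the original call fails when the folder exists; "append" and "ignore" are the other accepted values.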


Yong

