Posted to user@spark.apache.org by SRK <sw...@gmail.com> on 2019/02/27 19:36:05 UTC

Issue with file names writeStream in Structured Streaming

Hi,

We are using something like the following to write data to files in
Structured Streaming, and the file names come out as part-* as mentioned in
https://stackoverflow.com/questions/51056764/how-to-define-a-spark-structured-streaming-file-sink-file-path-or-file-name.

How can we get file names of our choice for each row in the DataFrame? Say
/day/month/id/log.txt?


df.writeStream
  .format("parquet") // can be "orc", "json", "csv", etc.
  .option("path", "/path/to/save/")
  .option("checkpointLocation", "/path/to/checkpoint/") // required by the file sink
  .partitionBy("year", "month", "day", "hour")
  .start()
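
For reference, a rough foreachBatch sketch (the schema, column names, and
paths below are placeholders, not our real job): foreachBatch hands each
micro-batch to a regular batch DataFrameWriter, so the directory layout can
at least include id, but the files inside each directory are still
Spark-generated part-* files rather than names chosen per row.

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("WriteStreamSketch").getOrCreate()

// Placeholder schema -- streaming file sources need one declared up front.
val schema = StructType(Seq(
  StructField("id", StringType),
  StructField("year", IntegerType),
  StructField("month", IntegerType),
  StructField("day", IntegerType),
  StructField("value", StringType)
))

val df = spark.readStream
  .schema(schema)
  .json("/path/to/input/") // placeholder source

// Each micro-batch is written with the normal batch writer, so partitionBy
// can include id; file names within each partition directory are still part-*.
def writeBatch(batch: DataFrame, batchId: Long): Unit =
  batch.write
    .mode("append")
    .partitionBy("year", "month", "day", "id")
    .parquet("/path/to/save/")

df.writeStream
  .foreachBatch(writeBatch _)
  .option("checkpointLocation", "/path/to/checkpoint/")
  .start()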

Thanks for the help!!!



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Issue with file names writeStream in Structured Streaming

Posted by Gourav Sengupta <go...@gmail.com>.
Should that not cause more problems?

Regards,
Gourav Sengupta

On Wed, Feb 27, 2019 at 7:36 PM SRK <sw...@gmail.com> wrote:

> We are using something like the following to write data to files in
> Structured Streaming, and the file names come out as part-* ...
>
> How can we get file names of our choice for each row in the DataFrame? Say
> /day/month/id/log.txt?