Posted to issues@spark.apache.org by "ShivaKumar SS (Jira)" <ji...@apache.org> on 2019/11/25 09:21:00 UTC
[jira] [Updated] (SPARK-30023) Spark partitionby saves as
columnName={value} | Can it be only columnvalue
[ https://issues.apache.org/jira/browse/SPARK-30023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ShivaKumar SS updated SPARK-30023:
----------------------------------
Description:
I am using Scala and Spark.
I have a DataFrame with columns named "year", "month", and "day", plus many other columns that are not relevant here.
Code snippet:
{{df.write.partitionBy("year", "month", "day").format("csv").option("header", "true").save(outPath)}}
My expectation is that the files are saved in a hierarchical folder structure:
{{2016/11/15/file.csv}}
but they are actually saved as
{{year=2016/month=11/day=15/file.csv}}
Is there any way to remove the column name from the directory structure and keep only the column value?
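For illustration only, here is one possible workaround, sketched under stated assumptions. As far as I know, {{partitionBy}} always writes Hive-style {{column=value}} directories, so this sketch renames the directories after the write has finished. It is not a Spark API: the names {{StripPartitionNames}}, {{valueOnly}} and {{stripNames}} are hypothetical, and it assumes {{outPath}} is on a local filesystem (for HDFS or an object store the same idea would have to go through the Hadoop FileSystem API instead).

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical helper, not part of Spark: strips the "column=" prefix
// from partition directories after df.write...save(outPath) has completed.
object StripPartitionNames {

  /** "year=2016" becomes "2016"; names without a usable '=' are returned unchanged. */
  def valueOnly(dirName: String): String = {
    val i = dirName.indexOf('=')
    if (i >= 0 && i < dirName.length - 1) dirName.substring(i + 1) else dirName
  }

  /** Recursively renames every `column=value` directory under `root` to `value`. */
  def stripNames(root: File): Unit = {
    // listFiles() returns null when `root` is not a readable directory.
    val dirs = Option(root.listFiles()).getOrElse(Array.empty[File]).filter(_.isDirectory)
    dirs.foreach { dir =>
      stripNames(dir) // handle the deeper levels before renaming this one
      val target = new File(dir.getParentFile, valueOnly(dir.getName))
      if (target != dir) Files.move(dir.toPath, target.toPath)
    }
  }
}
```

It would be called as {{StripPartitionNames.stripNames(new File(outPath))}} once the write has completed. Two caveats: the partition columns are stored only in the directory names and not in the CSV files, and after the rename Spark's partition discovery can no longer recover "year", "month" and "day" as columns when reading {{outPath}} back.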
> Spark partitionby saves as columnName={value} | Can it be only columnvalue
> --------------------------------------------------------------------------
>
> Key: SPARK-30023
> URL: https://issues.apache.org/jira/browse/SPARK-30023
> Project: Spark
> Issue Type: Question
> Components: Spark Core, SQL
> Affects Versions: 2.4.3
> Reporter: ShivaKumar SS
> Priority: Major
>