Posted to issues@spark.apache.org by "ShivaKumar SS (Jira)" <ji...@apache.org> on 2019/11/25 09:21:00 UTC
[jira] [Updated] (SPARK-30023) Spark partitionby saves as columnName={value} | Can it be only columnvalue

     [ https://issues.apache.org/jira/browse/SPARK-30023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ShivaKumar SS updated SPARK-30023:
----------------------------------
    Description: 
I am using Scala and Spark.

I have a DataFrame with columns named "year", "month", and "day", along with many other columns that are not relevant here.

 

Code snippet:

 {{df.write.partitionBy("year", "month", "day").format("csv").option("header", "true").save(outPath)}}
  

My expectation is that the output is saved in a hierarchical folder structure:

 {{2016/11/15/file.csv}}

but the files are being saved as:

 {{year=2016/month=11/day=15/file.csv}}

Is there any way to remove the column name from the directory structure and keep only the column value?
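
For illustration only, here is a minimal sketch of the path mapping being asked for, turning a Hive-style {{column=value}} directory path into a value-only path. The object {{PartitionPaths}} and the function {{stripColumnNames}} are hypothetical names introduced here and are not part of Spark. The sketch assumes that only directory segments carry the {{column=}} prefix and that the last segment is the file name, which is left untouched.

```scala
// Hypothetical helper, not part of Spark: maps a relative path such as
// "year=2016/month=11/day=15/file.csv" to "2016/11/15/file.csv".
object PartitionPaths {

  // Drops everything up to and including the first '=' in a directory segment.
  // Segments without '=' are returned unchanged.
  private def stripPrefix(segment: String): String = {
    val idx = segment.indexOf('=')
    if (idx >= 0) segment.substring(idx + 1) else segment
  }

  def stripColumnNames(relativePath: String): String = {
    val segments = relativePath.split("/")
    if (segments.isEmpty) relativePath
    else {
      // Assumption: every segment except the last is a partition directory;
      // the last segment is the file name and is kept as-is.
      val dirs = segments.init.map(stripPrefix)
      (dirs :+ segments.last).mkString("/")
    }
  }
}
```

A mapping like this could, for example, be applied in a separate renaming step after the write has finished (say, through the Hadoop {{FileSystem}} API). Note that Spark's partition discovery relies on the {{column=value}} directory names, so data laid out as {{2016/11/15/}} would presumably no longer expose "year", "month", and "day" as partition columns when read back.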

 

 


> Spark partitionby saves as columnName={value} | Can it be only columnvalue
> --------------------------------------------------------------------------
>
>                 Key: SPARK-30023
>                 URL: https://issues.apache.org/jira/browse/SPARK-30023
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core, SQL
>    Affects Versions: 2.4.3
>            Reporter: ShivaKumar SS
>            Priority: Major


