Posted to user@spark.apache.org by Nirav Patel <np...@xactlycorp.com> on 2017/03/14 00:20:03 UTC

DataFrameWriter - Where to find list of Options applicable to particular format(datasource)

Hi,

Is there a document listing the available options for each datasource
(csv, tsv, parquet, json, avro)? I need the one for csv, to look up
"ignoreLeadingWhiteSpace" and "ignoreTrailingWhiteSpace".

Thanks

-- 


[image: What's New with Xactly] <http://www.xactlycorp.com/email-click/>

<https://www.nyse.com/quote/XNYS:XTLY>  [image: LinkedIn] 
<https://www.linkedin.com/company/xactly-corporation>  [image: Twitter] 
<https://twitter.com/Xactly>  [image: Facebook] 
<https://www.facebook.com/XactlyCorp>  [image: YouTube] 
<http://www.youtube.com/xactlycorporation>

Re: DataFrameWriter - Where to find list of Options applicable to particular format(datasource)

Posted by Nirav Patel <np...@xactlycorp.com>.
Thanks, Kwon. The goal is to preserve whitespace: in general, data should
not be altered on write, or any alteration should be controlled by
user-provided options. The current behavior is causing our downstream jobs
to fail.
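
For readers landing on this thread: a minimal sketch of what "preserve whitespace via user-provided options" could look like on the write side. It assumes a Spark release whose CSV writer accepts "ignoreLeadingWhiteSpace" and "ignoreTrailingWhiteSpace"; per the reply in this thread that was not yet the case when this was posted (SPARK-18579 tracks it), so check the DataFrameWriter docs for your version. The sample data, object name, and output path are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object PreserveWhitespaceOnWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("preserve-whitespace-on-write")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows with leading and trailing spaces (illustrative only).
    val df = Seq(("  alice  ", 1), (" bob", 2)).toDF("name", "id")

    // Hypothetical output location.
    val outputPath = "/tmp/preserve_whitespace_csv"

    df.write
      .mode("overwrite")
      .option("header", "true")
      // Assumption: the writer in your Spark version supports these options.
      // Setting both to "false" asks it to leave surrounding whitespace as-is.
      .option("ignoreLeadingWhiteSpace", "false")
      .option("ignoreTrailingWhiteSpace", "false")
      .csv(outputPath)

    spark.stop()
  }
}
```

If the writer in your version does not recognize these options, setting them may simply have no effect, so it is worth inspecting the written files before relying on them.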


On Mon, Mar 13, 2017 at 7:23 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> Hi, all the options are documented in
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter
>
> It seems neither option is available for writing. If the goal is to trim
> whitespace, I think we could do this with DataFrame operations (as we
> discussed in the JIRA - https://issues.apache.org/jira/browse/SPARK-18579).
>
>
>
> 2017-03-14 9:20 GMT+09:00 Nirav Patel <np...@xactlycorp.com>:
>
>> Hi,
>>
>> Is there a document listing the available options for each datasource
>> (csv, tsv, parquet, json, avro)? I need the one for csv, to look up
>> "ignoreLeadingWhiteSpace" and "ignoreTrailingWhiteSpace".
>>
>> Thanks


Re: DataFrameWriter - Where to find list of Options applicable to particular format(datasource)

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi, all the options are documented in
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter

It seems neither option is available for writing. If the goal is to trim
whitespace, I think we could do this with DataFrame operations (as we
discussed in the JIRA - https://issues.apache.org/jira/browse/SPARK-18579).
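
A minimal sketch of the "do it within DataFrame operations" suggestion, for the case where trimming is what you want: trim the string columns explicitly before writing, rather than relying on writer options. The sample data, object name, and output path are hypothetical; `trim` and `col` are from `org.apache.spark.sql.functions`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, trim}
import org.apache.spark.sql.types.StringType

object TrimBeforeWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("trim-before-write")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows with leading and trailing spaces (illustrative only).
    val df = Seq(("  alice  ", 1), (" bob", 2)).toDF("name", "id")

    // Apply trim() to every top-level string column; other columns are untouched.
    val trimmed = df.schema.fields
      .filter(_.dataType == StringType)
      .foldLeft(df)((acc, field) => acc.withColumn(field.name, trim(col(field.name))))

    // Hypothetical output location.
    trimmed.write
      .mode("overwrite")
      .option("header", "true")
      .csv("/tmp/trimmed_csv")

    spark.stop()
  }
}
```

Doing the trimming in the DataFrame makes the change to the data explicit in the job itself, so it does not depend on what a given writer does by default.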



2017-03-14 9:20 GMT+09:00 Nirav Patel <np...@xactlycorp.com>:

> Hi,
>
> Is there a document listing the available options for each datasource
> (csv, tsv, parquet, json, avro)? I need the one for csv, to look up
> "ignoreLeadingWhiteSpace" and "ignoreTrailingWhiteSpace".
>
> Thanks