
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/15 02:07:51 UTC

[GitHub] [spark] stczwd commented on a change in pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

stczwd commented on a change in pull request #29088:
URL: https://github.com/apache/spark/pull/29088#discussion_r454746965



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CsvOutputWriter.scala
##########
@@ -39,6 +39,10 @@ class CsvOutputWriter(
 
   private val gen = new UnivocityGenerator(dataSchema, writer, params)
 
+  if (params.bom) {
+    writer.write(0xFEFF)

Review comment:
       We ran into the same problem in our project, where we use `0xEFBBBF` as the default BOM for UTF-8; the value would change if we used `0xFEFF` instead. We have also hit other problems, such as commas being handled incorrectly or quotation marks not being displayed properly. If we fix these problems, other users will no longer be able to read these files with other tools.
   
   What I am trying to say is that it may not be a good idea to change CsvOutputWriter to fit the Excel format. This can be handled in the project before the user downloads the CSV files, or by simply using Excel's import feature.
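
For reference when comparing the two values, here is a minimal sketch (not part of the PR) of what bytes a `java.io.Writer` emits for the code point `0xFEFF` under different charsets. It assumes the writer is an `OutputStreamWriter` over the target charset; the object and method names (`BomEncodingSketch`, `encode`, `hex`) are hypothetical, and whether `CsvOutputWriter`'s writer behaves the same way depends on how it is constructed.

```scala
import java.io.{ByteArrayOutputStream, OutputStreamWriter}
import java.nio.charset.{Charset, StandardCharsets}

// Hypothetical helper for illustration only; not code from the pull request.
object BomEncodingSketch {

  /** Writes one character through a Writer and returns the encoded bytes. */
  def encode(codePoint: Int, charset: Charset): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val writer = new OutputStreamWriter(out, charset)
    // Writer.write(int) writes a single character taken from the low 16 bits.
    writer.write(codePoint)
    writer.close()
    out.toByteArray
  }

  /** Formats bytes as upper-case hex pairs separated by spaces. */
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xFF}%02X").mkString(" ")

  def main(args: Array[String]): Unit = {
    // The character U+FEFF encoded as UTF-8.
    println(hex(encode(0xFEFF, StandardCharsets.UTF_8)))
    // The same character encoded as big-endian UTF-16, for comparison.
    println(hex(encode(0xFEFF, StandardCharsets.UTF_16BE)))
  }
}
```

Under these assumptions the UTF-8 line prints `EF BB BF` and the UTF-16BE line prints `FE FF`, so the bytes on disk depend on the writer's charset rather than on the literal passed to `write`.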






