Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/13 16:41:55 UTC

[GitHub] [spark] MaxGekk commented on a change in pull request #34853: [SPARK-37575][SQL] null values should be saved as nothing rather than quoted empty Strings "" by default settings

MaxGekk commented on a change in pull request #34853:
URL: https://github.com/apache/spark/pull/34853#discussion_r767903506



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
##########
@@ -805,6 +805,22 @@ abstract class CSVSuite
     }
   }
 
+  test("SPARK-37575: null values should not reflect to any characters by default") {

Review comment:
       Could you make the test's title clearer, like the PR's title, please? As written, it might confuse other devs. Does null reflect to any character on write right now?

##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
##########
@@ -805,6 +805,22 @@ abstract class CSVSuite
     }
   }
 
+  test("SPARK-37575: null values should not reflect to any characters by default") {
+    val litNull: String = null
+    val data = Seq(("Tesla", litNull, ""))
+    withTempPath { path =>
+      val csvDir = new File(path, "csv")
+      val cars = data.toDF("make", "comment", "blank")
+      cars.coalesce(1).write.csv(csvDir.getCanonicalPath)
+
+      csvDir.listFiles().filter(_.getName.endsWith("csv")).foreach({ csvFile =>
+        val readBack = Files.readAllBytes(csvFile.toPath)
+        val expected = ("Tesla,,\"\"" + Properties.lineSeparator).getBytes()
+        assert(readBack === expected)
+      })
+    }

Review comment:
       Let's read the written text back with Spark:
   ```scala
       withTempPath { path =>
         Seq(("Tesla", null: String, "")).toDS().repartition(1).write.csv(path.getCanonicalPath)
         checkAnswer(spark.read.text(path.getCanonicalPath), Row("Tesla,,\"\""))
       }
   ```
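
For reference, the line this test expects (`Tesla,,""`) can be modelled in plain Scala without Spark. This is only a toy sketch of the intended mapping, not Spark's CSV writer: `CsvNullSketch`, `renderField`, and `renderLine` are hypothetical names, and `Option` stands in for a nullable column value.

```scala
// Toy model of the expected output line; NOT Spark's actual CSV writer.
object CsvNullSketch {
  // Hypothetical helper: None stands in for a SQL null.
  def renderField(value: Option[String]): String = value match {
    case None     => ""      // null: nothing is written between the delimiters
    case Some("") => "\"\""  // empty string: written as a quoted empty string
    case Some(s)  => s       // other values: no quoting or escaping in this sketch
  }

  // Joins the rendered fields with the default comma delimiter.
  def renderLine(fields: Seq[Option[String]]): String =
    fields.map(renderField).mkString(",")
}
```

With the row from the test, `CsvNullSketch.renderLine(Seq("Tesla", null: String, "").map(Option(_)))` gives the same text as the expected `Row` above, so a null and an empty string stay distinguishable in the written file.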

##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
##########
@@ -805,6 +805,22 @@ abstract class CSVSuite
     }
   }
 
+  test("SPARK-37575: null values should not reflect to any characters by default") {
+    val litNull: String = null

Review comment:
       `litNull` is used only once. Let's inline it.
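
A minimal sketch of the inlining, in plain Scala with no Spark involved. `InlineNullSketch`, `viaVal`, and `inlined` are hypothetical names used only for illustration. The type ascription `null: String` keeps the static type of the middle element as `String`; a bare `null` in that position would be inferred as `Null` instead.

```scala
object InlineNullSketch {
  // Named value, as in the current version of the test.
  val litNull: String = null
  val viaVal: (String, String, String) = ("Tesla", litNull, "")

  // Inlined: the ascription `null: String` gives the element static type String,
  // so the tuple is still a (String, String, String).
  val inlined: (String, String, String) = ("Tesla", null: String, "")
}
```

Both values are equal, so the extra `val` adds nothing beyond the type annotation.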




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


