Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/16 00:05:03 UTC

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29765: [SPARK-32888][DOCS] Add user document about header flag and RDD as path for reading CSV

HyukjinKwon commented on a change in pull request #29765:
URL: https://github.com/apache/spark/pull/29765#discussion_r489084356



##########
File path: python/pyspark/sql/readwriter.py
##########
@@ -429,7 +429,8 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
         :param comment: sets a single character used for skipping lines beginning with this
                         character. By default (None), it is disabled.
         :param header: uses the first line as names of columns. If None is set, it uses the
-                       default value, ``false``.
+                       default value, ``false``. Note that if the given path is an RDD of strings,
+                       this header option will remove all lines identical to the header, if any exist.

Review comment:
       Here too. I would use `.. note::` under this parameter.
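
For reference, a hypothetical sketch of how the suggested `.. note::` directive could sit under the `header` parameter description. The trimmed signature and the exact wording below are illustrative assumptions, not the patch under review:

```python
# Hypothetical sketch only -- not the actual patch. The shortened signature and
# note wording are assumptions to show where the ``.. note::`` directive lands.
def csv(self, path, schema=None, sep=None, encoding=None, quote=None,
        escape=None, comment=None, header=None):
    """Loads a CSV file and returns the result as a :class:`DataFrame`.

    :param header: uses the first line as names of columns. If None is set, it uses the
                   default value, ``false``.

        .. note:: if the given ``path`` is an RDD of strings, this header option
            removes all lines identical to the header.
    """
```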

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##########
@@ -600,6 +600,9 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * If the enforceSchema is set to `false`, only the CSV header in the first line is checked
    * to conform specified or inferred schema.
    *
+   * Note that if the `header` option is set to `true` when calling this API, all lines same with

Review comment:
       I would just `@note`
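
Both hunks document the same runtime behavior, which may be easier to see with a small PySpark sketch. The session setup and sample data below are illustrative assumptions, not part of the patch:

```python
# Minimal sketch of the documented behavior, assuming a local SparkSession:
# reading CSV from an RDD of strings with the header option enabled drops
# every line equal to the header, not only the first one.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").appName("csv-header-demo").getOrCreate()

lines = spark.sparkContext.parallelize([
    "name,age",   # header line
    "alice,30",
    "name,age",   # a second line identical to the header (e.g. concatenated files)
    "bob,25",
])

# All lines matching the header are filtered out, leaving only the two data rows.
df = spark.read.csv(lines, header=True)
df.show()
```

With `header=True`, both `name,age` lines above are treated as headers and removed, which is the behavior the added documentation is calling out.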



