Posted to reviews@spark.apache.org by "ted-jenks (via GitHub)" <gi...@apache.org> on 2023/02/07 10:42:56 UTC

[GitHub] [spark] ted-jenks commented on a diff in pull request #39926: [WIP][SQL] Remove repeated function in CSVExprUtils

ted-jenks commented on code in PR #39926:
URL: https://github.com/apache/spark/pull/39926#discussion_r1098487401


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala:
##########
@@ -35,22 +35,11 @@ object CSVExprUtils {
     }
   }
 
-  def skipComments(iter: Iterator[String], options: CSVOptions): Iterator[String] = {
-    if (options.isCommentSet) {
-      val commentPrefix = options.comment.toString
-      iter.dropWhile { line =>

Review Comment:
   Could you help me understand why the behavior would change when switching the functions? `skipComments` is only used in `extractHeader(iter: Iterator[String], options: CSVOptions)`, where it makes no difference whether `filter` or `dropWhile` is used. Is this an optimisation?
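
   To make the question concrete, here is a minimal sketch of how the two approaches differ on a plain `Iterator[String]`. It assumes the predicate skips blank lines and lines starting with the comment prefix. `isCommentOrBlank`, `withDropWhile`, `withFilter` and the sample lines are illustrative names only, not Spark's API:

```scala
object SkipCommentsSketch {
  // Hypothetical predicate: blank lines and lines starting with the
  // comment prefix count as skippable (an assumption for illustration).
  def isCommentOrBlank(line: String, commentPrefix: String): Boolean =
    line.trim.isEmpty || line.trim.startsWith(commentPrefix)

  // dropWhile removes only the leading run of skippable lines.
  def withDropWhile(lines: Iterator[String], commentPrefix: String): Iterator[String] =
    lines.dropWhile(line => isCommentOrBlank(line, commentPrefix))

  // filterNot removes every skippable line, wherever it appears.
  def withFilter(lines: Iterator[String], commentPrefix: String): Iterator[String] =
    lines.filterNot(line => isCommentOrBlank(line, commentPrefix))

  def main(args: Array[String]): Unit = {
    val sample = Seq("# leading comment", "", "id,name", "# interior comment", "1,alice")

    val dropped = withDropWhile(sample.iterator, "#").toList
    val filtered = withFilter(sample.iterator, "#").toList

    // The first surviving line is the same either way.
    println(dropped.headOption == filtered.headOption)
    // The full results differ only when a comment follows the first data line.
    println(dropped)
    println(filtered)
  }
}
```

   If the caller only reads the first remaining line, as a header extraction would, both versions return the same value. They diverge only for callers that consume the rest of the iterator.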



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org