Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2018/12/14 07:23:19 UTC

[GitHub] HyukjinKwon commented on a change in pull request #23288: [SPARK-26339][SQL]Throws better exception when reading files that start with underscore

URL: https://github.com/apache/spark/pull/23288#discussion_r241662133
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ##########
 @@ -554,8 +554,13 @@ case class DataSource(
 
       // Sufficient to check head of the globPath seq for non-glob scenario
       // Don't need to check once again if files exist in streaming mode
-      if (checkFilesExist && !fs.exists(globPath.head)) {
-        throw new AnalysisException(s"Path does not exist: ${globPath.head}")
+      if (checkFilesExist) {
+        val firstPath = globPath.head
+        if (!fs.exists(firstPath)) {
+          throw new AnalysisException(s"Path does not exist: ${firstPath}")
+        } else if (InMemoryFileIndex.shouldFilterOut(firstPath.getName)) {
+          throw new AnalysisException(s"Path exists but is ignored: ${firstPath}")
 
 Review comment:
   One thing I'm not sure about, though: it's going to throw an exception for, for instance,
   
   ```
   spark.read.text("_text.txt").show()
   ```
   
   instead of returning an empty DataFrame, which is kind of a behaviour change. Also, it doesn't check children recursively.
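   For context, here is a minimal sketch of the kind of name-based filtering being discussed. This is a hypothetical simplification, not the actual `InMemoryFileIndex.shouldFilterOut` implementation in Spark, which handles additional cases:
   
   ```scala
   // Simplified sketch of a hidden-file predicate like the one referenced
   // above. The real InMemoryFileIndex.shouldFilterOut differs in detail.
   object PathFilterSketch {
     // Returns true when a file name would be treated as hidden/metadata
     // and skipped during file listing.
     def shouldFilterOut(pathName: String): Boolean =
       pathName.startsWith("_") || pathName.startsWith(".")
   
     def main(args: Array[String]): Unit = {
       assert(shouldFilterOut("_text.txt"))  // underscore-prefixed: skipped
       assert(shouldFilterOut(".hidden"))    // dot-prefixed: skipped
       assert(!shouldFilterOut("data.txt"))  // regular file: listed
       println("ok")
     }
   }
   ```
   
   Under the proposed change, a top-level path matching such a predicate would raise an `AnalysisException` rather than being silently listed out, which is the behaviour change noted above.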
   

