Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2017/07/09 23:03:03 UTC

[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/18580

    [SPARK-21354] [SQL] INPUT FILE related functions do not support more than one sources

    ### What changes were proposed in this pull request?
    The built-in functions `input_file_name`, `input_file_block_start`, and `input_file_block_length` do not support more than one source, which matches Hive's behavior. Currently, Spark does not block such queries, and the output is ambiguous/non-deterministic: the value could come from either side.
    
    ```
    hive> select *, INPUT__FILE__NAME FROM t1, t2;
    FAILED: SemanticException Column INPUT__FILE__NAME Found in more than One Tables/Subqueries
    ```
    
    This PR blocks it and issues an error. 
    
    ### How was this patch tested?
    Added a test case

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark inputFileName

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18580.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18580
    
----
commit 04ff406c5a82f9454419d0f0054b1aa75ea97aa0
Author: gatorsmile <ga...@gmail.com>
Date:   2017-07-09T22:52:55Z

    fix.

commit 596ea17bc99a703004ab7bef657603a9db57d5f2
Author: gatorsmile <ga...@gmail.com>
Date:   2017-07-09T23:02:36Z

    fix.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Does Hive report an error?




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    In the case of `UNION ALL`, everything looks normal. So, blocking it would be a regression.




[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18580#discussion_r126323442
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -530,6 +530,20 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
         )
       }
     
    +  test("input_file_name, input_file_block_start, input_file_block_length - more than one sources") {
    +    withTable("tab1", "tab2") {
    +      val data = sparkContext.parallelize(0 to 10).toDF("id")
    +      data.write.saveAsTable("tab1")
    +      data.write.saveAsTable("tab2")
    +      Seq("input_file_name", "input_file_block_start", "input_file_block_length").foreach { func =>
    +        val e = intercept[AnalysisException] {
    +          sql(s"SELECT *, $func() FROM tab1 JOIN tab2 ON tab1.id = tab2.id")
    +        }.getMessage
    +        assert(e.contains(s"'$func' does not support more than one sources"))
    --- End diff --
    
    ditto.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    cc @cloud-fan 




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79651/testReport)** for PR 18580 at commit [`c4de2b8`](https://github.com/apache/spark/commit/c4de2b8e2583c55f1b761569050d2c21506c2291).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79426/testReport)** for PR 18580 at commit [`596ea17`](https://github.com/apache/spark/commit/596ea17bc99a703004ab7bef657603a9db57d5f2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18580#discussion_r126599149
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -74,6 +74,15 @@ trait CheckAnalysis extends PredicateHelper {
         }
       }
     
    +  private def getNumInputFileBlockSources(operator: LogicalPlan): Int = {
    +    operator match {
    +      case _: LeafNode => 1
    --- End diff --
    
    Shall we consider only file data source leaf nodes?
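
The question above can be illustrated with a minimal, self-contained sketch. It does not use Spark's classes: `Plan`, `FileScan`, `LocalData`, `Project`, `Join`, `Union`, `numFileSources`, and `isAmbiguous` are all hypothetical names for a toy plan tree. The sketch assumes that only file-backed leaves count as sources, and that a union contributes at most as many sources as its widest branch, because each output row of a union comes from exactly one branch.

```scala
// Toy stand-in for a logical plan tree; none of these types are Spark's.
object InputFileSourceCount {
  sealed trait Plan { def children: Seq[Plan] }

  // A file-backed relation: the only leaf that can feed input_file_name().
  final case class FileScan(table: String) extends Plan {
    val children: Seq[Plan] = Nil
  }

  // A leaf with no underlying file, e.g. an in-memory relation.
  final case class LocalData(name: String) extends Plan {
    val children: Seq[Plan] = Nil
  }

  final case class Project(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }

  final case class Join(left: Plan, right: Plan) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
  }

  final case class Union(children: Seq[Plan]) extends Plan

  // Number of file sources that could contribute to a single output row.
  def numFileSources(plan: Plan): Int = plan match {
    case _: FileScan  => 1
    case _: LocalData => 0
    // A union row comes from one branch only, so take the widest branch.
    case u: Union     => u.children.map(numFileSources).foldLeft(0)((a, b) => math.max(a, b))
    // Any other operator combines its children, so their counts add up.
    case other        => other.children.map(numFileSources).sum
  }

  // A file-name function is ambiguous when more than one source feeds a row.
  def isAmbiguous(plan: Plan): Boolean = numFileSources(plan) > 1
}
```

Under these assumptions, a join of two file scans is flagged, while a union of the same two scans, or a join of one file scan with an in-memory relation, is not.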




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79426/testReport)** for PR 18580 at commit [`596ea17`](https://github.com/apache/spark/commit/596ea17bc99a703004ab7bef657603a9db57d5f2).




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    thanks, merging to master!




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79426/
    Test PASSed.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79651/
    Test PASSed.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Retest this please.




[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18580#discussion_r127621186
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -74,6 +74,15 @@ trait CheckAnalysis extends PredicateHelper {
         }
       }
     
    +  private def getNumInputFileBlockSources(operator: LogicalPlan): Int = {
    +    operator match {
    +      case _: LeafNode => 1
    --- End diff --
    
    We are unable to check it in `CheckAnalysis`. Both `HadoopRDD` and `FileScanRDD` have the same issue. To block both, we need to add the check as a separate rule.





[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18580#discussion_r126323436
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -100,6 +104,10 @@ trait CheckAnalysis extends PredicateHelper {
                 failAnalysis(
                   s"invalid cast from ${c.child.dataType.simpleString} to ${c.dataType.simpleString}")
     
    +          case e @ (_: InputFileName | _: InputFileBlockLength | _: InputFileBlockStart)
    +              if getNumLeafNodes(operator) > 1 =>
    +            e.failAnalysis(s"'${e.prettyName}' does not support more than one sources")
    --- End diff --
    
    nit: `one sources` -> `one source`.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    For union, does Hive output an error?




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    The following is the output of the current Spark.
    ```scala
    scala> spark.range(10).write.saveAsTable("t1")
    
    scala> spark.range(100,110).write.saveAsTable("t2")
    
    scala> sql("select *, input_file_name() from t1").show(false)
    +---+-------------------------------------------------------------------------------------------------------------------+
    |id |input_file_name()                                                                                                  |
    +---+-------------------------------------------------------------------------------------------------------------------+
    |3  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00003-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |4  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00003-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |8  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00007-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |9  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00007-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |0  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00000-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |1  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00001-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |2  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00002-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |5  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00004-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |6  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00005-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |7  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00006-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    +---+-------------------------------------------------------------------------------------------------------------------+
    
    
    scala> sql("select *, input_file_name() from t2").show(false)
    +---+-------------------------------------------------------------------------------------------------------------------+
    |id |input_file_name()                                                                                                  |
    +---+-------------------------------------------------------------------------------------------------------------------+
    |103|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00003-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |104|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00003-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |108|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00007-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |109|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00007-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |100|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00000-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |101|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00001-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |102|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00002-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |105|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00004-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |106|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00005-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |107|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00006-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    +---+-------------------------------------------------------------------------------------------------------------------+
    
    
    scala> sql("select *, input_file_name() from ((select * from t1) union all (select * from t2))").show(false)
    +---+-------------------------------------------------------------------------------------------------------------------+
    |id |input_file_name()                                                                                                  |
    +---+-------------------------------------------------------------------------------------------------------------------+
    |3  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00003-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |4  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00003-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |8  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00007-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |9  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00007-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |0  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00000-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |1  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00001-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |2  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00002-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |5  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00004-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |6  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00005-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |7  |file:///Users/dongjoon/spark/spark-warehouse/t1/part-00006-b0ca8fa4-03ae-4e3a-b4b4-a13d601cd155-c000.snappy.parquet|
    |103|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00003-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |104|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00003-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |108|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00007-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |109|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00007-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |100|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00000-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |101|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00001-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |102|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00002-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |105|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00004-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |106|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00005-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    |107|file:///Users/dongjoon/spark/spark-warehouse/t2/part-00006-76ea547d-0187-40f0-b5dd-f9f1fffeeabf-c000.snappy.parquet|
    +---+-------------------------------------------------------------------------------------------------------------------+
    ```
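
The difference between the `UNION ALL` output above and a join can be sketched with a toy model. This is not Spark code: `Row`, `unionAll`, and `join` are hypothetical helpers, and the file names used with them are made up for illustration. The sketch assumes each input row remembers the single file it was read from.

```scala
object RowProvenance {
  // A row together with the file it was read from (hypothetical model).
  final case class Row(id: Long, file: String)

  // UNION ALL concatenates rows, so every output row keeps exactly one file.
  def unionAll(left: Seq[Row], right: Seq[Row]): Seq[Row] = left ++ right

  // A joined row is built from one row of each side, so it has two
  // candidate files and no single well-defined input file name.
  def join(left: Seq[Row], right: Seq[Row]): Seq[(Long, Set[String])] =
    for {
      l <- left
      r <- right
      if l.id == r.id
    } yield (l.id, Set(l.file, r.file))
}
```

In this model every row returned by `unionAll` still carries one file, which is consistent with the shell output above, whereas every row returned by `join` carries two candidates when the sides come from different files.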




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    This is the error.
    ```
    hive> select *, INPUT__FILE__NAME from (select * from t1) T;
    FAILED: SemanticException [Error 10004]: Line 1:10 Invalid table alias or column reference 'INPUT__FILE__NAME': (possible column names are: _c0)
    ```




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    LGTM, pending jenkins




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79651/testReport)** for PR 18580 at commit [`c4de2b8`](https://github.com/apache/spark/commit/c4de2b8e2583c55f1b761569050d2c21506c2291).




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79576 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79576/testReport)** for PR 18580 at commit [`6b48a9e`](https://github.com/apache/spark/commit/6b48a9e52ded62715b32aef4ee31b121d3e7aee9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79576 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79576/testReport)** for PR 18580 at commit [`6b48a9e`](https://github.com/apache/spark/commit/6b48a9e52ded62715b32aef4ee31b121d3e7aee9).




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    It was the same error: `Invalid table alias or column reference 'INPUT__FILE__NAME': (possible column names are: _c0)`. So I reduced it to the simple example above.




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79496 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79496/testReport)** for PR 18580 at commit [`6b48a9e`](https://github.com/apache/spark/commit/6b48a9e52ded62715b32aef4ee31b121d3e7aee9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18580#discussion_r127625865
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala ---
    @@ -409,6 +409,42 @@ object HiveOnlyCheck extends (LogicalPlan => Unit) {
       }
     }
     
    +
    +/**
    + * A rule to do various checks before reading a table.
    + */
    +object PreReadCheck extends (LogicalPlan => Unit) {
    +  def apply(plan: LogicalPlan): Unit = {
    +    plan.foreach {
    +      case operator: LogicalPlan =>
    +        operator transformExpressionsUp {
    +          case e @ (_: InputFileName | _: InputFileBlockLength | _: InputFileBlockStart) =>
    +            checkNumInputFileBlockSources(e, operator)
    +            e
    +        }
    +    }
    +  }
    +
    +  private def checkNumInputFileBlockSources(e: Expression, operator: LogicalPlan): Int = {
    +    operator match {
    +      case _: CatalogRelation => 1
    +      case _ @ LogicalRelation(_: HadoopFsRelation, _, _) => 1
    +      case _: LeafNode => 0
    +      // UNION ALL has multiple children, but these children do not concurrently use InputFileBlock.
    +      case u: Union =>
    +        if (u.children.map(checkNumInputFileBlockSources(e, _)).sum >= 1) 1 else 0
    +      case o =>
    +        val numInputFileBlockSources = o.children.map(checkNumInputFileBlockSources(e, _)).sum
    +        if (numInputFileBlockSources > 1) {
    +          e.failAnalysis(s"'${e.prettyName}' does not support more than one sources")
    --- End diff --
    
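    We need to run this check as early as possible, at the lowest operator with more than one source; otherwise, an enclosing `Union` might eat it by collapsing the count back to 1.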
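
The counting logic in the quoted diff can be illustrated with a small standalone sketch. It does not use Spark: `Plan`, `FileScan`, `LocalData`, `Union`, `Join`, `Project`, and `countSources` are hypothetical stand-ins for the Catalyst classes, `require` stands in for `failAnalysis`, and returning the count in the non-failing case is an assumption, since the quoted diff is cut off before that branch ends.

```scala
// Toy stand-ins for logical plan nodes; hypothetical types, not Spark's API.
sealed trait Plan { def children: Seq[Plan] }
case class FileScan(name: String) extends Plan { val children: Seq[Plan] = Nil }
case class LocalData(name: String) extends Plan { val children: Seq[Plan] = Nil }
case class Union(children: Seq[Plan]) extends Plan
case class Join(left: Plan, right: Plan) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
}
case class Project(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }

object InputFileSourceCheck {
  // Counts the file-backed sources an input-file function could refer to.
  // Throws IllegalArgumentException when an operator other than Union
  // sees more than one such source among its children.
  def countSources(plan: Plan): Int = plan match {
    case _: FileScan  => 1
    case _: LocalData => 0
    // Union branches are not read concurrently, so they count as one source.
    case Union(branches) =>
      if (branches.map(countSources).sum >= 1) 1 else 0
    case other =>
      // Children are visited first, so a Join nested under a Union is
      // rejected before the Union can collapse the count to 1.
      val n = other.children.map(countSources).sum
      require(n <= 1, "input file functions do not support more than one source")
      n
  }
}
```

In this sketch, `Union(Seq(FileScan("t1"), FileScan("t2")))` is accepted, while `Join(FileScan("t1"), FileScan("t2"))` is rejected even when it is wrapped in a `Union`, because the recursion reaches the join before the union aggregates its children.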




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    @dongjoon-hyun What is the output of Hive for your case?




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    **[Test build #79496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79496/testReport)** for PR 18580 at commit [`6b48a9e`](https://github.com/apache/spark/commit/6b48a9e52ded62715b32aef4ee31b121d3e7aee9).




[GitHub] spark pull request #18580: [SPARK-21354] [SQL] INPUT FILE related functions ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18580




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79496/




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79576/




[GitHub] spark issue #18580: [SPARK-21354] [SQL] INPUT FILE related functions do not ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18580
  
    Let me check that. BTW, I think Spark is better than Hive. :)

