
Posted to reviews@spark.apache.org by xwu0226 <gi...@git.apache.org> on 2015/11/07 18:47:07 UTC

[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

GitHub user xwu0226 opened a pull request:

    https://github.com/apache/spark/pull/9542

    [Spark-11522][SQL] input_file_name() returns "" for external tables

    When computing a partition for a non-parquet relation, `HadoopRDD.compute` is used, but it does not set the thread-local variable `inputFileName` in `NewSqlHadoopRDD` the way `NewSqlHadoopRDD.compute` does. Yet when the input file name is looked up, `NewSqlHadoopRDD.inputFileName` is what gets read, so the result is empty.
    Setting `inputFileName` in `HadoopRDD.compute` resolves this issue.
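
The thread-local mechanism described above can be illustrated with a minimal, self-contained Scala sketch. This is not the code from this pull request: `InputFileNameHolder`, `computeWithoutSetting`, and `computeWithSetting` are hypothetical names chosen for illustration, and plain strings stand in for Spark's record iterators. It only shows why a compute path that never sets the thread-local leaves readers with the empty default.

```scala
// Illustrative sketch only; all names are hypothetical and do not mirror Spark's internals.
object InputFileNameHolder {
  // Each thread sees its own value; the default is the empty string.
  private val inputFileName: ThreadLocal[String] = new ThreadLocal[String] {
    override protected def initialValue(): String = ""
  }

  def get: String = inputFileName.get()
  def set(file: String): Unit = inputFileName.set(file)
  def unset(): Unit = inputFileName.remove()
}

object ThreadLocalSketch {
  // A compute path that never sets the holder: a lookup sees the default "".
  def computeWithoutSetting(path: String): String = InputFileNameHolder.get

  // A compute path that sets the holder before reading and clears it afterwards.
  def computeWithSetting(path: String): String = {
    InputFileNameHolder.set(path)
    try InputFileNameHolder.get
    finally InputFileNameHolder.unset()
  }
}
```

Under these assumptions, the first method corresponds to the behavior reported in the issue and the second to the behavior after the fix.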

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xwu0226/spark SPARK-11522

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9542.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9542
    
----
commit fd7533d3171bb7e90af21e216463829b7fdbd33f
Author: xin Wu <xi...@us.ibm.com>
Date:   2015-11-06T23:47:14Z

    SPARK-11522 input_file_name() returns empty string for external tables

commit b67a0143be47fb14264bb71b787ed6eb351c0c81
Author: xin Wu <xi...@us.ibm.com>
Date:   2015-11-07T02:56:06Z

    SPARK-11522 updating testcases

commit a2d83db953470c2cff130362b462afb6ad2470d6
Author: xin Wu <xi...@us.ibm.com>
Date:   2015-11-07T05:06:14Z

    SPARK-11522 update testcase

commit a88260ff90c25dcc18dd797119fdbcfc6503f991
Author: xin Wu <xi...@us.ibm.com>
Date:   2015-11-07T17:09:24Z

    SPARK-11522 update testcase

commit 2658f2808ede6512c625f3bb33823bc6492823d5
Author: xin Wu <xi...@us.ibm.com>
Date:   2015-11-07T17:12:34Z

    SPARK-11522 update testcase

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154907890
  
    **[Test build #45330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45330/consoleFull)** for PR 9542 at commit [`b5fa291`](https://github.com/apache/spark/commit/b5fa29111f5c0ea0f913d2fab166e1b1c41e0dff).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866990
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    --- End diff --
    
    https://github.com/apache/spark/pull/9542/files#r44866975



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156775499
  
    @xwu0226 Looks good! I left a few comments regarding the format.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154841972
  
    **[Test build #2014 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2014/consoleFull)** for PR 9542 at commit [`2658f28`](https://github.com/apache/spark/commit/2658f2808ede6512c625f3bb33823bc6492823d5).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154882067
  
    @rxin I pushed again to fix the Scala style test failure. Will the test build be kicked off automatically, or does it need to be triggered manually? Thanks!



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156601028
  
    @yhuai Thanks for pointing it out! I will make the change now. 



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866975
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    +    assert(answer3.contains("external_parquet"))
    +    assert(sql("SELECT input_file_name() as file FROM external_parquet")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_parquet")
    +
    +    // Non-External parquet pointing to /tmp/...
    +
    +    sql("CREATE table internal_parquet_tmp(c1 int, c2 int) " +
    +      " stored as parquet " +
    +      " as select 1, 2")
    +
    +    val answer4 = sql("SELECT input_file_name() as file FROM internal_parquet_tmp")
    +      .head().getString(0)
    --- End diff --
    
    The format looks weird.  Can we use the following?
    ```
    val answer4 =
      sql("SELECT input_file_name() as file FROM internal_parquet_tmp").head().getString(0)
    ```
    




[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154953881
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156946318
  
    **[Test build #45977 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45977/consoleFull)** for PR 9542 at commit [`eeaa6b6`](https://github.com/apache/spark/commit/eeaa6b6eed2812b879911aba03922ec305459b88).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156842517
  
    **[Test build #45956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45956/consoleFull)** for PR 9542 at commit [`fe2d6d8`](https://github.com/apache/spark/commit/fe2d6d85627f8a145c98256a266461aaecd54736).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156868010
  
    **[Test build #45959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45959/consoleFull)** for PR 9542 at commit [`4481c82`](https://github.com/apache/spark/commit/4481c82a98af62cc4d46d2f07c4d728236bf6d83).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866986
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    --- End diff --
    
    Can we first get the count through `sql("SELECT input_file_name() as file FROM external_t5").distinct().count()` and then do the assertion?



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156867750
  
    @xwu0226 Sorry for asking you to update several times. I just realized that you added a bunch of files in `sql/hive/src/test/resources/data/`. Since that directory is copied directly from Hive, we do not change or add files there. Can we generate the test files in the test itself instead? We can make `HiveUDFSuite` extend `SQLTestUtils` and then use `withTempPath` to create temp dirs that can be used for those external tables.
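
The idea of generating test data in a temporary location, rather than checking files into the resources directory, can be sketched in plain Scala. This is a hedged illustration, not Spark's helper: `TempPathSketch`, `withTempDir`, and `writeCsv` are hypothetical names, and the sketch uses only the JDK so that it runs without Spark on the classpath. In the actual suite the `withTempPath` helper mentioned in the comment would presumably take the place of `withTempDir`.

```scala
import java.io.{File, PrintWriter}
import java.nio.file.Files

// Illustrative sketch only; these helpers are hypothetical and are not part of Spark.
object TempPathSketch {
  // Creates a fresh temp directory, passes it to `f`, and deletes it afterwards.
  def withTempDir[T](f: File => T): T = {
    val dir = Files.createTempDirectory("external_table_").toFile
    try f(dir) finally deleteRecursively(dir)
  }

  private def deleteRecursively(file: File): Unit = {
    // listFiles returns null for regular files, hence the Option wrapper.
    Option(file.listFiles()).foreach(_.foreach(deleteRecursively))
    file.delete()
  }

  // Writes comma-delimited rows into `dir/name` and returns the written file.
  def writeCsv(dir: File, name: String, rows: Seq[(Int, Int)]): File = {
    val out = new File(dir, name)
    val writer = new PrintWriter(out, "UTF-8")
    try rows.foreach { case (c1, c2) => writer.println(s"$c1,$c2") }
    finally writer.close()
    out
  }
}
```

A test could then call `TempPathSketch.withTempDir { dir => ... }`, write one or more data files with `writeCsv`, and interpolate `dir.getCanonicalPath` into the `LOCATION` clause of the `CREATE EXTERNAL TABLE` statement, so nothing needs to be added under the copied Hive resources.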



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156946392
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156915753
  
    Build finished. Test FAILed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156890474
  
    **[Test build #45970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45970/consoleFull)** for PR 9542 at commit [`83b1c77`](https://github.com/apache/spark/commit/83b1c7713a536f80062df7586af31e21d631cdd7).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156623733
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45911/



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156869654
  
    @xwu0226 Thank you!



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154953696
  
    **[Test build #45330 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45330/consoleFull)** for PR 9542 at commit [`b5fa291`](https://github.com/apache/spark/commit/b5fa29111f5c0ea0f913d2fab166e1b1c41e0dff).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154907232
  
    Merged build started.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866988
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    +    assert(answer3.contains("external_parquet"))
    +    assert(sql("SELECT input_file_name() as file FROM external_parquet")
    +      .distinct().collect().length == 1)
    --- End diff --
    
    https://github.com/apache/spark/pull/9542/files#r44866986



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154841918
  
    Jenkins, test this please.




[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154729999
  
    Can one of the admins verify this patch?



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156898769
  
    Oh, it seems there is a conflict...



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156842320
  
    **[Test build #45956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45956/consoleFull)** for PR 9542 at commit [`fe2d6d8`](https://github.com/apache/spark/commit/fe2d6d85627f8a145c98256a266461aaecd54736).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866994
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    +    assert(answer3.contains("external_parquet"))
    +    assert(sql("SELECT input_file_name() as file FROM external_parquet")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_parquet")
    +
    +    // Non-External parquet pointing to /tmp/...
    +
    +    sql("CREATE table internal_parquet_tmp(c1 int, c2 int) " +
    +      " stored as parquet " +
    +      " as select 1, 2")
    --- End diff --
    
    https://github.com/apache/spark/pull/9542/files#r44866981



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156602581
  
    **[Test build #45911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45911/consoleFull)** for PR 9542 at commit [`c27d030`](https://github.com/apache/spark/commit/c27d03088f264b66ed95f79e45d47685b4923e04).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9542



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156623676
  
    **[Test build #45911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45911/consoleFull)** for PR 9542 at commit [`c27d030`](https://github.com/apache/spark/commit/c27d03088f264b66ed95f79e45d47685b4923e04).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156914531
  
    **[Test build #45977 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45977/consoleFull)** for PR 9542 at commit [`eeaa6b6`](https://github.com/apache/spark/commit/eeaa6b6eed2812b879911aba03922ec305459b88).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156868046
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-157075099
  
    @yhuai The last test build passed. Do you know what might have caused the previous errors? After resolving the conflicts, my own diff for this PR is still in the same place and passed the tests before. I hope it did not break anything. Thanks!



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156853749
  
    LGTM pending jenkins.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156944154
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45979/
    Test FAILed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156916431
  
    test this please



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156868048
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45959/
    Test PASSed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154842123
  
    **[Test build #2014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2014/consoleFull)** for PR 9542 at commit [`2658f28`](https://github.com/apache/spark/commit/2658f2808ede6512c625f3bb33823bc6492823d5).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156916753
  
    **[Test build #45979 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45979/consoleFull)** for PR 9542 at commit [`eeaa6b6`](https://github.com/apache/spark/commit/eeaa6b6eed2812b879911aba03922ec305459b88).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-157083196
  
    @xwu0226 https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45977/consoleFull is good. I will merge it to master and branch 1.6.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156944153
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866993
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    --- End diff --
    
    https://github.com/apache/spark/pull/9542/files#r44866981



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156900163
  
    @yhuai Is it mergeable?




[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-155931255
  
    @rxin or @squito, what do you think about the fix? Thanks!



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156842519
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156850658
  
    **[Test build #45959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45959/consoleFull)** for PR 9542 at commit [`4481c82`](https://github.com/apache/spark/commit/4481c82a98af62cc4d46d2f07c4d728236bf6d83).



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156842522
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45956/
    Test FAILed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156900365
  
    @xwu0226 Can you resolve the conflict? Once you update the pr and jenkins is good, I will merge it. Thanks!



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866992
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    --- End diff --
    
    https://github.com/apache/spark/pull/9542/files#r44866981



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866981
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    --- End diff --
    
    Can we use the following format?
    ```
    sql(
      s"""
      ....
      """)
    ```
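
For illustration only, here is a minimal sketch of how the layout suggested above might look when applied to the `external_t5` statement from this diff. The `sql` function below is a hypothetical stand-in that only trims the text, so the sketch runs without a Spark context. The `buildDdl` helper, the sample path, and the use of `stripMargin` are assumptions added for the example and are not part of the patch or of the reviewer's comment.

```scala
object SqlFormatSketch {
  // Hypothetical stand-in for the test suite's `sql` helper: it only trims
  // the statement and returns it, so no Spark or Hive context is required.
  def sql(text: String): String = text.trim

  // Builds the DDL using the suggested layout: `sql(` on its own line, the
  // opening triple quote on the next line, and the statement body below it.
  // `stripMargin` (an assumption, not requested in the review) removes the
  // leading whitespace up to and including each `|`.
  def buildDdl(location: String): String =
    sql(
      s"""
         |CREATE EXTERNAL TABLE external_t5 (c1 INT, c2 INT)
         |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
         |LOCATION '$location'
       """.stripMargin)

  def main(args: Array[String]): Unit = {
    // "/tmp/external_t5" is a made-up path used only for this example.
    println(buildDdl("/tmp/external_t5"))
  }
}
```

In the real test the same layout would be passed to the suite's own `sql` method, with the location taken from the classpath resource as in the diff.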



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by squito <gi...@git.apache.org>.

Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154907139
  
    Jenkins, ok to test



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156915725
  
    **[Test build #45970 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45970/consoleFull)** for PR 9542 at commit [`83b1c77`](https://github.com/apache/spark/commit/83b1c7713a536f80062df7586af31e21d631cdd7).
     * This patch **fails PySpark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156845293
  
    Accidentally pushed another JIRA's code along with this change. I am backing it out.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866968
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    +    assert(answer3.contains("external_parquet"))
    +    assert(sql("SELECT input_file_name() as file FROM external_parquet")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_parquet")
    +
    +    // Non-External parquet pointing to /tmp/...
    --- End diff --
    
    It seems we do not need to say where it points, since it is a managed table.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156623732
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156946393
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45977/
    Test PASSed.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866996
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    +    assert(answer3.contains("external_parquet"))
    +    assert(sql("SELECT input_file_name() as file FROM external_parquet")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_parquet")
    +
    +    // Non-External parquet pointing to /tmp/...
    +
    +    sql("CREATE table internal_parquet_tmp(c1 int, c2 int) " +
    +      " stored as parquet " +
    +      " as select 1, 2")
    +
    +    val answer4 = sql("SELECT input_file_name() as file FROM internal_parquet_tmp")
    +      .head().getString(0)
    +    assert(answer4.contains("internal_parquet_tmp"))
    +    assert(sql("SELECT input_file_name() as file FROM internal_parquet_tmp")
    +      .distinct().collect().length == 1)
    --- End diff --
    
    https://github.com/apache/spark/pull/9542/files#r44866986



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-154907221
  
     Merged build triggered.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by xwu0226 <gi...@git.apache.org>.

Github user xwu0226 commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156869030
  
    @yhuai I did not know that we should not update the resources/data directory. I thought the test data files were added along the way by contributors. Thanks for pointing it out! Let me update HiveUDFSuite then.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by yhuai <gi...@git.apache.org>.

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44848151
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -213,6 +213,12 @@ class HadoopRDD[K, V](
     
           val inputMetrics = context.taskMetrics.getInputMetricsForReadMethod(DataReadMethod.Hadoop)
     
    +      // Sets the thread local variable for the file's name
    +      split.inputSplit.value match {
    +        case fs: FileSplit => SqlNewHadoopRDD.setInputFileName(fs.getPath.toString)
    +        case _ => SqlNewHadoopRDD.unsetInputFileName()
    +      }
    --- End diff --
    
    Can you call `SqlNewHadoopRDD.unsetInputFileName()` in https://github.com/apache/spark/pull/9542/files#diff-83eb37f7b0ebed3c14ccb7bff0d577c2R257? Like what we do in `SqlNewHadoopRDD`?
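
    For context, the set-then-unset pattern being asked for can be sketched in
    isolation as below. This is not Spark's code: `InputFileNameSketch` and
    `SplitReader` are hypothetical stand-ins, and only the method names
    `setInputFileName` and `unsetInputFileName` come from the diff above.

    ```scala
    object InputFileNameSketch {
      // Hypothetical stand-in for the thread-local file name discussed in this review.
      private val inputFileName = new ThreadLocal[String] {
        override def initialValue(): String = ""
      }

      def getInputFileName(): String = inputFileName.get()
      def setInputFileName(file: String): Unit = inputFileName.set(file)
      def unsetInputFileName(): Unit = inputFileName.remove()

      // Hypothetical reader: records the file name when it opens a split
      // and clears it again when the reader is closed.
      final class SplitReader(path: Option[String]) {
        path match {
          case Some(p) => setInputFileName(p)
          case None => unsetInputFileName()
        }

        def close(): Unit = unsetInputFileName()
      }
    }
    ```

    Clearing the value on close keeps a later task on the same thread from
    seeing a stale file name.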



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156944117
  
    **[Test build #45979 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45979/consoleFull)** for PR 9542 at commit [`eeaa6b6`](https://github.com/apache/spark/commit/eeaa6b6eed2812b879911aba03922ec305459b88).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9542#issuecomment-156915756
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45970/
    Test FAILed.

