Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2016/05/17 22:06:50 UTC

[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/13156

    [SPARK-15367] [SQL] Add refreshTable back

    #### What changes were proposed in this pull request?
    `refreshTable` was a method in `HiveContext`. It was deleted accidentally while we were migrating the APIs. This PR adds it back to `HiveContext`.
    
    In addition, in `SparkSession`, we put it under the catalog namespace (`SparkSession.catalog.refreshTable`).
    
    #### How was this patch tested?
    Changed the existing test cases to use the function `refreshTable`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark refreshTable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13156.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13156
    
----
commit b1cd1c670adbd0db3dcb82831b4aacae514c37f1
Author: gatorsmile <ga...@gmail.com>
Date:   2016-05-17T20:38:22Z

    initial fix

commit bdd7c61b9d1d7fb1b839485c10f82300304860c1
Author: gatorsmile <ga...@gmail.com>
Date:   2016-05-17T21:14:58Z

    fix.

commit e3564d5dff530ce84a28d7ed90a4ff4bac7de46b
Author: gatorsmile <ga...@gmail.com>
Date:   2016-05-17T21:29:28Z

    fix again.

commit c3f3f0b481c5a3fe3b2485ab0d73194dd7898911
Author: gatorsmile <ga...@gmail.com>
Date:   2016-05-17T22:03:33Z

    revert it back

commit 9e6c4b7d8ef035a63f6c3c219950be326f2e8357
Author: gatorsmile <ga...@gmail.com>
Date:   2016-05-17T22:04:01Z

    Merge remote-tracking branch 'upstream/master' into refreshTable

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63643173
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala ---
    @@ -622,7 +622,7 @@ class MetastoreDataSourcesSuite extends QueryTest with SQLTestUtils with TestHiv
             .mode(SaveMode.Append)
             .saveAsTable("arrayInParquet")
     
    -      sessionState.refreshTable("arrayInParquet")
    +      sparkSession.catalog.refreshTable("arrayInParquet")
    --- End diff --
    
    As we don't call `refreshTable` through `SessionState`, do we still need to keep `SessionState.refreshTable`?


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63760124
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala ---
    @@ -163,6 +163,9 @@ private[sql] class SessionState(sparkSession: SparkSession) {
       def executePlan(plan: LogicalPlan): QueryExecution = new QueryExecution(sparkSession, plan)
     
       def refreshTable(tableName: String): Unit = {
    +    // Different from SparkSession.catalog.refreshTable, this API only refreshes the metadata.
    +    // It does not reload the cached data. That means, if this table is cached as
    +    // an InMemoryRelation, we do not refresh the cached data.
    --- End diff --
    
    Let me know if we need to remove the API `refreshTable` in `SessionState`. So far, it is not being used by any test case. Thanks!
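
A minimal sketch of the distinction drawn in the code comment above: a metadata-only refresh versus a refresh that also drops cached rows. This is a toy model in plain Scala, not Spark code. `ToyCatalog`, `refreshMetadataOnly`, and every other name here are hypothetical, and "metadata" is reduced to a cached row count for illustration.

```scala
import scala.collection.mutable

// Hypothetical toy model; none of these names are Spark classes.
// `storage` stands in for data that can change outside the catalog.
final class ToyCatalog(storage: mutable.Map[String, Vector[Int]]) {
  // Cached metadata per table (here: just a row count).
  private val metadata = mutable.Map.empty[String, Int]
  // Tables the user asked to cache, and their materialized rows.
  private val cachedTables = mutable.Set.empty[String]
  private val cachedData = mutable.Map.empty[String, Vector[Int]]

  def rowCount(table: String): Int =
    metadata.getOrElseUpdate(table, storage(table).size)

  def cacheTable(table: String): Unit = {
    cachedTables += table
  }

  // Cached tables are materialized lazily on first read.
  def read(table: String): Vector[Int] =
    if (cachedTables.contains(table)) cachedData.getOrElseUpdate(table, storage(table))
    else storage(table)

  // Metadata-only refresh: cached rows are left untouched.
  def refreshMetadataOnly(table: String): Unit = {
    metadata.remove(table)
  }

  // Full refresh: drop metadata and cached rows; rows are re-cached on the next read.
  def refreshTable(table: String): Unit = {
    metadata.remove(table)
    cachedData.remove(table)
  }
}

object ToyCatalogDemo {
  def main(args: Array[String]): Unit = {
    val storage = mutable.Map("t" -> Vector(1, 2))
    val catalog = new ToyCatalog(storage)
    catalog.cacheTable("t")
    catalog.read("t")               // materializes the cached rows
    catalog.rowCount("t")           // caches the row count
    storage("t") = Vector(1, 2, 3)  // change made outside the catalog

    catalog.refreshMetadataOnly("t")
    println(catalog.rowCount("t"))  // metadata is fresh: 3
    println(catalog.read("t"))      // cached rows are stale: Vector(1, 2)

    catalog.refreshTable("t")
    println(catalog.read("t"))      // cached rows reloaded: Vector(1, 2, 3)
  }
}
```

In this model, a caller who only refreshes metadata on a cached table keeps reading stale rows, which is the gap the comment warns about.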


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63732945
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    Sure, will do it soon. The new behavior will then differ from that of Spark 1.6. However, I think we should keep the two interfaces (SQL and API) consistent.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220458174
  
    I see. `spark.sessionState.invalidateTable` already exists, and the two have the same implementation. So I will just remove `spark.sessionState.refreshTable`. Let me do it.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63817417
  
    --- Diff: sql/hivecontext-compatibility/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
    @@ -58,4 +58,16 @@ class HiveContext private[hive](
         sparkSession.sharedState.asInstanceOf[HiveSharedState]
       }
     
    +  /**
    +   * Invalidate and refresh all the cached metadata of the given table. For performance reasons,
    +   * Spark SQL or the external data source library it uses might cache certain metadata about a
    +   * table, such as the location of blocks. When those change outside of Spark SQL, users should
    +   * call this function to invalidate the cache.
    +   *
    +   * @since 1.3.0
    +   */
    +  def refreshTable(tableName: String): Unit = {
    --- End diff --
    
    If `invalidateTable` has a different meaning than `refreshTable`, should we also add it to `HiveContext`? cc @yhuai


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220212724
  
    **[Test build #58831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58831/consoleFull)** for PR 13156 at commit [`2b773b8`](https://github.com/apache/spark/commit/2b773b823672199a685e765f5345ceb6584eb3d8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEach `


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219931474
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220517365
  
    **[Test build #58940 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58940/consoleFull)** for PR 13156 at commit [`20d5055`](https://github.com/apache/spark/commit/20d50556c6a3a4ca2d69f961822a2bb058edbbec).


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220201874
  
    **[Test build #58831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58831/consoleFull)** for PR 13156 at commit [`2b773b8`](https://github.com/apache/spark/commit/2b773b823672199a685e765f5345ceb6584eb3d8).


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63647706
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    The test cases modified by this PR verify the `refreshTable` APIs. We also have test cases that verify the corresponding SQL interface, which calls the `RefreshTable` command. For example:
    https://github.com/apache/spark/blob/d8a83a564ff3fd0281007adbf8aa3757da8a2c2b/sql/hive/src/test/scala/org/apache/spark/sql/hive/CachedTableSuite.scala#L164-L207
    
    Now, to test the pure `HiveContext`, the only option is to add a test case in `sql/hivecontext-compatibility`.
    
    Not sure if this answers your question. Let me know if you have any concerns.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63937901
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala ---
    @@ -163,6 +163,9 @@ private[sql] class SessionState(sparkSession: SparkSession) {
       def executePlan(plan: LogicalPlan): QueryExecution = new QueryExecution(sparkSession, plan)
     
       def refreshTable(tableName: String): Unit = {
    +    // Different from SparkSession.catalog.refreshTable, this API only refreshes the metadata.
    +    // It does not reload the cached data. That means, if this table is cached as
    +    // an InMemoryRelation, we do not refresh the cached data.
    --- End diff --
    
    This is super confusing: `spark.catalog.refreshTable` and `spark.sessionState.refreshTable` do different things. Should we just rename this to `invalidateTable`, along with `HiveMetastoreCatalog.refreshTable`?


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63615059
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MultiDatabaseSuite.scala ---
    @@ -202,7 +202,8 @@ class MultiDatabaseSuite extends QueryTest with SQLTestUtils with TestHiveSingle
     
             activateDatabase(db) {
               sql(
    -            s"""CREATE EXTERNAL TABLE t (id BIGINT)
    +            s"""
    +               |CREATE EXTERNAL TABLE t (id BIGINT)
    --- End diff --
    
    Will revert them back. Thanks!


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220461425
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220155088
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58808/
    Test PASSed.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63808728
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    +    val _hc = hc
    +    import _hc.implicits._
    +
    +    withTempPath { tempDir =>
    --- End diff --
    
    Sure, let me remove it. : )


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220515502
  
    LGTM, pending jenkins


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63820600
  
    --- Diff: sql/hivecontext-compatibility/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
    @@ -58,4 +58,16 @@ class HiveContext private[hive](
         sparkSession.sharedState.asInstanceOf[HiveSharedState]
       }
     
    +  /**
    +   * Invalidate and refresh all the cached metadata of the given table. For performance reasons,
    +   * Spark SQL or the external data source library it uses might cache certain metadata about a
    +   * table, such as the location of blocks. When those change outside of Spark SQL, users should
    +   * call this function to invalidate the cache.
    +   *
    +   * @since 1.3.0
    +   */
    +  def refreshTable(tableName: String): Unit = {
    --- End diff --
    
    +1


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220155086
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220516949
  
    retest this please


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63729506
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala ---
    @@ -622,7 +622,7 @@ class MetastoreDataSourcesSuite extends QueryTest with SQLTestUtils with TestHiv
             .mode(SaveMode.Append)
             .saveAsTable("arrayInParquet")
     
    -      sessionState.refreshTable("arrayInParquet")
    +      sparkSession.catalog.refreshTable("arrayInParquet")
    --- End diff --
    
    Actually, `invalidateTable` and `refreshTable` do have different meanings. The current implementation of `HiveMetastoreCatalog.refreshTable` is `HiveMetastoreCatalog.invalidateTable` (and then we retrieve the new metadata lazily). But that does not mean that `refreshTable` and `invalidateTable` have the same semantics. Whether we should remove either `invalidateTable` or `refreshTable` should be discussed in a separate thread.
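
To illustrate the point about shared implementation versus shared semantics, here is a small sketch in plain Scala. It is not Spark's `HiveMetastoreCatalog`; `TableMetadataCache`, `LazyRefresh`, `EagerRefresh`, and the `loads` counter are hypothetical names made up for this example. Both subclasses satisfy a "later lookups see fresh metadata" contract for `refreshTable`, but only one of them does so by invalidating.

```scala
import scala.collection.mutable

// Hypothetical toy model; none of these names are Spark classes.
abstract class TableMetadataCache(load: String => String) {
  protected val cache = mutable.Map.empty[String, String]
  var loads = 0 // number of times the external loader was called

  protected def fetch(table: String): String = {
    loads += 1
    load(table)
  }

  def lookup(table: String): String = cache.getOrElseUpdate(table, fetch(table))

  // Contract: drop any cached entry for the table.
  def invalidateTable(table: String): Unit = {
    cache.remove(table)
  }

  // Contract: later lookups must reflect the current external metadata.
  def refreshTable(table: String): Unit
}

// refreshTable implemented as invalidateTable; the reload happens lazily in lookup.
final class LazyRefresh(load: String => String) extends TableMetadataCache(load) {
  def refreshTable(table: String): Unit = invalidateTable(table)
}

// refreshTable implemented as an immediate reload; no invalidation step at all.
final class EagerRefresh(load: String => String) extends TableMetadataCache(load) {
  def refreshTable(table: String): Unit = cache.update(table, fetch(table))
}

object RefreshSemanticsDemo {
  def main(args: Array[String]): Unit = {
    var external = "v1" // metadata as stored outside the cache
    val lazyCache = new LazyRefresh(_ => external)
    val eagerCache = new EagerRefresh(_ => external)
    lazyCache.lookup("t")
    eagerCache.lookup("t")

    external = "v2"
    lazyCache.refreshTable("t")
    eagerCache.refreshTable("t")
    println(s"lazy loads after refresh: ${lazyCache.loads}")   // still 1, reload is deferred
    println(s"eager loads after refresh: ${eagerCache.loads}") // 2, reload already happened
    println(lazyCache.lookup("t") == eagerCache.lookup("t"))   // both now return v2
  }
}
```

Under these assumptions, the two operations can coincide in one implementation while the contract for `refreshTable` still leaves room for a different one later.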


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220459388
  
    **[Test build #58905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58905/consoleFull)** for PR 13156 at commit [`20d5055`](https://github.com/apache/spark/commit/20d50556c6a3a4ca2d69f961822a2bb058edbbec).


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63646713
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    Did we have tests for `refreshTable`/`RefreshTable` before?


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220212872
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63644169
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala ---
    @@ -622,7 +622,7 @@ class MetastoreDataSourcesSuite extends QueryTest with SQLTestUtils with TestHiv
             .mode(SaveMode.Append)
             .saveAsTable("arrayInParquet")
     
    -      sessionState.refreshTable("arrayInParquet")
    +      sparkSession.catalog.refreshTable("arrayInParquet")
    --- End diff --
    
    Actually, I also want to remove `invalidateTable`, which is a duplicate name for `refreshTable`.


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220529183
  
    thanks, merging to master and 2.0!


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219931325
  
    **[Test build #58740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58740/consoleFull)** for PR 13156 at commit [`4ac3b76`](https://github.com/apache/spark/commit/4ac3b768e0b4720dcef86b910e89e31335390217).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63757822
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala ---
    @@ -126,24 +126,9 @@ case class RefreshTable(tableIdent: TableIdentifier)
       extends RunnableCommand {
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    -    // Refresh the given table's metadata first.
    -    sparkSession.sessionState.catalog.refreshTable(tableIdent)
    -
    -    // If this table is cached as a InMemoryColumnarRelation, drop the original
    -    // cached version and make the new version cached lazily.
    -    val logicalPlan = sparkSession.sessionState.catalog.lookupRelation(tableIdent)
    -    // Use lookupCachedData directly since RefreshTable also takes databaseName.
    -    val isCached = sparkSession.cacheManager.lookupCachedData(logicalPlan).nonEmpty
    -    if (isCached) {
    -      // Create a data frame to represent the table.
    -      // TODO: Use uncacheTable once it supports database name.
    -      val df = Dataset.ofRows(sparkSession, logicalPlan)
    -      // Uncache the logicalPlan.
    -      sparkSession.cacheManager.tryUncacheQuery(df, blocking = true)
    -      // Cache it again.
    -      sparkSession.cacheManager.cacheQuery(df, Some(tableIdent.table))
    -    }
    --- End diff --
    
    The logic above has been moved to `sparkSession.catalog.refreshTable`.
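
For readers following the thread, the flow removed in the diff above (invalidate the table's metadata, then drop and re-create the cached entry if the table is cached) can be sketched without Spark. `ToyCatalog` and all of its members are hypothetical names used for illustration only; this is a simplified model under those assumptions, not the code that now lives in `sparkSession.catalog.refreshTable`.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a catalog plus cache manager; not Spark code.
final class ToyCatalog {
  // Cached metadata per table (here, a fake file listing).
  private val metadataCache = mutable.Map.empty[String, Seq[String]]
  // Names of tables whose data is cached in memory.
  private val cachedTables = mutable.Set.empty[String]
  // Ordered record of what happened, so the flow can be inspected.
  val log: mutable.ArrayBuffer[String] = mutable.ArrayBuffer.empty[String]

  // Loads metadata on first use and serves the cached copy afterwards.
  def lookupMetadata(name: String): Seq[String] =
    metadataCache.getOrElseUpdate(name, {
      log += s"load $name"
      Seq(s"$name/part-0")
    })

  def cacheTable(name: String): Unit = {
    cachedTables += name
    log += s"cache $name"
  }

  def refreshTable(name: String): Unit = {
    // Step 1: refresh the table's metadata by dropping the cached copy.
    metadataCache.remove(name)
    log += s"invalidate $name"
    // Step 2: only if the table is cached, uncache it and cache it again.
    if (cachedTables.contains(name)) {
      cachedTables -= name
      log += s"uncache $name"
      cachedTables += name
      log += s"recache $name"
    }
  }
}
```

In this model, refreshing a table that was never cached stops after step 1, which mirrors the `isCached` guard in the removed lines.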




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219886323
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58725/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220528241
  
    **[Test build #58940 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58940/consoleFull)** for PR 13156 at commit [`20d5055`](https://github.com/apache/spark/commit/20d50556c6a3a4ca2d69f961822a2bb058edbbec).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220200787
  
    LGTM except for the test stuff, thanks for working on it! I agree that we should remove `refreshTable` in `SessionState`, but we need someone to confirm; otherwise we can do it in a follow-up PR.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219869025
  
    **[Test build #58725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58725/consoleFull)** for PR 13156 at commit [`9e6c4b7`](https://github.com/apache/spark/commit/9e6c4b7d8ef035a63f6c3c219950be326f2e8357).




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63613203
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MultiDatabaseSuite.scala ---
    @@ -202,7 +202,8 @@ class MultiDatabaseSuite extends QueryTest with SQLTestUtils with TestHiveSingle
     
             activateDatabase(db) {
               sql(
    -            s"""CREATE EXTERNAL TABLE t (id BIGINT)
    +            s"""
    +               |CREATE EXTERNAL TABLE t (id BIGINT)
    --- End diff --
    
    Seems these formatting changes are not necessary?




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63727680
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    Looks like `RefreshTable` command is actually doing more work. I think we need to make `RefreshTable` and `sparkSession.catalog.refreshTable` have the same behavior. Can you make that change?




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220528428
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63719915
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    I see. The `refreshTable` API is in `HiveContext`. I think we can just make a dummy call to verify that the API still exists, without checking its functionality. Does that sound good to you? @cloud-fan @yhuai 
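
A "dummy call" check of this kind pins only the method's presence and signature. A minimal sketch, assuming a hypothetical `LegacyContext` facade in place of `HiveContext` and a hypothetical `ApiSmokeCheck` helper instead of a real test framework:

```scala
import scala.util.Try

// Hypothetical compatibility facade; stands in for a class such as HiveContext.
class LegacyContext {
  // Kept only so that existing callers continue to compile; a no-op in this sketch.
  def refreshTable(tableName: String): Unit = ()
}

object ApiSmokeCheck {
  // The call compiling shows the method exists with this signature; the Try
  // shows it completes without throwing. Nothing is asserted about behavior.
  def apiStillExists(ctx: LegacyContext): Boolean =
    Try(ctx.refreshTable("some_table")).isSuccess
}
```

The trade-off is that such a check would keep passing even if the method silently stopped refreshing anything, so behavioral coverage has to live elsewhere.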




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219918792
  
    **[Test build #58740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58740/consoleFull)** for PR 13156 at commit [`4ac3b76`](https://github.com/apache/spark/commit/4ac3b768e0b4720dcef86b910e89e31335390217).




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219886108
  
    **[Test build #58725 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58725/consoleFull)** for PR 13156 at commit [`9e6c4b7`](https://github.com/apache/spark/commit/9e6c4b7d8ef035a63f6c3c219950be326f2e8357).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63615033
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
    @@ -294,6 +294,18 @@ class SQLContext private[sql](
         sparkSession.catalog.clearCache()
       }
     
    +  /**
    +   * Invalidate and refresh all the cached the metadata of the given table. For performance reasons,
    +   * Spark SQL or the external data source library it uses might cache certain metadata about a
    +   * table, such as the location of blocks. When those change outside of Spark SQL, users should
    +   * call this function to invalidate the cache.
    +   *
    +   * @since 1.3.0
    +   */
    +  def refreshTable(tableName: String): Unit = {
    +    sparkSession.catalog.refreshTable(tableName)
    +  }
    --- End diff --
    
    Sure, will remove it. Thanks!




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220212875
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58831/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219931477
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58740/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220461395
  
    **[Test build #58905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58905/consoleFull)** for PR 13156 at commit [`20d5055`](https://github.com/apache/spark/commit/20d50556c6a3a4ca2d69f961822a2bb058edbbec).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220529510
  
    Thank you!




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220461429
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58905/
    Test FAILed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220507152
  
    **[Test build #58934 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58934/consoleFull)** for PR 13156 at commit [`20d5055`](https://github.com/apache/spark/commit/20d50556c6a3a4ca2d69f961822a2bb058edbbec).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63808539
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    +    val _hc = hc
    +    import _hc.implicits._
    +
    +    withTempPath { tempDir =>
    --- End diff --
    
    Do we still need this test?




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220149990
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220513843
  
    retest this please





[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63820108
  
    --- Diff: sql/hivecontext-compatibility/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
    @@ -58,4 +58,16 @@ class HiveContext private[hive](
         sparkSession.sharedState.asInstanceOf[HiveSharedState]
       }
     
    +  /**
    +   * Invalidate and refresh all the cached the metadata of the given table. For performance reasons,
    +   * Spark SQL or the external data source library it uses might cache certain metadata about a
    +   * table, such as the location of blocks. When those change outside of Spark SQL, users should
    +   * call this function to invalidate the cache.
    +   *
    +   * @since 1.3.0
    +   */
    +  def refreshTable(tableName: String): Unit = {
    --- End diff --
    
    This class exists for compatibility purposes. Let's leave it as is. 




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63613130
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
    @@ -294,6 +294,18 @@ class SQLContext private[sql](
         sparkSession.catalog.clearCache()
       }
     
    +  /**
    +   * Invalidate and refresh all the cached the metadata of the given table. For performance reasons,
    +   * Spark SQL or the external data source library it uses might cache certain metadata about a
    +   * table, such as the location of blocks. When those change outside of Spark SQL, users should
    +   * call this function to invalidate the cache.
    +   *
    +   * @since 1.3.0
    +   */
    +  def refreshTable(tableName: String): Unit = {
    +    sparkSession.catalog.refreshTable(tableName)
    +  }
    --- End diff --
    
    This method did not exist in SQLContext. It's in HiveContext.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220528429
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58940/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63758568
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala ---
    @@ -622,7 +622,7 @@ class MetastoreDataSourcesSuite extends QueryTest with SQLTestUtils with TestHiv
             .mode(SaveMode.Append)
             .saveAsTable("arrayInParquet")
     
    -      sessionState.refreshTable("arrayInParquet")
    +      sparkSession.catalog.refreshTable("arrayInParquet")
    --- End diff --
    
    Got it. Thanks!




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220500710
  
    **[Test build #58934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58934/consoleFull)** for PR 13156 at commit [`20d5055`](https://github.com/apache/spark/commit/20d50556c6a3a4ca2d69f961822a2bb058edbbec).




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220117679
  
    **[Test build #58805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58805/consoleFull)** for PR 13156 at commit [`7142ef5`](https://github.com/apache/spark/commit/7142ef54fbc806c12859f2af152794af5d50ec72).




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220507235
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220424613
  
    LGTM other than the renaming. We shouldn't have `spark.catalog.refreshTable` and `spark.sessionState.refreshTable` do different things. I would rename the latter to `invalidateTable` since that's what it's really doing.



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63758626
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    Done. Please review the latest code changes. Thanks!



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220149660
  
    **[Test build #58805 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58805/consoleFull)** for PR 13156 at commit [`7142ef5`](https://github.com/apache/spark/commit/7142ef54fbc806c12859f2af152794af5d50ec72).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220507239
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58934/
    Test FAILed.



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220121864
  
    **[Test build #58808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58808/consoleFull)** for PR 13156 at commit [`8a52ac6`](https://github.com/apache/spark/commit/8a52ac608d1836e095cf83185be37a25696cf0c7).



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220154809
  
    **[Test build #58808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58808/consoleFull)** for PR 13156 at commit [`8a52ac6`](https://github.com/apache/spark/commit/8a52ac608d1836e095cf83185be37a25696cf0c7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63675218
  
    --- Diff: sql/hivecontext-compatibility/src/test/scala/org/apache/spark/sql/hive/HiveContextCompatibilitySuite.scala ---
    @@ -99,4 +105,41 @@ class HiveContextCompatibilitySuite extends SparkFunSuite with BeforeAndAfterEac
         assert(databases3.toSeq == Seq("default"))
       }
     
    +  test("check change after refresh") {
    --- End diff --
    
    They share the same implementation and I think we don't need to test all of them.  cc @yhuai 



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13156



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13156#discussion_r63758414
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala ---
    @@ -163,6 +163,9 @@ private[sql] class SessionState(sparkSession: SparkSession) {
       def executePlan(plan: LogicalPlan): QueryExecution = new QueryExecution(sparkSession, plan)
     
       def refreshTable(tableName: String): Unit = {
    +    // Different from SparkSession.catalog.refreshTable, this API only refreshes the metadata.
    +    // It does not reload the cached data. That means, if this table is cached as
    +    // an InMemoryRelation, we do not refresh the cached data.
    --- End diff --
    
    `SharedState` can refresh the cached table data. In `SessionState`, we can only refresh the metadata. Thus, this `refreshTable` API only refreshes the metadata.
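
The distinction drawn in this thread (a metadata-only invalidation versus a full refresh that also reloads cached data) can be illustrated with a toy model. This is only a sketch under stated assumptions: `RefreshSketch`, `ToySession`, `cacheTable`, `rowCount` and `read` are hypothetical names invented for illustration, the two maps merely stand in for a metadata cache and an `InMemoryRelation`, and none of this is Spark's actual implementation.

```scala
import scala.collection.mutable

// Toy model only: every name below is hypothetical and is NOT a Spark class or API.
object RefreshSketch {

  // `source` stands in for the external data: table name -> rows currently on disk.
  final class ToySession(source: mutable.Map[String, Seq[Int]]) {
    // Cached metadata: table name -> row count recorded at first lookup.
    private val metadataCache = mutable.Map.empty[String, Int]
    // Cached data: table name -> materialized rows (stands in for an InMemoryRelation).
    private val dataCache = mutable.Map.empty[String, Seq[Int]]

    // Materialize the table's current rows into the data cache.
    def cacheTable(name: String): Unit = {
      dataCache.update(name, source(name))
    }

    // Look up metadata, filling the metadata cache on a miss.
    def rowCount(name: String): Int =
      metadataCache.getOrElseUpdate(name, source(name).size)

    // Read rows, preferring the cached copy when one exists.
    def read(name: String): Seq[Int] =
      dataCache.getOrElse(name, source(name))

    // Metadata-only: drop the cached metadata entry; cached rows are left untouched.
    def invalidateTable(name: String): Unit = {
      metadataCache.remove(name)
      ()
    }

    // Full refresh: invalidate the metadata and, if the table is cached, reload its rows.
    def refreshTable(name: String): Unit = {
      invalidateTable(name)
      if (dataCache.contains(name)) dataCache.update(name, source(name))
    }
  }

  def main(args: Array[String]): Unit = {
    val source = mutable.Map("t" -> Seq(1, 2))
    val session = new ToySession(source)
    session.cacheTable("t")
    session.rowCount("t")            // populates the metadata cache

    source("t") = Seq(1, 2, 3)       // the underlying data changes

    session.invalidateTable("t")
    println(session.rowCount("t"))   // metadata is recomputed from the source
    println(session.read("t"))       // cached rows are still the old ones

    session.refreshTable("t")
    println(session.read("t"))       // cached rows now match the source
  }
}
```

In this model, calling `invalidateTable` after the source changes yields fresh metadata but stale cached rows, which is the behavior the diff comment describes for the `SessionState` method and the reason a name other than `refreshTable` was suggested for it.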



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220500330
  
    retest this please



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-220149993
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58805/
    Test PASSed.



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13156#issuecomment-219886319
  
    Merged build finished. Test PASSed.

