You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jerryshao <gi...@git.apache.org> on 2017/11/03 01:53:31 UTC

[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

GitHub user jerryshao opened a pull request:

    https://github.com/apache/spark/pull/19649

    [SPARK-22405][SQL] Add more ExternalCatalogEvent

    ## What changes were proposed in this pull request?
    
    We're building a data lineage tool in which we need to monitor the metadata changes in ExternalCatalog, current ExternalCatalog already provides several useful events like "CreateDatabaseEvent" for custom SparkListener to use. But still there's some event missing, like alter database event and alter table event. So here propose to and new ExternalCatalogEvent.
    
    ## How was this patch tested?
    
    Enrich the current UT and tested on local cluster.
    
    CC @hvanhovell please let me know your comments about current proposal, thanks.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-22405

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19649.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19649
    
----
commit 5c628be6b6838b224a27e06731f686a5182e1bad
Author: jerryshao <ss...@hortonworks.com>
Date:   2017-11-03T01:48:48Z

    Add more ExternalCatalogEvent

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148783615
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -158,7 +173,13 @@ abstract class ExternalCatalog
        * @param table Name of table to alter schema for
        * @param newDataSchema Updated data schema to be used for the table.
        */
    -  def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit
    +  final def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit = {
    +    postToAll(AlterTableSchemaPreEvent(db, table))
    --- End diff --
    
    shall we carry the new schema?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149315470
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -110,7 +120,31 @@ case class RenameTableEvent(
       extends TableEvent
     
     /**
    - * Event fired when a function is created, dropped or renamed.
    + * Enumeration to indicate which part of table is altered. If a plain alterTable API is called, then
    + * type will generally be Table.
    + */
    +object AlterTableKind extends Enumeration {
    +  val Table, DataSchema, Stats = Value
    --- End diff --
    
    String is better for backward compatibility, but easier to make mistake. I don't have a strong preference, cc @hvanhovell @gatorsmile 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83373/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83375/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83375/testReport)** for PR 19649 at commit [`a3867b7`](https://github.com/apache/spark/commit/a3867b78c1a64f7d3196aaef6ab63db740dcc758).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149009631
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogEventSuite.scala ---
    @@ -104,6 +109,8 @@ class ExternalCatalogEventSuite extends SparkFunSuite {
           tableType = CatalogTableType.MANAGED,
           storage = storage,
           schema = new StructType().add("id", "long"))
    +    val tableDefWithSparkVersion =
    --- End diff --
    
    Sorry this was from my original code, will update it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add new alter table and alter databas...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83617 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83617/testReport)** for PR 19649 at commit [`6b4fcff`](https://github.com/apache/spark/commit/6b4fcff9288ab3942f026dbdb053c69a0fdb31b7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add new alter table and alter databas...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148810633
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
    @@ -619,8 +619,10 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
         }
       }
     
    -  override def alterTableDataSchema(
    -      db: String, table: String, newDataSchema: StructType): Unit = withClient {
    +  override def doAlterTableDataSchema(
    --- End diff --
    
    Please add description of this method.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148783570
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -147,7 +154,15 @@ abstract class ExternalCatalog
        * Note: If the underlying implementation does not support altering a certain field,
        * this becomes a no-op.
        */
    -  def alterTable(tableDefinition: CatalogTable): Unit
    +  final def alterTable(tableDefinition: CatalogTable): Unit = {
    --- End diff --
    
    I'd suggest we leave it for now and watch other `alterTableXXX` events instead. I feel it's an overkill to have a heavy `alterTable` method to handling all table metadata updating, I think we will add more and more fine-grained alter table methods in the future, like `alterTableStats`, `alterTableDataSchema`, etc. and eventually this `alterTable` method will go away.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by wzhfy <gi...@git.apache.org>.
Github user wzhfy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148925392
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -147,7 +154,15 @@ abstract class ExternalCatalog
        * Note: If the underlying implementation does not support altering a certain field,
        * this becomes a no-op.
        */
    -  def alterTable(tableDefinition: CatalogTable): Unit
    +  final def alterTable(tableDefinition: CatalogTable): Unit = {
    --- End diff --
    
    This depends on which level of details we wanna collect in the event. Are there any guidelines or documentation of what events spark should monitor? 
    Besides, partition changes are missing, I think it's necessary to also monitor these changes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149845572
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -62,6 +62,16 @@ case class DropDatabasePreEvent(database: String) extends DatabaseEvent
     case class DropDatabaseEvent(database: String) extends DatabaseEvent
     
     /**
    + * Event fired before a database is altered.
    + */
    +case class AlterDatabasePreEvent(database: String) extends DatabaseEvent
    --- End diff --
    
    Sure, I will update the title.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149003803
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -147,7 +154,15 @@ abstract class ExternalCatalog
        * Note: If the underlying implementation does not support altering a certain field,
        * this becomes a no-op.
        */
    -  def alterTable(tableDefinition: CatalogTable): Unit
    +  final def alterTable(tableDefinition: CatalogTable): Unit = {
    --- End diff --
    
    @cloud-fan , since now we expose `alterTable` interface for other components to leverage, if we don't track this, then looks like we missed a piece of `ExternalCatalogEvent`s. I think for now we can add this `AlterTableEvent`, later on if we removed this method, then we can make this event a no-op (only kept for compatibility), what do you think?
    
    @wzhfy , I was thinking to add partition related events, but I'm not clearly sure why this whole piece is missing and is it necessary to add partition related events? If we have an agreement on such events, I'm OK to add them.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add new alter table and alter databas...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83617/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83476 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83476/testReport)** for PR 19649 at commit [`8eef970`](https://github.com/apache/spark/commit/8eef9701729a64152dbee32abed7a1f958041bf3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149004933
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -158,7 +173,13 @@ abstract class ExternalCatalog
        * @param table Name of table to alter schema for
        * @param newDataSchema Updated data schema to be used for the table.
        */
    -  def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit
    +  final def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit = {
    +    postToAll(AlterTableSchemaPreEvent(db, table))
    --- End diff --
    
    For me I think it is not so necessary to carry the new schema, we can query the catalog by `db` and `table` to get this newly set schema.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149765194
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -62,6 +62,16 @@ case class DropDatabasePreEvent(database: String) extends DatabaseEvent
     case class DropDatabaseEvent(database: String) extends DatabaseEvent
     
     /**
    + * Event fired before a database is altered.
    + */
    +case class AlterDatabasePreEvent(database: String) extends DatabaseEvent
    --- End diff --
    
    Hi, @jerryshao .
    We are adding `AlterTableEvent` and `AlterDatabaseEvent`. Can we have a more specific PR title instead of `Add more ExternalCatalogEvent`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    not sure, but maybe do it in a new PR?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Looks good, one small question.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    One question as mentioned above also, do we need to track partition related events? @cloud-fan  @hvanhovell @gatorsmile 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add new alter table and alter databas...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149086543
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -110,7 +120,27 @@ case class RenameTableEvent(
       extends TableEvent
     
     /**
    - * Event fired when a function is created, dropped or renamed.
    + * Event fired before a table is altered.
    + */
    +case class AlterTablePreEvent(database: String, name: String) extends TableEvent
    +
    +/**
    + * Event fired after a table is altered.
    + */
    +case class AlterTableEvent(database: String, name: String) extends TableEvent
    --- End diff --
    
    Maybe we can have one event for all `alterTableXXX`, and add one more parameter to indicate which part of the table is altered. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add new alter table and alter databas...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83617/testReport)** for PR 19649 at commit [`6b4fcff`](https://github.com/apache/spark/commit/6b4fcff9288ab3942f026dbdb053c69a0fdb31b7).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83538/testReport)** for PR 19649 at commit [`c79c314`](https://github.com/apache/spark/commit/c79c31452d285c3b36b8adda55b9daccd6cea0d4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AlterTablePreEvent(`
      * `case class AlterTableEvent(`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83538/testReport)** for PR 19649 at commit [`c79c314`](https://github.com/apache/spark/commit/c79c31452d285c3b36b8adda55b9daccd6cea0d4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83476 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83476/testReport)** for PR 19649 at commit [`8eef970`](https://github.com/apache/spark/commit/8eef9701729a64152dbee32abed7a1f958041bf3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AlterTableDataSchemaPreEvent(database: String, name: String) extends TableEvent`
      * `case class AlterTableDataSchemaEvent(database: String, name: String) extends TableEvent`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149314315
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -110,7 +120,31 @@ case class RenameTableEvent(
       extends TableEvent
     
     /**
    - * Event fired when a function is created, dropped or renamed.
    + * Enumeration to indicate which part of table is altered. If a plain alterTable API is called, then
    + * type will generally be Table.
    + */
    +object AlterTableKind extends Enumeration {
    +  val Table, DataSchema, Stats = Value
    --- End diff --
    
    shall we just use string?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148810142
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogEventSuite.scala ---
    @@ -104,6 +109,8 @@ class ExternalCatalogEventSuite extends SparkFunSuite {
           tableType = CatalogTableType.MANAGED,
           storage = storage,
           schema = new StructType().add("id", "long"))
    +    val tableDefWithSparkVersion =
    --- End diff --
    
    Where do we need this val?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149315115
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -110,7 +120,31 @@ case class RenameTableEvent(
       extends TableEvent
     
     /**
    - * Event fired when a function is created, dropped or renamed.
    + * Enumeration to indicate which part of table is altered. If a plain alterTable API is called, then
    + * type will generally be Table.
    + */
    +object AlterTableKind extends Enumeration {
    +  val Table, DataSchema, Stats = Value
    --- End diff --
    
    I'm OK to use String, but I'd prefer strong type to avoid nasty issues.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83476/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    **[Test build #83375 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83375/testReport)** for PR 19649 at commit [`a3867b7`](https://github.com/apache/spark/commit/a3867b78c1a64f7d3196aaef6ab63db740dcc758).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add new alter table and alter ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19649


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148781384
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -110,7 +122,27 @@ case class RenameTableEvent(
       extends TableEvent
     
     /**
    - * Event fired when a function is created, dropped or renamed.
    + * Event fired before a table is altered.
    + */
    +case class AlterTablePreEvent(database: String, name: String) extends TableEvent
    +
    +/**
    + * Event fired after a table is altered.
    + */
    +case class AlterTableEvent(database: String, name: String) extends TableEvent
    +
    +/**
    + * Event fired before table schema is altered.
    + */
    +case class AlterTableSchemaPreEvent(database: String, name: String) extends TableEvent
    --- End diff --
    
    Do we need a separate event for alter schema?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    lgtm


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19649
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83538/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r149086112
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -158,7 +173,13 @@ abstract class ExternalCatalog
        * @param table Name of table to alter schema for
        * @param newDataSchema Updated data schema to be used for the table.
        */
    -  def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit
    +  final def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit = {
    --- End diff --
    
    how about `alterTableStats`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148931286
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/events.scala ---
    @@ -17,6 +17,8 @@
     package org.apache.spark.sql.catalyst.catalog
     
     import org.apache.spark.scheduler.SparkListenerEvent
    +import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
    +import org.apache.spark.sql.types.StructType
    --- End diff --
    
    I didn't see `StructType` and `TablePartitionSpec` are used below.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add more ExternalCatalogEvent

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19649#discussion_r148782419
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
    @@ -158,7 +173,13 @@ abstract class ExternalCatalog
        * @param table Name of table to alter schema for
        * @param newDataSchema Updated data schema to be used for the table.
        */
    -  def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit
    +  final def alterTableDataSchema(db: String, table: String, newDataSchema: StructType): Unit = {
    +    postToAll(AlterTableSchemaPreEvent(db, table))
    --- End diff --
    
    `AlterTableDataSchemaXXX`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org