Posted to reviews@spark.apache.org by joseph-torres <gi...@git.apache.org> on 2017/08/04 03:56:23 UTC

[GitHub] spark pull request #18840: [SPARK-21565] Propagate metadata in attribute rep...

GitHub user joseph-torres opened a pull request:

    https://github.com/apache/spark/pull/18840

    [SPARK-21565] Propagate metadata in attribute replacement.

    ## What changes were proposed in this pull request?
    
    Propagate metadata in attribute replacement during streaming execution. This is necessary for EventTimeWatermarks consuming replaced attributes.
    
    ## How was this patch tested?
    New unit test, which was verified to fail before the fix.
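
As a rough illustration of the change described above, here is a toy model (not Spark's actual code or classes). It assumes that an attribute carries a metadata map, that a watermark marks its event-time column through that metadata, and that replacing an attribute without copying the metadata drops the mark. `Attr`, `replaceDroppingMetadata`, `replacePropagatingMetadata`, and the `watermarkDelayMs` key are hypothetical names chosen for this sketch.

```scala
// Toy model only: these definitions are hypothetical stand-ins, not Spark's Catalyst classes.
object AttributeReplacementSketch {
  // An attribute has a name, a unique id, and free-form metadata.
  final case class Attr(name: String, id: Long, metadata: Map[String, String] = Map.empty) {
    def withMetadata(m: Map[String, String]): Attr = copy(metadata = m)
  }

  // Swap attributes by id without carrying metadata over (the watermark tag is lost).
  def replaceDroppingMetadata(attrs: Seq[Attr], replacements: Map[Long, Attr]): Seq[Attr] =
    attrs.map(a => replacements.getOrElse(a.id, a))

  // Swap attributes by id and copy the original attribute's metadata onto the replacement.
  def replacePropagatingMetadata(attrs: Seq[Attr], replacements: Map[Long, Attr]): Seq[Attr] =
    attrs.map { a =>
      replacements.get(a.id).map(_.withMetadata(a.metadata)).getOrElse(a)
    }

  def main(args: Array[String]): Unit = {
    // The original attribute is tagged with a (hypothetical) watermark delay.
    val eventTime = Attr("eventTime", 1L, Map("watermarkDelayMs" -> "2000"))
    val replacements = Map(1L -> Attr("eventTime", 42L))

    println(replaceDroppingMetadata(Seq(eventTime), replacements).head.metadata)
    println(replacePropagatingMetadata(Seq(eventTime), replacements).head.metadata)
  }
}
```

In this model, only the second function leaves the replacement attribute still recognizable as the watermarked column, which is the property the description says downstream watermark handling needs.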

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/joseph-torres/spark SPARK-21565

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18840.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18840
    
----
commit e54d81200569c2260f0995b2f91aa9829dc10ad7
Author: Jose Torres <jo...@databricks.com>
Date:   2017-08-04T03:52:57Z

    Propagate metadata in attribute replacement.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18840: [SPARK-21565][SS] Propagate metadata in attribute replac...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80356/




[GitHub] spark issue #18840: [SPARK-21565][SS] Propagate metadata in attribute replac...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    **[Test build #80356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80356/testReport)** for PR 18840 at commit [`578e26d`](https://github.com/apache/spark/commit/578e26d16efaa9e5720124963e9d8a49a64fcf40).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80233/




[GitHub] spark pull request #18840: [SPARK-21565][SS] Propagate metadata in attribute...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18840




[GitHub] spark pull request #18840: [SPARK-21565] Propagate metadata in attribute rep...

Posted by joseph-torres <gi...@git.apache.org>.
GitHub user joseph-torres commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18840#discussion_r131707863
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala ---
    @@ -391,6 +391,30 @@ class EventTimeWatermarkSuite extends StreamTest with BeforeAndAfter with Matche
         checkDataset[Long](df, 1L to 100L: _*)
       }
     
    +  test("SPARK-21565: watermark operator accepts attributes from replacement") {
    +    withTempDir { dir =>
    +      dir.delete()
    +
    +      val df = Seq(("a", 100.0, new java.sql.Timestamp(100L)))
    +        .toDF("symbol", "price", "eventTime")
    +      df.write.json(dir.getCanonicalPath)
    +
    +      val input = spark.readStream.schema(df.schema)
    +        .json(dir.getCanonicalPath)
    +
    +      val groupEvents = input
    +        .withWatermark("eventTime", "2 seconds")
    +        .groupBy("symbol", "eventTime")
    +        .agg(count("price") as 'count)
    +        .select("symbol", "eventTime", "count")
    +      val q = groupEvents.writeStream
    +        .outputMode("append")
    +        .format("console")
    +        .start()
    +      q.processAllAvailable()
    --- End diff --
    
    Done.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    **[Test build #80233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80233/testReport)** for PR 18840 at commit [`e54d812`](https://github.com/apache/spark/commit/e54d81200569c2260f0995b2f91aa9829dc10ad7).




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by zsxwing <gi...@git.apache.org>.
GitHub user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    ok to test




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by zsxwing <gi...@git.apache.org>.
GitHub user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    LGTM pending tests.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    **[Test build #80233 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80233/testReport)** for PR 18840 at commit [`e54d812`](https://github.com/apache/spark/commit/e54d81200569c2260f0995b2f91aa9829dc10ad7).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    Can one of the admins verify this patch?




[GitHub] spark issue #18840: [SPARK-21565][SS] Propagate metadata in attribute replac...

Posted by zsxwing <gi...@git.apache.org>.
GitHub user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    Thanks! Merging to master and branch-2.2.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    **[Test build #80356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80356/testReport)** for PR 18840 at commit [`578e26d`](https://github.com/apache/spark/commit/578e26d16efaa9e5720124963e9d8a49a64fcf40).




[GitHub] spark issue #18840: [SPARK-21565][SS] Propagate metadata in attribute replac...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18840: [SPARK-21565] Propagate metadata in attribute replacemen...

Posted by zsxwing <gi...@git.apache.org>.
GitHub user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18840
  
    @joseph-torres could you change the PR title to "[SPARK-21565]**[SS]** Propagate metadata in attribute replacement"? We usually put the module name in the PR title.




[GitHub] spark pull request #18840: [SPARK-21565] Propagate metadata in attribute rep...

Posted by zsxwing <gi...@git.apache.org>.
GitHub user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18840#discussion_r131309601
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala ---
    @@ -391,6 +391,30 @@ class EventTimeWatermarkSuite extends StreamTest with BeforeAndAfter with Matche
         checkDataset[Long](df, 1L to 100L: _*)
       }
     
    +  test("SPARK-21565: watermark operator accepts attributes from replacement") {
    +    withTempDir { dir =>
    +      dir.delete()
    +
    +      val df = Seq(("a", 100.0, new java.sql.Timestamp(100L)))
    +        .toDF("symbol", "price", "eventTime")
    +      df.write.json(dir.getCanonicalPath)
    +
    +      val input = spark.readStream.schema(df.schema)
    +        .json(dir.getCanonicalPath)
    +
    +      val groupEvents = input
    +        .withWatermark("eventTime", "2 seconds")
    +        .groupBy("symbol", "eventTime")
    +        .agg(count("price") as 'count)
    +        .select("symbol", "eventTime", "count")
    +      val q = groupEvents.writeStream
    +        .outputMode("append")
    +        .format("console")
    +        .start()
    +      q.processAllAvailable()
    --- End diff --
    
    nit: `q.processAllAvailable()` ->
    ```
    try {
      q.processAllAvailable()
    } finally {
      q.stop()
    }
    ```
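
The suggestion above can be tried outside Spark with a small stand-in. This is a minimal sketch, assuming the point of the nit is that the query should be stopped even when processing throws. `FakeQuery` and `runAndStop` are hypothetical names and do not correspond to Spark's `StreamingQuery` API beyond sharing the two method names used in the comment.

```scala
object StopInFinallySketch {
  // Hypothetical stand-in for a streaming query; not Spark's StreamingQuery.
  final class FakeQuery(failOnProcess: Boolean) {
    var stopped: Boolean = false

    def processAllAvailable(): Unit = {
      if (failOnProcess) throw new IllegalStateException("simulated failure")
    }

    def stop(): Unit = {
      stopped = true
    }
  }

  // Runs the query and always stops it; returns whether stop() was reached.
  def runAndStop(q: FakeQuery): Boolean = {
    try {
      try {
        q.processAllAvailable()
      } finally {
        q.stop() // executes on both the success path and the failure path
      }
    } catch {
      // Swallow the simulated failure so the demo can report the final state.
      case _: IllegalStateException => ()
    }
    q.stopped
  }

  def main(args: Array[String]): Unit = {
    println(runAndStop(new FakeQuery(failOnProcess = false)))
    println(runAndStop(new FakeQuery(failOnProcess = true)))
  }
}
```

Without the inner `finally`, the failing case would leave the stand-in query running, which in a shared test session could plausibly interfere with later tests.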

