Posted to reviews@spark.apache.org by willb <gi...@git.apache.org> on 2014/08/31 19:09:23 UTC

[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

GitHub user willb opened a pull request:

    https://github.com/apache/spark/pull/2220

    SPARK-3329:  [SQL] Don't depend on Hive SET pair ordering.

    This fixes some possible spurious test failures in `HiveQuerySuite` by comparing sets of key-value pairs as sets, rather than as lists.
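
    As a rough illustration of the idea (a sketch, not code from the patch): if the order in which `SET` returns its pairs is not guaranteed, comparing the results as sequences can fail even when the contents match, while comparing them as sets cannot. The key and value below reuse the test constants from `HiveQuerySuite`; `other.key`, `other.val`, and the object name are hypothetical.

```scala
object SetComparisonSketch {
  // Pairs a test might expect; "other.key" -> "other.val" is a made-up entry.
  val expected: Seq[(String, String)] = Seq(
    "spark.sql.key.usedfortestonly" -> "test.val.0",
    "other.key" -> "other.val"
  )

  // The same pairs, returned in a different order.
  val actual: Seq[(String, String)] = expected.reverse

  // Sequence equality is order-sensitive, so this comparison fails.
  val equalAsLists: Boolean = expected == actual

  // Set equality ignores order, so this comparison succeeds.
  val equalAsSets: Boolean = expected.toSet == actual.toSet
}
```

    Converting to a set also collapses duplicate pairs, which is acceptable here on the assumption that each key is reported at most once.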

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/willb/spark spark-3329

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2220
    
----
commit 36ff52aee8ac2d8da336a56df195e6093e8f7807
Author: William Benton <wi...@redhat.com>
Date:   2014-08-31T17:08:00Z

    Don't depend on Hive SET pair ordering.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54249254
  
    Hey @willb, thanks for looking into / fixing this!  Minor pattern matching suggestion only.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54303022
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19659/consoleFull) for PR 2220 at commit [`3b3e205`](https://github.com/apache/spark/commit/3b3e205cfd2e794dbb1fdb5cfeae20b8535593af).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54916080
  
    Thanks for cleaning this up!  Since this passed tests before, I'm going to merge to master.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-73394623
  
    @marmbrus @liancheng Can we backport this into `branch-1.1` (1.1.2)?  I've been observing a lot of flakiness in the "HiveQuerySuite.SET commands semantics for a HiveContext" test in `branch-1.1`, and it sounds like this patch might fix it.  There are a couple of merge conflicts, but I think they should be simple to fix.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54289812
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19659/consoleFull) for PR 2220 at commit [`3b3e205`](https://github.com/apache/spark/commit/3b3e205cfd2e794dbb1fdb5cfeae20b8535593af).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by concretevitamin <gi...@git.apache.org>.
Github user concretevitamin commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-53995880
  
    Ah, so this problem *was* fixed in PR #1514, but it seems like this [PR](https://github.com/apache/spark/commit/a7a9d14479ea6421513a962ff0f45cb969368bab#diff-48) reverted the change accidentally. I think it'd be good to re-adapt 1514's solution.
    
    /cc @aarondav @liancheng




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2220




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54303443
  
    This failure (in `SparkSubmitSuite`) appears unrelated to my patch.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by concretevitamin <gi...@git.apache.org>.
Github user concretevitamin commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-53995569
  
    ok to test




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2220#discussion_r17030555
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala ---
    @@ -520,62 +520,69 @@ class HiveQuerySuite extends HiveComparisonTest {
         val testKey = "spark.sql.key.usedfortestonly"
         val testVal = "test.val.0"
         val nonexistentKey = "nonexistent"
    -
    +    val KV = "([^=]+)=([^=]*)".r
    +    def collectResults(rdd: SchemaRDD): Set[(String, String)] = 
    +      rdd.collect().map { 
    +        case Row(key: String, value: String) => key -> value 
    +        case Row(kv: String) => kv match {
    +          case KV(key, value) => key -> value
    --- End diff --
    
    Minor style nit: this could just be: `case Row(KV(key,value)) => ...` I believe.
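
    A minimal sketch of the nested-extractor form suggested above, next to the two-step match from the diff. To stay self-contained it does not use Spark's `Row`: `Line` is a hypothetical stand-in for a one-column row, and the object and method names are made up for illustration. Unlike the diff, both helpers skip strings that do not match the regex, which is an assumption of this sketch.

```scala
object NestedExtractorSketch {
  // Hypothetical stand-in for a single-column row; not Spark's Row class.
  final case class Line(text: String)

  // Same regex as in the diff: a non-empty key, '=', then a possibly empty value.
  val KV = "([^=]+)=([^=]*)".r

  // Two-step form: bind the string first, then match it against the regex.
  def parseTwoStep(lines: Seq[Line]): Set[(String, String)] =
    lines.flatMap {
      case Line(kv) => kv match {
        case KV(key, value) => Some(key -> value)
        case _              => None // skip strings that are not key=value
      }
    }.toSet

  // Nested form: the regex extractor is applied directly inside the outer pattern.
  def parseNested(lines: Seq[Line]): Set[(String, String)] =
    lines.collect { case Line(KV(key, value)) => key -> value }.toSet
}
```

    Both helpers return the same set for the same input; the nested form just removes one level of `match`.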




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-53995653
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19543/consoleFull) for PR 2220 at commit [`36ff52a`](https://github.com/apache/spark/commit/36ff52aee8ac2d8da336a56df195e6093e8f7807).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54165982
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19590/consoleFull) for PR 2220 at commit [`6525d8e`](https://github.com/apache/spark/commit/6525d8ef79645abe3ed26e4e2ac6aa09bcd0b203).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54182324
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19590/consoleFull) for PR 2220 at commit [`6525d8e`](https://github.com/apache/spark/commit/6525d8ef79645abe3ed26e4e2ac6aa09bcd0b203).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54695034
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19845/consoleFull) for PR 2220 at commit [`3b3e205`](https://github.com/apache/spark/commit/3b3e205cfd2e794dbb1fdb5cfeae20b8535593af).
     * This patch merges cleanly.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54699188
  
    **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19845/consoleFull)** after a configured wait of `120m`.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-54165469
  
    @concretevitamin I cherry-picked @aarondav's fix (and added a very simple fix to handle cases that it didn't).




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-53996991
  
    Thanks, @concretevitamin!  I'll close this one then.




[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2220#issuecomment-53998959
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19543/consoleFull) for PR 2220 at commit [`36ff52a`](https://github.com/apache/spark/commit/36ff52aee8ac2d8da336a56df195e6093e8f7807).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.

