You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2014/10/26 23:52:16 UTC

[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/2951

    [SPARK-3266] [WIP] Remove implementations from Java*Like traits.

    This PR addresses a Scala compiler bug ([SI-8905](https://issues.scala-lang.org/browse/SI-8905)) that was breaking some of the Spark Java APIs.  In a nutshell, it seems that methods whose implementations are inherited from generic traits sometimes have their type parameters erased to Object.  This was causing methods like `DoubleRDD.min()` to throw confusing NoSuchMethodErrors at runtime.
    
    The workaround is to move the implementations of these methods out of the trait.  Ideally, I'd like to just delete the trait and convert it into an abstract class, but this technically breaks binary compatibility because it removes an interface (see #2186).  Instead, I remove the method implementations from the traits and put them in AbstractJavaDStream and AbstractJavaRDD classes that inherit from their respective traits.  This leads to a bit of redundancy, but I think it's the cleanest general solution that maintains binary compatibility.
    
    I also improved the test coverage of the Java API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark javarddlike-trait-bug-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2951.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2951
    
----
commit 1b69b474da7732b45d04105e1aa719f0fe6375a4
Author: Josh Rosen <jo...@databricks.com>
Date:   2014-10-26T22:42:46Z

    [SPARK-3266] Remove implementations from Java*Like traits.
    
    This PR addresses a Scala compiler bug ([SI-8905](https://issues.scala-lang.org/browse/SI-8905)) that was breaking some of the Spark Java APIs.  In a nutshell, it seems that methods whose implementations are inherited from generic traits sometimes have their type parameters erased to Object.  This was causing methods like `DoubleRDD.min()` to throw confusing NoSuchMethodErrors at runtime.
    
    The workaround is to move the implementations of these methods out of the trait.  Ideally, I'd like to just delete the trait and convert it into an abstract class, but this technically breaks binary compatibility because it removes an interface (see #2186).  Instead, I remove the method implementations from the traits and put them in AbstractJavaDStream and AbstractJavaRDD classes that inherit from their respective traits.  This leads to a bit of redundancy, but I think it's the cleanest general solution that maintains binary compatibility.
    
    I also improved the test coverage of the Java API.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-63769300
  
    Alright, copied this over to https://issues.apache.org/jira/browse/SPARK-3266


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-60536376
  
      [Test build #22262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22262/consoleFull) for   PR 2951 at commit [`d3ac65d`](https://github.com/apache/spark/commit/d3ac65d5823d7b86ebfe6ab5cf72dad943d3beda).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-60536413
  
      [Test build #22262 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22262/consoleFull) for   PR 2951 at commit [`d3ac65d`](https://github.com/apache/spark/commit/d3ac65d5823d7b86ebfe6ab5cf72dad943d3beda).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class JavaDoubleRDD(val srdd: RDD[scala.Double]) extends AbstractJavaRDD[JDouble, JavaDoubleRDD] `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-63769092
  
    @viper-kun No, I'm not actively working on this.  A pull request here would be _very_ welcome, since this is an annoying bug.  If you're planning to work on this, make sure to include the extra test cases that I added to `JavaAPISuite`; these tests should be useful regardless of what approach you taking to fixing these bugs.
    
    From a binary compatibility standpoint, it's important to keep the `Java*Like` interfaces since there's some code in the wild that uses these interfaces to abstract over the different implementations.  Removing default implementations from traits technically breaks compatibility for anyone who might have extended those traits, but I don't think that should be a huge concern / likely problem.
    
    If you want to avoid copying / moving everything around, then I think it would be sufficient to just identify the methods that are affected by this compiler bug and copy only those methods to each subclass.  We could maintain 100% binary compatibility with this approach, even for the obscure case where someone extended a trait, and it might make it easier to backport the fix to maintenance branches, but it also seems sort of risk-prone because someone might add new default implementations in the trait.
    
    For the sake of keeping the discussion in one place, let's chat about alternative designs on the JIRA ticket.  I'll unassign myself from it and copy this comment over there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-60536251
  
    Actually, this raises an interesting question about MiMa and our custom annotations: if I have a public class that contains public methods inherited from a class that's marked as `@DeveloperAPI`, do those methods become part of the inheriting class's public API?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen closed the pull request at:

    https://github.com/apache/spark/pull/2951


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-60536642
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22261/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-60536068
  
    Unfortunately, this is going to fail the MiMa checks:
    
    ```
    [info] spark-core: found 65 potential binary incompatibilities (filtered 293)
    [error]  * method cartesian(org.apache.spark.api.java.JavaRDDLike)org.apache.spark.api.java.JavaPairRDD in trait org.apache.spark.api.java.JavaRDDLike does not have a correspondent in new version
    [error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.api.java.JavaRDDLike.cartesian")
    [error]  * method reduce(org.apache.spark.api.java.function.Function2)java.lang.Object in trait org.apache.spark.api.java.JavaRDDLike does not have a correspondent in new version
    [error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.api.java.JavaRDDLike.reduce")
    [...]
    ```
    
    I guess this is because the fact that the trait contained default implementations was part of its API contract from a Scala point of view.  I think that this is irrelevant from Java's perspective, though, so I'm going to experiment with marking `Java*Like` as a developer APIs and purposely exposing `AbstractJava*` as public abstract classes so that changes of their public APIs will cause MiMa breakage.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by viper-kun <gi...@git.apache.org>.
Github user viper-kun commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-63768467
  
    are you still working on this?  i am working on this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3266] [WIP] Remove implementations from...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2951#issuecomment-60536414
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22262/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org