Posted to reviews@spark.apache.org by rcsenkbeil <gi...@git.apache.org> on 2014/08/13 23:02:32 UTC

[GitHub] spark pull request: Added support for :cp that was broken in...

GitHub user rcsenkbeil opened a pull request:

    https://github.com/apache/spark/pull/1929

    Added support for :cp <jar> that was broken in Scala 2.10.x for REPL

    As seen in Scala's [SI-6502](https://issues.scala-lang.org/browse/SI-6502), the _:cp_ command was broken in Scala 2.10.x. Since the Spark shell is a friendly wrapper on top of the Scala REPL, it is also affected by this problem.
    
    My solution was to alter the internal classpath and invalidate any new entries. I also had to add the ability to add new entries to the parent classloader of the interpreter (SparkIMain's global).
    
    The advantage of this versus wiping the interpreter and replaying all of the commands is that you don't have to worry about rerunning heavy Spark-related commands (going to the cluster) or potentially reloading data that might have changed. Instead, you get to work from where you left off.
    
    Because this is not fixed upstream for 2.10.x, I had to use reflection to alter the internal compiler classpath.
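
    To illustrate the flavor of reflection involved, here is a minimal sketch of the general technique, shown against the simpler `URLClassLoader` case rather than the scalac internals the patch actually touches (the helper name is mine, not code from this PR):

    ```scala
    import java.net.{URL, URLClassLoader}

    // Reflectively invoke the protected URLClassLoader.addURL so that a new
    // jar becomes visible to an existing loader at runtime.
    def addUrlViaReflection(loader: URLClassLoader, url: URL): Unit = {
      val method = classOf[URLClassLoader].getDeclaredMethod("addURL", classOf[URL])
      method.setAccessible(true)  // bypass the protected access modifier
      method.invoke(loader, url)  // equivalent to calling loader.addURL(url)
    }
    ```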
    
    The solution now looks like this:
    ![screen shot 2014-08-13 at 3 46 02 pm](https://cloud.githubusercontent.com/assets/2481802/3912625/f02b1440-232c-11e4-9bf6-bafb3e352d14.png)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rcsenkbeil/spark FixReplClasspathSupport

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1929.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1929
    
----
commit a220639475694418f010420b24960b53e81509e7
Author: Chip Senkbeil <rc...@us.ibm.com>
Date:   2014-08-13T20:47:33Z

    Added support for :cp <jar> that was broken in Scala 2.10.x for REPL

----




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53506313
  
    @mateiz I've added a hack that just runs a compilation error in silence before adding anything, to allow this. It works as seen here (and doesn't interfere with anything else). I'll spend a little more time investigating, but this does seem to work for Scala classes now.
    
    ![screen shot 2014-08-26 at 6 05 01 pm](https://cloud.githubusercontent.com/assets/2481802/4053181/d66a81d2-2d75-11e4-8f3e-ec22c954a4d7.png)
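
    For the curious, a rough sketch of what such a hack could look like, assuming the `beSilentDuring`/`interpret` entry points that the Scala REPL's IMain (and hence SparkIMain) exposes; the dummy line and helper name are illustrative, not the exact commit:

    ```scala
    // Interpreting a line that cannot compile, with output suppressed, forces
    // the compiler to spin up a Run and initialize symbols before any new
    // classpath entries are used.
    def forceInitViaFailedCompile(intp: org.apache.spark.repl.SparkIMain): Unit = {
      intp.beSilentDuring {
        intp.interpret("val __dummy: NoSuchType = null") // deliberately fails to compile
      }
    }
    ```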





[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by thunterdb <gi...@git.apache.org>.
Github user thunterdb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16207973
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkIMain.scala ---
    @@ -130,6 +131,9 @@ import org.apache.spark.util.Utils
         private var _classLoader: AbstractFileClassLoader = null                              // active classloader
         private val _compiler: Global                     = newCompiler(settings, reporter)   // our private compiler
     
    +    private trait ExposeAddUrl extends URLClassLoader { def addNewUrl(url: URL) = this.addURL(url) }
    --- End diff --
    
    You are mixing a trait with URLClassLoader. Is there a reason for not implementing a derived class?




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16206386
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
    @@ -711,22 +714,22 @@ class SparkILoop(in0: Option[BufferedReader], protected val out: JPrintWriter,
             added = true
             addedClasspath = ClassPath.join(addedClasspath, f.path)
             totalClasspath = ClassPath.join(settings.classpath.value, addedClasspath)
    +        intp.addUrlsToClassPath(f.toURI.toURL)
    --- End diff --
    
    This adds the series of jars to both our compile-time and runtime classpaths.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53537357
  
    @mateiz I'd prefer not to merge this into branch-1.1 at this point unless you see a really compelling need. I'm still scarred from Spark 1.0.0, which actually shipped with a major REPL bug (which was, itself, an attempt to fix another bug!).




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by heathermiller <gi...@git.apache.org>.
Github user heathermiller commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52683367
  
    Yep, so shortly after I posted my last message, I had a conference call with @gkossakowski and then chatted briefly with @mateiz. Here's an attempt to summarize our discussions. Important note: @gkossakowski is currently recovering from surgery on his wrists, so written responses from him ATM are precious :) (and we shouldn't push him for too many...)
    
    ### RE: point 2) 
    
    @gkossakowski and I discussed the issue of shadowing other jars on the classpath. 
    
    He explained a bit about the history of `invalidateClassPathEntries`; he was previously tasked to work on the resident compiler to enable faster incremental compilation, exploitable by both the Scala IDE and sbt. Thus, the functionality he was working on would have to manage large and complicated builds where shadowing and managing multiple versions of the same library would be a real challenge (he explained a bit about the sort of maintenance that'd have to be done with Symbols, and all of the things that could go wrong, and it's really a nightmare). In the end, the approach was abandoned due to its complexity. `invalidateClassPathEntries` is a not-completely-baked artifact from this effort, and as far as I understand, no, it doesn't support any of the scenarios you list.
    
    However, my intuition (which could be wrong) was that, **for now**, this is likely less important for users of spark-shell. Rationale: you're manually adding jars to the classpath; it's unlikely that the list of jars one would add to the classpath would be so numerous that you'd shadow something else that you've added. The real concern would be that someone adds a different version of a library that's already on the classpath – and you're right, for this we have no support as far as I can tell. I agree with you that if folks are going to import jars at runtime, they should just _be aware_ of conflicting classes (at least importing jars at runtime would now once again be a possibility).
    
    @gkossakowski also assured me that for the rest of 2.10.x, logic that this PR touches most likely wouldn't fluctuate, so we shouldn't worry about any of this breaking for the rest of the 2.10.x series.
    
    The particularly important bit of my conversation with @gkossakowski, though, centered on what spark-shell would do when migrating to 2.11.x, when you can't even rely on a partial solution like `invalidateClassPathEntries`. In the end, the consensus was that, on the side of scala/scalac, we need at minimum some lightweight fix for [SI-6502](https://issues.scala-lang.org/browse/SI-6502) that's at least sufficient for spark-shell.
    
    Right now, the idea would be to write a specification of the bare minimum that spark-shell/the Scala REPL should do when there are conflicts on the classpath. And, ideally, we could try to get this fix into one of the [next two releases of the 2.11.x series](https://github.com/scala/scala/milestones) (due dates: Sept 19, Nov 28) so that this fix doesn't have to live in Spark and depend on delicate scalac internals.
    
    ### RE: point 1) 
    
    After chatting briefly with @mateiz, it was determined that the later-appended jars should take precedence (right Matei? or do I have that backwards?)




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52651504
  
    @heathermiller Thanks for the feedback! It looks like @gkossakowski commented here: [SI-6502](https://issues.scala-lang.org/browse/SI-6502?focusedCommentId=70407&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-70407).
    
    So, it looks like issues that might not be tackled by the current pull request are as follows:
    
    1. What should happen when an appended jar contains packages that overlap with what's already on the classpath?
    2. What should happen when an appended jar contains a class that shadows an existing class?
    
    At least, he leads me to believe that `invalidateClassPathEntries` doesn't handle these. While I didn't mention it in the pull request earlier, I assumed that added entries wouldn't overlap with the existing classpath. If the feature is added later to properly handle this, even better. For now, it looks like this implementation works as long as you don't add conflicting classpath entries. @mateiz, I'm not sure if that is a small enough problem for you to want this or not.
    
    Personally, I haven't run into these issues with Apache libraries that I've used with the patch. Nor have I run into issues trying out different jars for testing such as oddballs like [JGoodies](http://www.jgoodies.com/) and [lwjgl](http://lwjgl.org/). Obviously, if someone wants to import jars at runtime, they need to be aware of conflicting classes.
    
    I did just test this with the following files:
    
    Placed inside TestJar.jar
    ```
    package com.ibm.testjar;
    
    public class TestClass {
        public void runMe() {
            System.out.println("I WAS RUN!");
        }
    }
    ```
    
    Placed inside TestJar2.jar
    ```
    package com.ibm.testjar;
    
    public class TestClass2 {
        public void runMe() {
            System.out.println("I WAS ALSO RUN!");
        }
    }
    ```
    
    And running yields something like this:
    ```
    scala> :cp /path/to/TestJar.jar
    
    scala> :cp /path/to/TestJar2.jar
    
    scala> import com.ibm.testjar._
    
    scala> new TestClass().runMe
    I WAS RUN!
    scala> new TestClass2().runMe
    I WAS ALSO RUN!
    ```




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53361512
  
    Actually, one follow-up: It seems that the patch *works* if you try to use the new class before calling `:cp` and fail to find it, but fails if you just call `:cp` first and then try to use the class. Here's a session that worked for me:
    ```
    scala> import test.Test
    <console>:10: error: not found: value test
           import test.Test
                  ^
    
    scala> :cp /Users/matei/workspace/jar-test/simple-jar-v1.jar
    Added '/Users/matei/workspace/jar-test/simple-jar-v1.jar'.  Your new classpath is:
    "/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/JObjC.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Ho
 me/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/System/Library/Java/Extensions/MRJToolkit.jar:/System/Library/Java/Extensions/QTJava.zip:/Users/matei/workspace/apache-spark/conf:/Users/matei/workspace/apache-spark/assembly/target/scala-2.10/spark-assembly-1.1.0-SNAPSHOT-hadoop1.0.4.jar:/Users/matei/workspace/jar-test/simple-jar-v1.jar"
    
    scala> import test.Test
    import test.Test
    
    scala> new Test
    res0: test.Test = Test2
    ```




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by heathermiller <gi...@git.apache.org>.
Github user heathermiller commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52532231
  
    So, this isn't my area of expertise (I'm not sure who at Typesafe/EPFL is currently responsible for the development/maintenance of the REPL, so I don't know who could comment on future plans/roadmap).
    
    However,
    
    1. I think the reliance of this patch on `invalidateClassPathEntries` (removed in scala/scala#3884) is inconsequential, because I gather from @gkossakowski’s comment (https://github.com/scala/scala/pull/3884#issuecomment-49277957) that this functionality can easily (and will most likely) be provided by 2.11.x’s new classpath handling code. So, the approach taken in this PR is _probably_ still valid into 2.11. Though it's obligatory to note that, of course, we don't know yet what it'll look like, or precisely when it’ll hit, but since SI-6502 is "major" and it has an owner, I would bet on it being included in 2.11.x.
    
    2. There were some questions above about reflection – I second your Java reflection-based approach for two reasons. (1) As many know, Scala reflection is not thread-safe in 2.10.x, but perhaps more importantly, (2) it’s an experimental API that is still fluctuating, and has fluctuated quite a bit so far between 2.10.x and 2.11.x. That said, it’s just a safer, more stable bet to rely on Java reflection if you can for now.
    
    Since I’m not the foremost expert on this stuff, I’m also going to ping @gkossakowski (our best bet haha) – hopefully he can confirm/deny/clarify these insights?




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53360174
  
    Jenkins, test this please




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53352042
  
    BTW this actual patch looks good to me for 2.10, so I'll probably merge it after a bit of manual testing. @pwendell should we put this in branch-1.1 as well? It is technically a bug fix, but not a critical one.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53529843
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19293/consoleFull) for PR 1929 at commit [`f420cbf`](https://github.com/apache/spark/commit/f420cbf00a5f98c8eec73d251ed1d6b9352ad063).
     * This patch merges cleanly.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52309557
  
    @mateiz @som-snytt The summary I have is that this should work for 2.10.x, as it doesn't appear that they are backporting the removal of the classpath invalidation from [scala/scala#3884](https://github.com/scala/scala/pull/3884) to 2.10.x. I've taken a closer look at the internal classpath editing in 2.10 and I don't see any potential side effects from appending to the existing merged classpath.
    
    In terms of Spark moving to 2.11(.3 or higher) in the future, one option would be to simply include a trait mixed into Global that keeps the functionality they removed. From what I've seen, the only reason they removed it was that it wasn't used in the compiler or sbt, not that it had any issues. The underlying functionality wasn't broken, so I don't see any harm in doing that as a backup.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52128287
  
    Alright, it might be good to ping them on that JIRA to make sure this doesn't break subtle things. Although obviously it does seem to work.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by heathermiller <gi...@git.apache.org>.
Github user heathermiller commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16749371
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkIMain.scala ---
    @@ -308,6 +312,56 @@ import org.apache.spark.util.Utils
           }
         }
     
    +    /**
    +     * Adds any specified jars to the compile and runtime classpaths.
    +     *
    +     * @note Currently only supports jars, not directories
    +     * @param urls The list of items to add to the compile and runtime classpaths
    +     */
    +    def addUrlsToClassPath(urls: URL*): Unit = {
    --- End diff --
    
    so if the first line in this method is:
    ```scala
    new Run  // force some initialization
    ```
    as the comment says, it forces some initialization, and the bugs below go away, **in Scala** (no idea about Java, but it should likely be fine)





[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53512263
  
    Awesome! Great find, @heathermiller! I've added an additional commit to move to that solution. @mateiz, what do you think now? This is what Global itself does to reinitialize. Is that clean enough?
    
    FYI, I ran through tests with both Scala and Java classes (from within jars) and everything checks out.
    
    I also added the SparkContext.addJar call as you requested to make this more Sparkish.
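
    For reference, a minimal sketch of that forwarding (the helper and variable names are placeholders, not the exact patch); `SparkContext.addJar` is the public API that ships a jar to the executors:

    ```scala
    import java.net.URL
    import org.apache.spark.SparkContext

    // After updating the interpreter's compile and runtime classpaths, also
    // hand each jar to the running SparkContext so cluster executors can load it.
    def addJarsToCluster(sc: SparkContext, urls: URL*): Unit = {
      urls.foreach(url => sc.addJar(url.getPath))
    }
    ```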
    
    I tested this by starting up a standalone cluster using the `sbin/start-all.sh` script to launch a cluster with a single worker and connecting using 
    `./bin/spark-shell --master spark://rcsenkbe.local:7077`. 
    
    I'm hoping this demonstrates that the jars also work on the remote cluster. I don't have access to a remote Spark cluster at the moment.
    
    ![screen shot 2014-08-26 at 7 18 16 pm](https://cloud.githubusercontent.com/assets/2481802/4053710/62a9cdba-2d80-11e4-82a1-ace9c189b8cf.png)
    
    The Java implementation:
    ```
    package my.java;
    
    public class MyJavaClass {
        public int add(int x, int y) {
            return x + y;
        }
    }
    ```
    
    The Scala implementation:
    ```
    package my.scala
    
    class MyScalaClass {
        def add(x: Int, y: Int) = x + y
    }
    ```
    
    ![screen shot 2014-08-26 at 7 20 58 pm](https://cloud.githubusercontent.com/assets/2481802/4053711/62b0d6b4-2d80-11e4-9ea5-c23b19b7cac2.png)
    ![screen shot 2014-08-26 at 7 21 30 pm](https://cloud.githubusercontent.com/assets/2481802/4053712/62b1c0d8-2d80-11e4-938e-2d11edd3a86b.png)
    ![screen shot 2014-08-26 at 7 22 25 pm](https://cloud.githubusercontent.com/assets/2481802/4053714/62b79558-2d80-11e4-9aa4-cc6bc48cfeab.png)
    ![screen shot 2014-08-26 at 7 23 04 pm](https://cloud.githubusercontent.com/assets/2481802/4053713/62b6d88e-2d80-11e4-943b-62080993035e.png)





[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53512662
  
    Cool, thanks! Running a cluster locally sounds fine as a way to test it; it will go through the same codepath.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16206874
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkIMain.scala ---
    @@ -308,6 +312,56 @@ import org.apache.spark.util.Utils
           }
         }
     
    +    /**
    +     * Adds any specified jars to the compile and runtime classpaths.
    +     *
    +     * @note Currently only supports jars, not directories
    +     * @param urls The list of items to add to the compile and runtime classpaths
    +     */
    +    def addUrlsToClassPath(urls: URL*): Unit = {
    --- End diff --
    
    This is the publicly-available method I added to SparkIMain to allow you to add jars to the compile and runtime classpaths of the interpreter (global).




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52223710
  
    I actually wasn't super concerned about new 2.10.x releases, since I don't think they'll change the REPL, but I'm just worried that there may be assumptions in the compiler that we're violating by adding these JARs. Maybe not, but it wouldn't hurt to ask them whether this is an okay fix.




[GitHub] spark pull request: [SPARK-3256] Added support for :cp that ...

Posted by gkossakowski <gi...@git.apache.org>.
Github user gkossakowski commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53823422
  
    @mateiz, we haven't been planning to touch the REPL code, and this is not a trivial request, so I'll have to ask around about this. I'll update this ticket once I know what our plans are.




[GitHub] spark pull request: [SPARK-3256] Added support for :cp that ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1929




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53312925
  
    @gkossakowski, thanks for the detailed reply. From my point of view, what we want when new JARs are added is for *earlier* JARs to take precedence. This is what makes the most sense. If you already instantiated an object from the previous version of the class and stored it in a variable, it's not possible for it to suddenly change class. So instead the effect should be the same as tacking on another JAR at the end of your classpath -- only classes that are not found in earlier JARs come from there. Would these semantics be possible to implement for 2.11?
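
    For what it's worth, those are the semantics a plain `URLClassLoader` already provides: it searches its URL list in order, so a class found in an earlier jar shadows a same-named class in a later one. A small illustration (the jar paths and class name are placeholders):

    ```scala
    import java.net.{URL, URLClassLoader}

    // If both jars define com.example.Foo, the copy in v1.jar (listed first)
    // is the one loaded; v2.jar only contributes classes that v1 lacks.
    val loader = new URLClassLoader(
      Array(new URL("file:/tmp/v1.jar"), new URL("file:/tmp/v2.jar")),
      getClass.getClassLoader)
    val foo = loader.loadClass("com.example.Foo") // resolved from v1.jar
    ```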




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by ScrapCodes <gi...@git.apache.org>.
Github user ScrapCodes commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53565268
  
    I have an easy test case which crashes the compiler. jar1 has a package `repl1` containing a case class `A`; jar2 also has the package `repl1`, containing case classes `A` and `B`.
    
    ```
    :cp jar1
    
    import repl1
    val a = repl1.A
    
    :cp jar2
    
    val b = repl1.A
    
    b == a // will crash the compiler
    ```




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53529460
  
    test this please




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53613031
  
    Hey Chip, one more thing, can you create an account on https://issues.apache.org/jira/browse/SPARK so I can assign this issue to you? Then update the pull request to say [SPARK-3256] in the title. I created this as an issue: https://issues.apache.org/jira/browse/SPARK-3256.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53375633
  
    Yeah, mine was in Scala. Very interesting that *any* compile error fixes this -- we can certainly try to have some hidden one, but it would be nice to understand what causes this ;).




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53566931
  
    @ScrapCodes, this is the case I said I didn't support: the likelihood of bringing in different jars with the same package structure and class names is low for the REPL, and it can be tricky to deal with in many cases with the current code base. This is where we would want something upstream in Scala, as @gkossakowski has mentioned. Also, as he discussed, this would really be a feature for adding new classes, not replacing existing ones.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by gkossakowski <gi...@git.apache.org>.
Github user gkossakowski commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52694293
  
    Thanks @heathermiller for summing up my conversation.
    
    I realized that I introduced a bit of confusion with my [comment](https://issues.scala-lang.org/browse/SI-6502?focusedCommentId=70407&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-70407) on SI-6502. Let me clarify.
    
    The logic that is being removed by scala/scala#3884 does support the scenario of appending a jar to the classpath. This is demonstrated by @rcsenkbeil's comment. However, that piece of logic promises much more: it promises to handle arbitrary changes to the classpath, including removal of entries and changes to single classes. It doesn't deliver on that promise. We discovered all sorts of subtle issues while working on the resident compilation mode of scalac. This is not a real surprise: the Scala compiler wasn't designed with resident compilation in mind.
    
    To sum it up: `invalidateClassPathEntries` is a half-baked implementation of a very ambitious goal. It comes with little documentation, zero tests, and no clients using it. For that reason we decided to remove it.
    
    As I mentioned in SI-6502, we have a good chance to implement a small subset of what `invalidateClassPathEntries` does, and that should be enough to satisfy Spark's needs. You guys don't need full-blown resident compilation logic just to load some jar.
    
    How do we determine what that small subset is? As I mentioned in SI-6502, the API that the `:cp` command needs has to deal with two concerns:
    
      1. What should happen when an appended jar contains packages that overlap with what's already on the classpath?
      2. What should happen when an appended jar contains a class that shadows an existing class?
    
    One might think that the first point is not important because it's unlikely the same package will span several jars. That would be true if packages were flat. However, in Scala packages are *nested*. So if you have one jar with an `org.apache.spark` package and another jar with an `org.apache.commons` package, then you have overlapping packages. From Scala's point of view you have:
    
    ```
    # first jar
      org
        apache
          spark
    
    # second jar
      org
        apache
          commons
    ```
    
    When appending the second jar, you must merge the `org` and `apache` packages. What you merge is the contents of `PackageSymbol`, which is a symbol that represents a single (nested) package from the classpath. `invalidateClassPathEntries` handles this scenario by clearing the entire contents of the symbol representing `org`, reloading fresh entries from the classpath, and making sure that the old symbol for the `spark` package is not lost (so we have just one symbol representing the `spark` package).
    
    The second point is extremely hard to implement properly. That's why I think the alternative API to the current `invalidateClassPathEntries` should just abort in such cases.
    
    To sum it up, here's what needs to happen for Scala 2.11.x:
    
      - define minimal classpath manipulation api that will be enough for Spark use case
      - write tests for scenarios outlined above
      - implement the minimal classpath manipulation (`invalidateClassPathEntries` can serve as inspiration)
      - make sure that the implementation aborts on every case that is not supported; this way we'll avoid weird compiler crashes caused by the classpath and symbols getting out of sync
    
    I hope that helps a little bit.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16208310
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkIMain.scala ---
    @@ -130,6 +131,9 @@ import org.apache.spark.util.Utils
         private var _classLoader: AbstractFileClassLoader = null                              // active classloader
         private val _compiler: Global                     = newCompiler(settings, reporter)   // our private compiler
     
    +    private trait ExposeAddUrl extends URLClassLoader { def addNewUrl(url: URL) = this.addURL(url) }
    --- End diff --
    
    I was previously using the following inline:
    
    ```
    new URLClassLoader(...) {
      def addNewUrl(url: URL) = this.addURL(url)
    }
    ```
    But I was unable to access the exposed method, since it was defined on an anonymous subclass. In terms of providing a trait versus a class that extends URLClassLoader, I figured a trait was a lighter way to mix in the needed exposure.
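
    A small sketch of the difference (constructor arguments are placeholders): the mixin gives the instance a refined static type through which `addNewUrl` remains visible, whereas an extra method on an anonymous subclass can't be reached through a plain `URLClassLoader` reference.

    ```scala
    import java.net.{URL, URLClassLoader}

    // addURL is protected in URLClassLoader; the trait re-exposes it publicly.
    trait ExposeAddUrl extends URLClassLoader {
      def addNewUrl(url: URL): Unit = this.addURL(url)
    }

    val loader: URLClassLoader with ExposeAddUrl =
      new URLClassLoader(Array.empty[URL], getClass.getClassLoader) with ExposeAddUrl

    loader.addNewUrl(new URL("file:/tmp/extra.jar")) // callable via the mixin type
    ```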




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53361297
  
    BTW, the problem happens even if I call `new test.Test` before creating it through reflection. Also, since we're in Spark-land, it would probably be good to call SparkContext.addJar on the newly added JARs to send them to executors in the cluster.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53612308
  
    I think :cp is fine, no reason to make a new command. I tested this again and it seems to work well (also on a cluster), so I'm going to merge this into 1.2. And I'll open a new JIRA for a Scala 2.11 version of this.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16206505
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkIMain.scala ---
    @@ -130,6 +131,9 @@ import org.apache.spark.util.Utils
         private var _classLoader: AbstractFileClassLoader = null                              // active classloader
         private val _compiler: Global                     = newCompiler(settings, reporter)   // our private compiler
     
    +    private trait ExposeAddUrl extends URLClassLoader { def addNewUrl(url: URL) = this.addURL(url) }
    +    private var _runtimeClassLoader: URLClassLoader with ExposeAddUrl = null              // wrapper exposing addURL
    --- End diff --
    
    I needed to add a parent loader that explicitly exposed the addURL method so I could add new jars to our runtime; otherwise, just adding to the compile-time classpath would allow you to import the classes but not use them in the interpreter.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53369372
  
    @mateiz I have not run into that issue. In fact, I just cloned a copy of my branch I submitted on a fresh computer and it still worked fine.
    
    To test, I cloned the branch and ran `sbt/sbt assembly` to build the new jars.
    
    From there, I created a file called _TestClass.java_ as an illustration:
    
    ```
    package com.ibm.testjar;
    
    public class TestClass {
        public void runMe() {
            System.out.println("I WAS RUN!");
        }
    }
    ```
    
    I created a jar from this by issuing `javac TestClass.java`, `mv TestClass.class com/ibm/testjar/`, and `jar -cf TestJar.jar com/ibm/testjar/TestClass.class`.
    
    Finally, I ran two different cases after starting the Spark shell through `bin/spark-shell` (after `sbt/sbt assembly`):
    
    ![screen shot 2014-08-25 at 9 43 08 pm](https://cloud.githubusercontent.com/assets/2481802/4039431/6189b356-2ccb-11e4-92b1-41d78102218d.png)
    ![screen shot 2014-08-25 at 9 44 35 pm](https://cloud.githubusercontent.com/assets/2481802/4039432/61905b20-2ccb-11e4-8333-dc3ee2bd511b.png)





[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52117091
  
    @mateiz I don't know what the Scala team's plans would be. I know that the only suggestion thus far was to wipe the global reference and recreate it (https://issues.scala-lang.org/browse/SI-6502?focusedCommentId=68422&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-68422), which is not ideal. There hasn't been any discussion since February on this.
    
    At a quick glance, it doesn't look like they have fixed anything with the REPL in 2.11.2. The code was moved (https://github.com/scala/scala/blob/v2.11.2/src/repl/scala/tools/nsc/interpreter/ILoop.scala), but has the same issue.
    
    The 2.11.2 REPL gives me the same problem:
    
    ![screen shot 2014-08-13 at 4 48 39 pm](https://cloud.githubusercontent.com/assets/2481802/3913239/9f60ab04-2333-11e4-89e4-44bc107caeee.png)





[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52223769
  
    BTW I'd also be okay merging it if there are only minor problems, since adding these JARs at runtime is very useful.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by som-snytt <gi...@git.apache.org>.
Github user som-snytt commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52670339
  
    FWIW, I remembered my fix included overriding the ClassPath (in the JavaPlatform) of the REPL compiler.  So it was a bit more nuanced in handling the replacement; it was a long time ago now, in Internet or dog years.  But having an "additive" solution is better than nothing, the perfect being the enemy of the good and all that.




[GitHub] spark pull request: [SPARK-3256] Added support for :cp that ...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53631925
  
    Alright, thanks! I've merged this in. I also opened https://issues.apache.org/jira/browse/SPARK-3257 to do this for the Scala 2.11 REPL. @gkossakowski it would be great if you guys add an API for this in the next Scala 2.11 release -- our 1.2 release will be in roughly 3 months.




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53512666
  
    Jenkins, test this please




[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52190737
  
    @ScrapCodes I recently picked up Scala and am much more familiar with Java's reflection than Scala's runtime mirrors and such. Do you have any pseudocode to express what this would look like using Scala's mirrors?
    
    @mateiz What side effects are you concerned about?


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by heathermiller <gi...@git.apache.org>.
Github user heathermiller commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53507521
  
    Ok, so I think I have a solution. It took an embarrassing amount of experimentation and grokking of the REPL (not my area of expertise), but if you just create a dummy `Run` instance, it seems to force symbols to be initialized when you need them to be (i.e., for classloading).
    
    So, in your implementation of `addUrlsToClassPath`, what seems to work is:
    ```scala
        def addUrlsToClassPath(urls: URL*): Unit = {
          new Run // force some initialization
          urls.foreach(_runtimeClassLoader.addNewUrl) // Add jars/classes to runtime for execution
          updateCompilerClassPath(urls: _*)           // Add jars/classes to compile time for compiling
        }
    ```
    
    After running into 10,000 walls trying to force the initialization of symbols in previous/current runs, I discovered this happening in Global:
    https://github.com/scala/scala/blob/2.11.x/src/compiler/scala/tools/nsc/Global.scala#L754
    which in turn forces initialization in `Run`, in places like:
    https://github.com/scala/scala/blob/2.11.x/src/compiler/scala/tools/nsc/Global.scala#L1051
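    
    (For reference, `_runtimeClassLoader.addNewUrl` presumably comes from a loader along these lines; a minimal sketch, since `java.net.URLClassLoader#addURL` is protected and has to be surfaced somewhere:)
    
    ```scala
    import java.net.{URL, URLClassLoader}
    
    // Expose the protected addURL so jars can be appended to the runtime
    // classpath after the interpreter's loader has been created.
    class AddableURLClassLoader(urls: Array[URL], parent: ClassLoader)
        extends URLClassLoader(urls, parent) {
      def addNewUrl(url: URL): Unit = addURL(url)
    }
    ```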
    



[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52114333
  
    Just curious, is this what the Scala team intends to do here (or did in 2.11), or could it interfere with other things in the compiler?


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52110677
  
    Can one of the admins verify this patch?


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53361051
  
    Hey @rcsenkbeil, so I tested this out manually, but unfortunately it doesn't seem to quite work -- you can't use the new classes in the REPL, though you can create them using reflection. Here's a sample session:
    
    ```
    scala> :cp /Users/matei/workspace/jar-test/simple-jar-v1.jar
    Added '/Users/matei/workspace/jar-test/simple-jar-v1.jar'.  Your new classpath is:
    "/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/JObjC.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Ho
 me/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/System/Library/Java/Extensions/MRJToolkit.jar:/System/Library/Java/Extensions/QTJava.zip:/Users/matei/workspace/apache-spark/conf:/Users/matei/workspace/apache-spark/assembly/target/scala-2.10/spark-assembly-1.1.0-SNAPSHOT-hadoop1.0.4.jar:/Users/matei/workspace/jar-test/simple-jar-v1.jar"
    
    scala> Class.forName("test.Test").newInstance
    res0: Any = Test2
    
    scala> new test.Test
    <console>:11: error: not found: value test
                  new test.Test
                      ^
    ```
    
    My simple-jar-v1.jar just contains a class called Test in package test.
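    
    (For concreteness, a hypothetical reconstruction of the jar's contents; the `toString` returning "Test2" is inferred from the `res0` output above, not copied from the actual jar:)
    
    ```scala
    package test
    
    // Hypothetical stand-in for the single class inside simple-jar-v1.jar.
    class Test {
      override def toString: String = "Test2"
    }
    ```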
    
    Have you seen this happen yourself? It doesn't seem like this will be usable enough in its current form.


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53626590
  
    @mateiz Created an account located at https://issues.apache.org/jira/secure/ViewProfile.jspa?name=senkwich


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1929#discussion_r16206403
  
    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
    @@ -711,22 +714,22 @@ class SparkILoop(in0: Option[BufferedReader], protected val out: JPrintWriter,
             added = true
             addedClasspath = ClassPath.join(addedClasspath, f.path)
             totalClasspath = ClassPath.join(settings.classpath.value, addedClasspath)
    +        intp.addUrlsToClassPath(f.toURI.toURL)
           }
         }
    -    if (added) replay()
       }
     
       def addClasspath(arg: String): Unit = {
         val f = File(arg).normalize
         if (f.exists) {
           addedClasspath = ClassPath.join(addedClasspath, f.path)
    -      val totalClasspath = ClassPath.join(settings.classpath.value, addedClasspath)
    -      echo("Added '%s'.  Your new classpath is:\n\"%s\"".format(f.path, totalClasspath))
    -      replay()
    +      intp.addUrlsToClassPath(f.toURI.toURL)
    --- End diff --
    
    This adds the specified jar to both our compile-time and runtime classpaths in the interpreter.


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53369528
  
    Ah, it appears that I run into your issue when I create a class in Scala and attempt to import it. My previous tests had been with Java.
    
    So, when I use code like this:
    
    ```scala
    package test
    
    class Test {
        def runMe = println("I'm Scala and I was also run!")
    }
    ```
    
    and build a jar in the same manner as I previously did with Java, I run into your issue. I'll investigate to see if I can figure out what is causing the difference (or at least what is triggered by running an import before adding the jar, to see if I can duplicate that behavior).


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52451682
  
    Got it, thanks for looking into this.


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by rcsenkbeil <gi...@git.apache.org>.
Github user rcsenkbeil commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53377066
  
    I've been fiddling around in the debugger, trying to pinpoint what occurs during a compile error that would affect the state of Global, considering a new Run instance is created per interpret call. I have noticed that the _reportCompileErrors_ method of the Global#Run class does some resetting:
    
    ```scala
        def reportCompileErrors() {
          if (reporter.hasErrors) {
            for ((sym, file) <- symSource.iterator) {
              sym.reset(new loaders.SourcefileLoader(file))
              if (sym.isTerm)
                sym.moduleClass reset loaders.moduleClassLoader
            }
          }
          else {
            allConditionalWarnings foreach (_.summarize)
    
            if (seenMacroExpansionsFallingBack)
              warning("some macros could not be expanded and code fell back to overridden methods;"+
                      "\nrecompiling with generated classfiles on the classpath might help.")
            // todo: migrationWarnings
          }
        }
    ```
    
    While symSource is local to the Run instance, I'm not sure whether these resets would affect something in Global. I tried to bypass the resets by setting a breakpoint on the `if (reporter.hasErrors) {` line and executing `reporter.reset()` to alter the state before entering the block. That ended up giving me the compile errors we've seen with `new test.Test`, but I'm not sure my attempt really proved that this chunk of code is the culprit (or the solution).
    
    Also, based on your comment that only symbol-not-found errors might cause it, I was at least able to trigger that kind of compile error to load a Scala class as well:
    ![screen shot 2014-08-26 at 12 25 26 am](https://cloud.githubusercontent.com/assets/2481802/4040334/db40421c-2ce1-11e4-8d02-795c104408ec.png)
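    
    For anyone wanting to reproduce the trick, here is a minimal sketch of silently triggering such a compile error against an IMain-like interpreter (`beQuietDuring` and `interpret` are standard IMain members; the helper name and dummy expression are just illustrative):
    
    ```scala
    import scala.tools.nsc.interpreter.IMain
    
    // Interpreting a reference to a symbol that cannot exist forces the
    // compiler down its error path, which as a side effect appears to
    // re-initialize symbol loaders against the updated classpath.
    def forceSymbolReload(intp: IMain): Unit = {
      intp.beQuietDuring {
        intp.interpret("new _root_.com.nonexistent.DoesNotExist")
      }
    }
    ```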



[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52113271
  
    @ScrapCodes can you look at this one?


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53533093
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19293/consoleFull) for   PR 1929 at commit [`f420cbf`](https://github.com/apache/spark/commit/f420cbf00a5f98c8eec73d251ed1d6b9352ad063).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `rem In this case, leave out the main class (org.apache.spark.deploy.SparkSubmit) and use our own.`



[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53375860
  
    BTW it might be only symbol-not-found errors that cause it, not any error; maybe some of the machinery that reads from the classpath is lazily initialized.


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by ScrapCodes <gi...@git.apache.org>.
Github user ScrapCodes commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52149882
  
    Looks okay to me. I was wondering if you tried Scala reflection before resorting to Java reflection?
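    
    (A rough, untested illustration of that mirror-based alternative; the helper and its target are hypothetical, not code from this PR:)
    
    ```scala
    import scala.reflect.runtime.{universe => ru}
    
    // Invoke a zero-argument method by name using Scala's runtime mirrors
    // rather than java.lang.reflect.
    def invokeByName(instance: AnyRef, methodName: String): Any = {
      val mirror = ru.runtimeMirror(instance.getClass.getClassLoader)
      val instanceMirror = mirror.reflect(instance)
      val method = instanceMirror.symbol.typeSignature
        .member(ru.newTermName(methodName)).asMethod
      instanceMirror.reflectMethod(method)()
    }
    ```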


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53467652
  
    Yeah, probably with `myFunc()` it does another search for a version of `myFunc` without arguments.


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by som-snytt <gi...@git.apache.org>.
Github user som-snytt commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52230440
  
    This is similar to what I tried last year, but I recently saw:
    
    https://github.com/scala/scala/pull/3884
    
    which says the mechanisms are in flux in 2.11.



[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by thunterdb <gi...@git.apache.org>.
Github user thunterdb commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-52208640
  
    I believe @mateiz is concerned about compatibility with other Scala 2.10.x versions. We would have to make sure this works for all supported Scala versions, as the internals may change between minor releases. In that case, we might have to add different code paths for different versions. This is unlikely, though, and I think the benefits far outweigh the costs in this case.


[GitHub] spark pull request: Added support for :cp that was broken in...

Posted by gkossakowski <gi...@git.apache.org>.
Github user gkossakowski commented on the pull request:

    https://github.com/apache/spark/pull/1929#issuecomment-53536315
  
    > @gkossakowski, thanks for the detailed reply. From my point of view, what we want when new JARs are added is for earlier JARs to take precedence. This is what makes the most sense. If you already instantiated an object from the previous version of the class and stored it in a variable, it's not possible for it to suddenly change class. So instead the effect should be the same as tacking on another JAR at the end of your classpath -- only classes that are not found in earlier JARs come from there. Would these semantics be possible to implement for 2.11?
    
    We agree on the semantics. I called changing an existing class "shadowing," but we mean the same thing: changes to existing classes should not be allowed.
    
    Adding jars to the classpath means just adding new classes that were not previously available. For that we need merging of packages, as I explained earlier. It's possible to implement this kind of API for 2.11, but it doesn't exist yet.
    
    I hope we can figure out how to merge your changes and work on the API on the compiler side. The current approach of going deep into the internals of `Global`, as seen in this PR, is fine as short-term experimentation so you can quickly deliver a fix to your users. The long-term solution would be migrating most of the Spark code that talks to compiler internals into the Scala code base.
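    
    To make those agreed semantics concrete, here is a small sketch using only the JDK (`old.jar` and `new.jar` are hypothetical jars that both define `pkg.Foo`):
    
    ```scala
    import java.io.File
    import java.net.{URL, URLClassLoader}
    
    object PrecedenceDemo extends App {
      // URLClassLoader searches its entries in order, so with old.jar listed
      // first, the pkg.Foo it defines always wins; a jar appended later can
      // only contribute names that nothing earlier already resolves.
      val oldJar: URL = new File("old.jar").toURI.toURL // hypothetical paths
      val newJar: URL = new File("new.jar").toURI.toURL
      val loader = new URLClassLoader(Array(oldJar, newJar), getClass.getClassLoader)
      println(loader.loadClass("pkg.Foo")) // resolved from old.jar
    }
    ```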

