Posted to reviews@spark.apache.org by retronym <gi...@git.apache.org> on 2018/04/03 10:41:52 UTC

[GitHub] spark issue #19675: [SPARK-14540][BUILD] Support Scala 2.12 closures and Jav...

Github user retronym commented on the issue:

    https://github.com/apache/spark/pull/19675
  
    I'm happy to help, but I would appreciate it if someone could pose the question you have about the lambda encoding / `SerializedLambda` in a standalone fashion.
    
    In the meantime, here are some details that might help.
    
    To find the captured fields of a Scala 2.12+ (or Java 8+) lambda, you can use:
    
    ```
    scala> :paste -raw
    // Entering paste mode (ctrl-D to finish)
    
    package p1;
    class C { val x = "foo"; def test1 = () => this.x; def test2 = () => C.this; def test3 = () => "" }
    
    // Exiting paste mode, now interpreting.
    
    
    scala> def inspect(closure: Object) = { val writeReplace = closure.getClass.getDeclaredMethod("writeReplace"); writeReplace.setAccessible(true); val sl = writeReplace.invoke(closure).asInstanceOf[java.lang.invoke.SerializedLambda]; println(closure); println(sl); println(List.tabulate(sl.getCapturedArgCount)(sl.getCapturedArg))}
    inspect: (closure: Object)Unit
    
    scala> inspect(new p1.C().test1)
    p1.C$$Lambda$1072/2133998394@79b37277
    SerializedLambda[capturingClass=class p1.C, functionalInterfaceMethod=scala/Function0.apply:()Ljava/lang/Object;, implementation=invokeStatic p1/C.$anonfun$test1$1:(Lp1/C;)Ljava/lang/String;, instantiatedMethodType=()Ljava/lang/String;, numCaptured=1]
    List(p1.C@107f5239)
    
    scala> inspect(new p1.C().test2)
    p1.C$$Lambda$1412/2049133976@1215aae8
    SerializedLambda[capturingClass=class p1.C, functionalInterfaceMethod=scala/Function0.apply:()Ljava/lang/Object;, implementation=invokeStatic p1/C.$anonfun$test2$1:(Lp1/C;)Lp1/C;, instantiatedMethodType=()Lp1/C;, numCaptured=1]
    List(p1.C@21378791)
    
    scala> inspect(new p1.C().test3)
    p1.C$$Lambda$1413/399039406@7643ae41
    SerializedLambda[capturingClass=class p1.C, functionalInterfaceMethod=scala/Function0.apply:()Ljava/lang/Object;, implementation=invokeStatic p1/C.$anonfun$test3$1:()Ljava/lang/String;, instantiatedMethodType=()Ljava/lang/String;, numCaptured=0]
    List()
    ```
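    
    For use outside the REPL, the same reflection trick can be wrapped in a small helper. This is only a sketch: the names `LambdaInspector`, `serializedLambda` and `capturedArgs` are made up for illustration, and it assumes the closure was compiled as a `LambdaMetafactory`-based lambda (otherwise there is no `writeReplace` method and the helper returns `None` / an empty list).
    
    ```scala
    import java.lang.invoke.SerializedLambda
    import scala.util.Try
    
    // Hypothetical helper, not part of Spark or the Scala library.
    object LambdaInspector {
      // Returns the serialization proxy of an indy lambda, or None if the
      // object has no accessible writeReplace method or it returns something else.
      def serializedLambda(closure: AnyRef): Option[SerializedLambda] =
        Try {
          val writeReplace = closure.getClass.getDeclaredMethod("writeReplace")
          writeReplace.setAccessible(true)
          writeReplace.invoke(closure)
        }.toOption.collect { case sl: SerializedLambda => sl }
    
      // Lists the values captured by the lambda, in declaration order.
      def capturedArgs(closure: AnyRef): List[AnyRef] =
        serializedLambda(closure).toList.flatMap { sl =>
          List.tabulate(sl.getCapturedArgCount)(sl.getCapturedArg)
        }
    }
    ```
    
    With the class `C` from the transcript above, `LambdaInspector.capturedArgs(c.test1)` should contain just the enclosing instance `c`, while `capturedArgs(c.test3)` should be empty.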
    
    If you need to access the bytecode of such a lambda to try to find unused fields, things get harder. The nuclear option is to have your users compile with `-Ydelambdafy:inline`, which avoids the use of `LambdaMetafactory` and spins up the lambda anonymous classes ahead-of-time in `scalac`.
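    
    As an illustration only, and assuming the user's project is built with sbt (the comment above does not say which build tool is involved), the flag could be passed like this:
    
    ```scala
    // build.sbt (sketch): compile lambdas to anonymous classes in scalac
    // instead of going through LambdaMetafactory at runtime.
    scalacOptions += "-Ydelambdafy:inline"
    ```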
    
    But I would imagine you have the same problems for Java or for Scala, unless there is a problem in Scala's lambda translation where we still capture more state (potentially non-serializable or huge) than the equivalent Java code does. So it would be good to discuss a concrete problematic example to see if we can fix something in the compiler.


