Posted to dev@spark.apache.org by Marek Kolodziej <mk...@gmail.com> on 2015/08/21 14:29:13 UTC

Tungsten and sun.misc.Unsafe

Hello,

I attended the Tungsten-related presentations at Spark Summit (by Josh
Rosen) and at Big Data Scala (by Matei Zaharia). Needless to say, this
project holds great promise for major performance improvements.

At Josh's talk, I heard about the use of sun.misc.Unsafe as a way of
achieving some of these optimizations (e.g. slides 11-17 of Josh's
presentation:
http://www.slideshare.net/SparkSummit/deep-dive-into-project-tungsten-josh-rosen).
I have no problems with the use of Unsafe in the code itself (I've done it
before myself, too). However, I think there is a considerable risk
associated with beginning the use of Unsafe now, because Oracle is
determined to limit access to APIs such as Unsafe starting in Java 9.

JEP 260 <http://openjdk.java.net/jeps/260> was filed specifically to limit
access to internal JDK APIs that were "never intended for external use",
including sun.misc.*. The JEP does say that the functionality of
sun.misc.Unsafe is to remain available even as other internal APIs are
blocked for non-JDK use. However, it also says that "the functionality of
many methods of this class is now available via *variable handles (JEP 193
<http://openjdk.java.net/jeps/193>).*" If direct access to
sun.misc.Unsafe is blocked and only variable-handle access remains,
this may mean more than just a need for code refactoring: functionality
such as doing "malloc" from Spark core may be restricted.
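
For readers who have not used the API, here is a minimal sketch of the kind
of "malloc" in question. It is an illustration only and is not taken from
Spark; the class name UnsafeMallocSketch and the method roundTrip are made
up for this example. It assumes a JDK that still exposes sun.misc.Unsafe to
application code (newer JDKs may print warnings about it).

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of Spark or the JDK.
final class UnsafeMallocSketch {

    // Fetch the Unsafe singleton through its private static field,
    // the usual workaround for application code.
    static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Allocate 8 bytes off-heap, write a long, read it back, free the memory.
    static long roundTrip(long value) {
        Unsafe unsafe = loadUnsafe();
        long address = unsafe.allocateMemory(Long.BYTES); // the "malloc"
        try {
            unsafe.putLong(address, value);
            return unsafe.getLong(address);
        } finally {
            unsafe.freeMemory(address); // manual "free"; the GC does not track this memory
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```

Variable handles cover field and array element access, but they do not by
themselves allocate or free raw memory like this, which is why that part is
the concern.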

JEP 260 has evolved quite a bit over time and the wording available now
(after the Aug. 4, 2015 update) seems more reasonable than before. Nevertheless,
Hazelcast and other companies whose technologies depend on the availability
of Unsafe started a Google doc here
<https://docs.google.com/document/d/1GDm_cAxYInmoHMor-AkStzWvwE9pw6tnz_CebJQxuUE/edit#heading=h.brct71tr6e13>
.

I doubt that Oracle would want to make life difficult for everyone. In
addition to Spark's code base, projects such as Akka, Cassandra, Hibernate,
Netty, Neo4j and Spring (among many others) depend on Unsafe. Still, there
are tons of posts about this issue in the Java community (e.g. this
Hazelcast interview
<https://jaxenter.com/hazelcast-on-java-unsafe-class-119286.html> from
Aug. 3, the day before the latest update to JEP 260). There are tons of
concerned posts on the blogosphere, too (e.g. here
<http://blog.dripstat.com/removal-of-sun-misc-unsafe-a-disaster-in-the-making/>
).

Have the leaders of the Spark community been following these Unsafe-related
developments, and if so, what's Spark's plan for handling whatever Oracle
throws our way?

Marek

Re: Tungsten and sun.misc.Unsafe

Posted by Steve Loughran <st...@hortonworks.com>.
On 21 Aug 2015, at 05:29, Marek Kolodziej <mk...@gmail.com> wrote:

> I doubt that Oracle would want to make life difficult for everyone. In addition to Spark's code base, projects such as Akka, Cassandra, Hibernate, Netty, Neo4j and Spring (among many others) depend on Unsafe. Still, there are tons of posts about this issue in the Java community (e.g. here<https://jaxenter.com/hazelcast-on-java-unsafe-class-119286.html>'s a Hazelcast interview, also from Aug. 3, the day before the latest update to JEP 260). There are tons of concerned posts on the blogosphere, too (e.g. here<http://blog.dripstat.com/removal-of-sun-misc-unsafe-a-disaster-in-the-making/>).
>
> Have the leaders of the Spark community been following these Unsafe-related developments and if so, what's Spark's plan of handling whatever Oracle throws our way?

I don't know about Spark, but I know that Hadoop uses a lot of internal APIs, introspecting into sun.security for access to Kerberos operations and switching to the ibm. equivalent. Without that, Kerberos simply doesn't work.
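
A hedged sketch of what that kind of vendor switch can look like. This is
not Hadoop's code; Krb5ModuleSketch and its methods are hypothetical, and
the two class names are the vendor-specific JAAS Kerberos login modules as
I understand them, so treat them as assumptions rather than a reference.

```java
// Hypothetical demo class, not part of Hadoop.
final class Krb5ModuleSketch {

    // Assumed vendor-specific class names (not taken from the post above).
    static final String SUN_MODULE = "com.sun.security.auth.module.Krb5LoginModule";
    static final String IBM_MODULE = "com.ibm.security.auth.module.Krb5LoginModule";

    // Pick the login module class name from the java.vendor string.
    static String loginModuleFor(String javaVendor) {
        boolean ibm = javaVendor != null && javaVendor.contains("IBM");
        return ibm ? IBM_MODULE : SUN_MODULE;
    }

    // Check for the class by name so nothing links against it at compile time.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, Krb5ModuleSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = loginModuleFor(System.getProperty("java.vendor"));
        System.out.println(name + " present: " + isPresent(name));
    }
}
```

Code written this way keeps compiling when an internal package is hidden,
but it still fails at run time if the class can no longer be loaded.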

As of now, the project's stance is "if Oracle want Hadoop to run on Oracle Java 9, they'd better have a plan".

Re: Tungsten and sun.misc.Unsafe

Posted by Marek Kolodziej <mk...@gmail.com>.
Thanks Reynold, that helps a lot. I'm glad you're involved with that Google
Doc community effort. I think it's because of that doc that the JEP's
wording and scope changed for the better since it originally got
introduced.

Marek

On Fri, Aug 21, 2015 at 11:18 AM, Reynold Xin <rx...@databricks.com> wrote:

> I'm actually somewhat involved with the Google Docs you linked to.
>
> I don't think Oracle will remove Unsafe in JVM 9. As you said, JEP 260
> already proposes making Unsafe available. Given the widespread use of
> Unsafe for performance and advanced functionalities, I don't think Oracle
> can just remove it in one release. If they do, there will be strong
> backlash and the act would significantly undermine the credibility of the
> JVM as a long-term platform.
>
> Note that for Spark itself, we move pretty fast and can replace all the
> use of Unsafe with a newer alternative in one release if absolutely
> necessary (the actual coding takes only a day or two).
>
>
>
> On Fri, Aug 21, 2015 at 5:29 AM, Marek Kolodziej <mk...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I attended the Tungsten-related presentations at Spark Summit (by Josh
>> Rosen) and at Big Data Scala (by Matei Zaharia). Needless to say, this
>> project holds great promise for major performance improvements.
>>
>> At Josh's talk, I heard about the use of sun.misc.Unsafe as a way of
>> achieving some of these optimizations (e.g. slides 11-17 of Josh's
>> presentation:
>> http://www.slideshare.net/SparkSummit/deep-dive-into-project-tungsten-josh-rosen).
>> I have no problems with the use of Unsafe in the code itself (I've done it
>> before myself, too), however I think there is a considerable risk
>> associated with beginning the use of Unsafe now, because Oracle is
>> determined to limit access to APIs such as Unsafe starting in Java 9.
>>
>> JEP 260 <http://openjdk.java.net/jeps/260> was filed specifically to
>> limit access to internal JDK APIs that were "never intended for external
>> use, including "sun.misc.*" The JEP does say that the functionality of
>> sun.misc.Unsafe is to remain available even as other internal APIs are
>> blocked for non-JDK use, however, it also says that "the functionality of
>> many methods of this class is now available via *variable handles (JEP
>> 193 <http://openjdk.java.net/jeps/193>).*" If the direct access to
>> sun.misc.Unsafe is blocked and only the variable handles access remains,
>> this may mean more than just a need for code refactoring - functionality
>> such as doing "malloc" from Spark core may be restricted.
>>
>> JEP 260 has evolved quite a bit over time and the wording available now
>> (after the Aug. 4, 2015) seems more reasonable than before. Nevertheless,
>> Hazelcast and other companies whose technologies depend on the availability
>> of Unsafe started a Google doc here
>> <https://docs.google.com/document/d/1GDm_cAxYInmoHMor-AkStzWvwE9pw6tnz_CebJQxuUE/edit#heading=h.brct71tr6e13>
>> .
>>
>> I doubt that Oracle would want to make life difficult for everyone. In
>> addition to Spark's code base, projects such as Akka, Cassandra, Hibernate,
>> Netty, Neo4j and Spring (among many others) depend on Unsafe. Still, there
>> are tons of posts about this issue in the Java community (e.g. here
>> <https://jaxenter.com/hazelcast-on-java-unsafe-class-119286.html>'s a
>> Hazelcast interview, also from Aug. 3, the day before the latest update to
>> JEP 260). There are tons of concerned posts on the blogosphere, too (e.g.
>> here
>> <http://blog.dripstat.com/removal-of-sun-misc-unsafe-a-disaster-in-the-making/>
>> ).
>>
>> Have the leaders of the Spark community been following these
>> Unsafe-related developments and if so, what's Spark's plan of handling
>> whatever Oracle throws our way?
>>
>> Marek
>>
>
>

Re: Tungsten and sun.misc.Unsafe

Posted by Reynold Xin <rx...@databricks.com>.
I'm actually somewhat involved with the Google Docs you linked to.

I don't think Oracle will remove Unsafe in JVM 9. As you said, JEP 260
already proposes making Unsafe available. Given the widespread use of
Unsafe for performance and advanced functionalities, I don't think Oracle
can just remove it in one release. If they do, there will be strong
backlash and the act would significantly undermine the credibility of the
JVM as a long-term platform.

Note that for Spark itself, we move pretty fast and can replace all the use
of Unsafe with a newer alternative in one release if absolutely necessary
(the actual coding takes only a day or two).
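
A rough illustration of why such a swap can be cheap when raw memory access
sits behind one narrow interface. MemoryBlock, DirectBufferBlock and
MemoryBlockDemo are hypothetical names, not Spark classes, and this sketch
makes no claim about how Spark is actually structured. The backing shown
uses only the supported java.nio API; an Unsafe-based backing could
implement the same interface.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical seam: callers depend on this interface, not on sun.misc.Unsafe.
interface MemoryBlock {
    void putLong(int offset, long value);
    long getLong(int offset);
}

// Off-heap backing built on a direct ByteBuffer (supported API).
final class DirectBufferBlock implements MemoryBlock {
    private final ByteBuffer buffer;

    DirectBufferBlock(int sizeBytes) {
        buffer = ByteBuffer.allocateDirect(sizeBytes).order(ByteOrder.nativeOrder());
    }

    @Override
    public void putLong(int offset, long value) {
        buffer.putLong(offset, value); // absolute write at a byte offset
    }

    @Override
    public long getLong(int offset) {
        return buffer.getLong(offset); // absolute read at a byte offset
    }
}

final class MemoryBlockDemo {
    // Sums `count` consecutive longs; works with any MemoryBlock backing.
    static long sum(MemoryBlock block, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += block.getLong(i * Long.BYTES);
        }
        return total;
    }

    public static void main(String[] args) {
        MemoryBlock block = new DirectBufferBlock(2 * Long.BYTES);
        block.putLong(0, 1L);
        block.putLong(Long.BYTES, 2L);
        System.out.println(sum(block, 2));
    }
}
```

With this shape, replacing the backing means writing one new class; the
calling code stays the same. Whether performance stays the same is a
separate question that this sketch does not address.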



On Fri, Aug 21, 2015 at 5:29 AM, Marek Kolodziej <mk...@gmail.com>
wrote:

> Hello,
>
> I attended the Tungsten-related presentations at Spark Summit (by Josh
> Rosen) and at Big Data Scala (by Matei Zaharia). Needless to say, this
> project holds great promise for major performance improvements.
>
> At Josh's talk, I heard about the use of sun.misc.Unsafe as a way of
> achieving some of these optimizations (e.g. slides 11-17 of Josh's
> presentation:
> http://www.slideshare.net/SparkSummit/deep-dive-into-project-tungsten-josh-rosen).
> I have no problems with the use of Unsafe in the code itself (I've done it
> before myself, too), however I think there is a considerable risk
> associated with beginning the use of Unsafe now, because Oracle is
> determined to limit access to APIs such as Unsafe starting in Java 9.
>
> JEP 260 <http://openjdk.java.net/jeps/260> was filed specifically to
> limit access to internal JDK APIs that were "never intended for external
> use, including "sun.misc.*" The JEP does say that the functionality of
> sun.misc.Unsafe is to remain available even as other internal APIs are
> blocked for non-JDK use, however, it also says that "the functionality of
> many methods of this class is now available via *variable handles (JEP
> 193 <http://openjdk.java.net/jeps/193>).*" If the direct access to
> sun.misc.Unsafe is blocked and only the variable handles access remains,
> this may mean more than just a need for code refactoring - functionality
> such as doing "malloc" from Spark core may be restricted.
>
> JEP 260 has evolved quite a bit over time and the wording available now
> (after the Aug. 4, 2015) seems more reasonable than before. Nevertheless,
> Hazelcast and other companies whose technologies depend on the availability
> of Unsafe started a Google doc here
> <https://docs.google.com/document/d/1GDm_cAxYInmoHMor-AkStzWvwE9pw6tnz_CebJQxuUE/edit#heading=h.brct71tr6e13>
> .
>
> I doubt that Oracle would want to make life difficult for everyone. In
> addition to Spark's code base, projects such as Akka, Cassandra, Hibernate,
> Netty, Neo4j and Spring (among many others) depend on Unsafe. Still, there
> are tons of posts about this issue in the Java community (e.g. here
> <https://jaxenter.com/hazelcast-on-java-unsafe-class-119286.html>'s a
> Hazelcast interview, also from Aug. 3, the day before the latest update to
> JEP 260). There are tons of concerned posts on the blogosphere, too (e.g.
> here
> <http://blog.dripstat.com/removal-of-sun-misc-unsafe-a-disaster-in-the-making/>
> ).
>
> Have the leaders of the Spark community been following these
> Unsafe-related developments and if so, what's Spark's plan of handling
> whatever Oracle throws our way?
>
> Marek
>
