Posted to user@spark.apache.org by unk1102 <um...@gmail.com> on 2016/01/08 21:21:13 UTC

Do we need to enabled Tungsten sort in Spark 1.6?

Hi, I was using Spark 1.5 with Tungsten sort, and now that I am using Spark 1.6
I don't see any difference; I was expecting Spark 1.6 to be faster. Anyway, do
we need to enable the Tungsten and unsafe options, or are they enabled by
default? I see in the documentation that the default shuffle manager is sort; I
thought it was Tungsten, no? Please guide.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Do-we-need-to-enabled-Tungsten-sort-in-Spark-1-6-tp25923.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Do we need to enabled Tungsten sort in Spark 1.6?

Posted by Chris Fregly <ch...@fregly.com>.
Yeah, this confused me, as well.  Good question, Umesh.

As Ted pointed out:  between Spark 1.5 and 1.6,
o.a.s.shuffle.unsafe.UnsafeShuffleManager no longer exists as a separate
shuffle manager.  Here's the old code (notice the o.a.s.shuffle.unsafe
package):

https://github.com/apache/spark/blob/branch-1.5/core/src/main/scala/org/apache/spark/shuffle/unsafe/UnsafeShuffleManager.scala

The functionality has essentially been rolled into
o.a.s.shuffle.sort.SortShuffleManager with the help of a Scala match/case
statement.  Here's the newer code (notice the o.a.s.shuffle.unsafe package
is gone):

https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala
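
A minimal sketch of that idea, not the actual Spark source: one manager
registers a shuffle, picks a handle type, and later pattern-matches on the
handle to choose a writer. Every name here (Handle, Dep, ToyShuffleManager)
is a hypothetical stand-in, and the two thresholds are assumptions chosen
for illustration; check the linked branch-1.6 file for the real conditions.

```scala
// Simplified model of a single shuffle manager choosing a write path.
// All names below are hypothetical stand-ins, not Spark's real classes.
sealed trait Handle
case object SerializedHandle extends Handle      // Tungsten-style path (the old "unsafe" manager)
case object BypassMergeSortHandle extends Handle // small shuffles without map-side combine
case object BaseHandle extends Handle            // regular sort-based path

final case class Dep(
  numPartitions: Int,
  hasAggregator: Boolean,
  mapSideCombine: Boolean,
  serializerSupportsRelocation: Boolean)

object ToyShuffleManager {
  // Assumed thresholds, for illustration only.
  val bypassMergeThreshold = 200
  val maxSerializedPartitions = 1 << 24

  // Decide once, at registration time, which path this shuffle can use.
  def registerShuffle(dep: Dep): Handle =
    if (!dep.mapSideCombine && dep.numPartitions <= bypassMergeThreshold) {
      BypassMergeSortHandle
    } else if (dep.serializerSupportsRelocation && !dep.hasAggregator &&
               dep.numPartitions <= maxSerializedPartitions) {
      SerializedHandle
    } else {
      BaseHandle
    }

  // The match/case step: the handle type selects the writer.
  def writerFor(handle: Handle): String = handle match {
    case SerializedHandle      => "unsafe (Tungsten) writer"
    case BypassMergeSortHandle => "bypass-merge-sort writer"
    case BaseHandle            => "regular sort writer"
  }
}
```

The point of the sketch is that the Tungsten path is selected per shuffle
when the shuffle qualifies, rather than by naming a separate manager.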


On Fri, Jan 8, 2016 at 1:14 PM, Ted Yu <yu...@gmail.com> wrote:

> For "spark.shuffle.manager", the default is "sort"
> From core/src/main/scala/org/apache/spark/SparkEnv.scala :
>
>     val shuffleMgrName = conf.get("spark.shuffle.manager", "sort")
>
> "tungsten-sort" is the same as "sort" :
>
>     val shortShuffleMgrNames = Map(
>       "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
>       "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager",
>       "tungsten-sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")
>
> FYI


-- 

*Chris Fregly*
Principal Data Solutions Engineer
IBM Spark Technology Center, San Francisco, CA
http://spark.tc | http://advancedspark.com

Re: Do we need to enabled Tungsten sort in Spark 1.6?

Posted by Ted Yu <yu...@gmail.com>.
For "spark.shuffle.manager", the default is "sort"
From core/src/main/scala/org/apache/spark/SparkEnv.scala :

    val shuffleMgrName = conf.get("spark.shuffle.manager", "sort")

"tungsten-sort" is the same as "sort" :

    val shortShuffleMgrNames = Map(
      "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
      "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager",
      "tungsten-sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")

FYI
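
A small sketch of how the lookup plays out, assuming a plain Map in place of
SparkConf. The mapping is copied from the snippet above; the resolve helper
and the object name are hypothetical, and the fallback that treats an
unknown name as a fully qualified class name is an assumption.

```scala
object ShuffleManagerNames {
  // Mapping quoted above from SparkEnv.scala.
  val shortShuffleMgrNames: Map[String, String] = Map(
    "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
    "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager",
    "tungsten-sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")

  // Hypothetical helper: `conf` is a plain Map standing in for SparkConf.
  def resolve(conf: Map[String, String]): String = {
    // An unset key falls back to "sort", as in the conf.get call above.
    val name = conf.getOrElse("spark.shuffle.manager", "sort")
    // Assumption: names not in the map are taken as class names.
    shortShuffleMgrNames.getOrElse(name.toLowerCase, name)
  }
}
```

With this mapping, leaving the setting unset and setting it to
"tungsten-sort" both resolve to the same SortShuffleManager class, so no
extra option is needed in 1.6.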

On Fri, Jan 8, 2016 at 12:59 PM, Umesh Kacha <um...@gmail.com> wrote:

> OK, thanks. So it will always be enabled by default? If so, why does the
> documentation list the default shuffle manager as sort?

Re: Do we need to enabled Tungsten sort in Spark 1.6?

Posted by Umesh Kacha <um...@gmail.com>.
OK, thanks. So it will always be enabled by default? If so, why does the
documentation list the default shuffle manager as sort?

On Sat, Jan 9, 2016 at 1:55 AM, Ted Yu <yu...@gmail.com> wrote:

> From sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala :
>
>     case Some((SQLConf.Deprecated.TUNGSTEN_ENABLED, Some(value))) =>
>       val runFunc = (sqlContext: SQLContext) => {
>         logWarning(
>           s"Property ${SQLConf.Deprecated.TUNGSTEN_ENABLED} is deprecated and " +
>             s"will be ignored. Tungsten will continue to be used.")
>         Seq(Row(SQLConf.Deprecated.TUNGSTEN_ENABLED, "true"))
>       }
>
> FYI

Re: Do we need to enabled Tungsten sort in Spark 1.6?

Posted by Ted Yu <yu...@gmail.com>.
From sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala :

    case Some((SQLConf.Deprecated.TUNGSTEN_ENABLED, Some(value))) =>
      val runFunc = (sqlContext: SQLContext) => {
        logWarning(
          s"Property ${SQLConf.Deprecated.TUNGSTEN_ENABLED} is deprecated and " +
            s"will be ignored. Tungsten will continue to be used.")
        Seq(Row(SQLConf.Deprecated.TUNGSTEN_ENABLED, "true"))
      }

FYI
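
A standalone sketch of the behavior that snippet describes, without any
Spark dependency: setting the deprecated property produces a warning and
reports "true" whatever value was passed. The object and method names are
hypothetical, and the key string "spark.sql.tungsten.enabled" is an
assumption about what SQLConf.Deprecated.TUNGSTEN_ENABLED holds.

```scala
object DeprecatedSetting {
  // Assumed key name for SQLConf.Deprecated.TUNGSTEN_ENABLED.
  val TungstenEnabled = "spark.sql.tungsten.enabled"

  // Returns the (key, value) pair reported back to the caller, plus an
  // optional warning message in place of logWarning.
  def set(key: String, value: String): ((String, String), Option[String]) =
    key match {
      case TungstenEnabled =>
        // The requested value is ignored; "true" is always reported.
        val warning = s"Property $key is deprecated and " +
          "will be ignored. Tungsten will continue to be used."
        ((key, "true"), Some(warning))
      case _ =>
        // Any other key passes through unchanged in this sketch.
        ((key, value), None)
    }
}
```

For example, DeprecatedSetting.set("spark.sql.tungsten.enabled", "false")
still reports "true", which matches the message that Tungsten will continue
to be used.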

On Fri, Jan 8, 2016 at 12:21 PM, unk1102 <um...@gmail.com> wrote:

> Hi, I was using Spark 1.5 with Tungsten sort, and now that I am using Spark
> 1.6 I don't see any difference; I was expecting Spark 1.6 to be faster.
> Anyway, do we need to enable the Tungsten and unsafe options, or are they
> enabled by default? I see in the documentation that the default shuffle
> manager is sort; I thought it was Tungsten, no? Please guide.