Posted to dev@spark.apache.org by Patrick Wendell <pw...@gmail.com> on 2014/11/17 10:42:47 UTC

[ANNOUNCE] Spark 1.2.0 Release Preview Posted

Hi All,

I've just posted a preview of the Spark 1.2.0 release for community
regression testing.

Issues reported now will get close attention, so please help us test!
You can help by running an existing Spark 1.X workload on this preview
and reporting any regressions. Once voting starts, the bar for a
reported issue to hold the release will keep rising, so test early!

The tag for this preview is v1.2.0-snapshot1 (commit 38c1fbd96).

The release files, including signatures and digests, can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-snapshot1

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1038/
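
To test the staged artifacts from a build, you can add the staging
repository as a resolver. A minimal sbt sketch (the version string
"1.2.0" is an assumption here; use whatever the staged poms declare):

    // build.sbt: point the build at the staging repo (sketch only)
    resolvers += "Apache Spark 1.2.0 staging" at
      "https://repository.apache.org/content/repositories/orgapachespark-1038/"
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"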

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-snapshot1-docs/

== Notes ==
- Maven artifacts are published for both Scala 2.10 and 2.11. Binary
distributions are not posted for Scala 2.11 yet, but will be posted
soon.

- There are two significant config default changes that users may want
to revert when doing A/B testing against older versions; a sketch of
reverting them follows the two entries below.

"spark.shuffle.manager" default has changed to "sort" (was "hash")
"spark.shuffle.blockTransferService" default has changed to "netty" (was "nio")

- This release contains a shuffle service for YARN. The jar is
present in all Hadoop 2.X binary packages at
"lib/spark-1.2.0-yarn-shuffle.jar"; a sketch of enabling it follows.

Cheers,
Patrick



Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

Posted by Nishkam Ravi <nr...@cloudera.com>.
Seeing issues with sort-based shuffle (OOM errors and a memory leak):
https://issues.apache.org/jira/browse/SPARK-4515.

Good performance gains for TeraSort compared to the hash-based shuffle
(as expected).

Thanks,
Nishkam


On Thu, Nov 20, 2014 at 11:20 AM, Matei Zaharia <ma...@gmail.com>
wrote:

> You can still send patches for docs until the release goes out -- please
> do if you see stuff.
>
> Matei
>
> > On Nov 20, 2014, at 6:39 AM, Madhu <ma...@madhu.com> wrote:
> >
> > Thanks Patrick.
> >
> > I've been testing some 1.2 features; they look good so far.
> > I have some example code that I think will be helpful for certain
> MR-style
> > use cases (secondary sort).
> > Can I still add that to the 1.2 documentation, or is that frozen at this
> > point?
> >
> >
> >
> > --
> > Madhu
> > https://www.linkedin.com/in/msiddalingaiah

Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

Posted by Matei Zaharia <ma...@gmail.com>.
You can still send patches for docs until the release goes out -- please do if you see stuff.

Matei

> On Nov 20, 2014, at 6:39 AM, Madhu <ma...@madhu.com> wrote:
> 
> Thanks Patrick.
> 
> I've been testing some 1.2 features; they look good so far.
> I have some example code that I think will be helpful for certain MR-style
> use cases (secondary sort).
> Can I still add that to the 1.2 documentation, or is that frozen at this
> point?
> 
> 
> 
> --
> Madhu
> https://www.linkedin.com/in/msiddalingaiah
> 


Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

Posted by Hector Yee <he...@gmail.com>.
I'm seeing a lot of lost tasks with this build on a large Mesos
cluster. It happens with both the hash and sort shuffle managers.

14/11/20 18:08:38 WARN TaskSetManager: Lost task 9.1 in stage 1.0 (TID 897,
i-d4d6553a.inst.aws.airbnb.com): FetchFailed(null, shuffleId=1, mapId=-1,
reduceId=9, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
location for shuffle 1
        at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:386)
        at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:383)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:382)
        at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:178)
        at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
        at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
        at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)


On Thu, Nov 20, 2014 at 7:42 AM, Nan Zhu <zh...@gmail.com> wrote:

> BTW, this PR https://github.com/apache/spark/pull/2524 is related to a
> blocker-level bug,
>
> and it is actually close to being merged (it has been reviewed for
> several rounds).
>
> I would appreciate it if anyone could continue the process.
>
> @mateiz
>
> --
> Nan Zhu
> http://codingcat.me
>
>
> On Thursday, November 20, 2014 at 10:17 AM, Corey Nolet wrote:
>
> > I was actually about to post this myself; I have a complex join that
> > could benefit from something like a GroupComparator vs having to do
> > multiple groupBy operations. This is probably the wrong thread for a
> > full discussion on this, but I didn't see a JIRA ticket for this or
> > anything similar. Any reasons why this would not make sense given
> > Spark's design?
> >
> > On Thu, Nov 20, 2014 at 9:39 AM, Madhu <madhu@madhu.com> wrote:
> >
> > > Thanks Patrick.
> > >
> > > I've been testing some 1.2 features; they look good so far.
> > > I have some example code that I think will be helpful for certain
> > > MR-style use cases (secondary sort).
> > > Can I still add that to the 1.2 documentation, or is that frozen at
> > > this point?
> > >
> > > --
> > > Madhu
> > > https://www.linkedin.com/in/msiddalingaiah


--
Yee Yang Li Hector <http://google.com/+HectorYee>

Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

Posted by Nan Zhu <zh...@gmail.com>.
BTW, this PR https://github.com/apache/spark/pull/2524 is related to a
blocker-level bug,

and it is actually close to being merged (it has been reviewed for
several rounds).

I would appreciate it if anyone could continue the process.

@mateiz 

-- 
Nan Zhu
http://codingcat.me


On Thursday, November 20, 2014 at 10:17 AM, Corey Nolet wrote:

> I was actually about to post this myself; I have a complex join that could
> benefit from something like a GroupComparator vs having to do multiple
> groupBy operations. This is probably the wrong thread for a full discussion
> on this, but I didn't see a JIRA ticket for this or anything similar. Any
> reasons why this would not make sense given Spark's design?
> 
> On Thu, Nov 20, 2014 at 9:39 AM, Madhu <madhu@madhu.com> wrote:
> 
> > Thanks Patrick.
> > 
> > I've been testing some 1.2 features; they look good so far.
> > I have some example code that I think will be helpful for certain MR-style
> > use cases (secondary sort).
> > Can I still add that to the 1.2 documentation, or is that frozen at this
> > point?
> > 
> > 
> > 
> > --
> > Madhu
> > https://www.linkedin.com/in/msiddalingaiah



Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

Posted by Corey Nolet <cj...@gmail.com>.
I was actually about to post this myself; I have a complex join that could
benefit from something like a GroupComparator vs having to do multiple
groupBy operations. This is probably the wrong thread for a full discussion
on this, but I didn't see a JIRA ticket for this or anything similar. Any
reasons why this would not make sense given Spark's design?
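
(For reference, a minimal sketch of one way to express MR-style
secondary sort using the repartitionAndSortWithinPartitions operator
added in 1.2; the partitioner, key types, and sample data here are
hypothetical, purely for illustration:)

    import org.apache.spark.{Partitioner, SparkConf, SparkContext}
    import org.apache.spark.SparkContext._ // 1.2-era RDD implicits

    // Partition by the primary key only, so all records for a primary
    // key land in the same partition.
    class PrimaryKeyPartitioner(override val numPartitions: Int)
        extends Partitioner {
      def getPartition(key: Any): Int = {
        val (primary, _) = key.asInstanceOf[(String, Int)]
        val h = primary.hashCode % numPartitions
        if (h < 0) h + numPartitions else h
      }
    }

    val sc = new SparkContext(
      new SparkConf().setAppName("secondary-sort-sketch").setMaster("local[2]"))

    // Key by (primary, secondary); the implicit tuple Ordering supplies
    // the secondary sort within each partition.
    val records = sc.parallelize(Seq(
      (("a", 3), "v1"), (("a", 1), "v2"), (("b", 2), "v3")))
    val sorted = records.repartitionAndSortWithinPartitions(
      new PrimaryKeyPartitioner(2))
    // Each partition now yields records grouped by primary key and
    // sorted by secondary key, without a full groupBy.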

On Thu, Nov 20, 2014 at 9:39 AM, Madhu <ma...@madhu.com> wrote:

> Thanks Patrick.
>
> I've been testing some 1.2 features; they look good so far.
> I have some example code that I think will be helpful for certain MR-style
> use cases (secondary sort).
> Can I still add that to the 1.2 documentation, or is that frozen at this
> point?
>
>
>
> --
> Madhu
> https://www.linkedin.com/in/msiddalingaiah

Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

Posted by Madhu <ma...@madhu.com>.
Thanks Patrick.

I've been testing some 1.2 features; they look good so far.
I have some example code that I think will be helpful for certain MR-style
use cases (secondary sort).
Can I still add that to the 1.2 documentation, or is that frozen at this
point?



--
Madhu
https://www.linkedin.com/in/msiddalingaiah
