Posted to dev@spark.apache.org by "Shixiong(Ryan) Zhu" <sh...@databricks.com> on 2016/10/01 03:47:11 UTC

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

Hey Mark,

I can reproduce the failure locally using your command. There were a lot of
OutOfMemoryErrors in the unit test log. I increased the heap size from 3g to
4g at https://github.com/apache/spark/blob/v2.0.1-rc4/pom.xml#L2029 and the
tests passed. I think the patch you mentioned increased the memory usage of
BlockManagerSuite and made the tests prone to OOM. It could be fixed by
mocking SparkContext (or that may not be necessary, since Jenkins's Maven
and sbt builds are green now).
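
For reference, the line in question sets the JVM options for the test JVM
that the scalatest-maven-plugin forks; the change is only the -Xmx value
(quoting from memory -- the other flags on that line are unchanged and
omitted here):

    <argLine>-Xmx4g ...</argLine>  <!-- was -Xmx3g; remaining flags unchanged -->

If you want to iterate on just the affected suite while experimenting,
something like

    build/mvn -pl core -Dtest=none -DwildcardSuites=org.apache.spark.storage.BlockManagerSuite test

should re-run only BlockManagerSuite.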

However, since this is only a test issue, it should not be a blocker.


On Fri, Sep 30, 2016 at 8:34 AM, Mark Hamstra <ma...@clearstorydata.com>
wrote:

> 0
>
> RC4 is causing a build regression for me on at least one of my machines.
> RC3 built and ran tests successfully, but the tests consistently fail with
> RC4 unless I revert 9e91a1009e6f916245b4d4018de1664ea3decfe7,
> "[SPARK-15703][SCHEDULER][CORE][WEBUI] Make ListenerBus event queue size
> configurable (branch 2.0)".  This is using build/mvn -U -Pyarn -Phadoop-2.7
> -Pkinesis-asl -Phive -Phive-thriftserver -Dpyspark -Dsparkr -DskipTests
> clean package; build/mvn -U -Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive
> -Phive-thriftserver -Dpyspark -Dsparkr test.  Environment is macOS 10.12,
> Java 1.8.0_102.
>
> There are no tests that go red.  Rather, the core tests just end after...
>
> ...
> BlockManagerSuite:
> ...
> - overly large block
> - block compression
> - block store put failure
>
> ...with only the generic "[ERROR] Failed to execute goal
> org.scalatest:scalatest-maven-plugin:1.0:test (test) on project
> spark-core_2.11: There are test failures".
>
> I'll try some other environments today to see whether I can turn this 0
> into either a -1 or +1, but right now that commit is looking deeply
> suspicious to me.
>
> On Wed, Sep 28, 2016 at 7:14 PM, Reynold Xin <rx...@databricks.com> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.0.1. The vote is open until Sat, Oct 1, 2016 at 20:00 PDT and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 2.0.1
>> [ ] -1 Do not release this package because ...
>>
>>
>> The tag to be voted on is v2.0.1-rc4 (933d2c1ea4e5f5c4ec8d375b5ccaa4577ba4be38)
>>
>> This release candidate resolves 301 issues:
>> https://s.apache.org/spark-2.0.1-jira
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.1-rc4-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1203/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.1-rc4-docs/
>>
>>
>> Q: How can I help test this release?
>> A: If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions from 2.0.0.
>>
>> Q: What justifies a -1 vote for this release?
>> A: This is a maintenance release in the 2.0.x series.  Bugs already
>> present in 2.0.0, missing features, or bugs related to new features will
>> not necessarily block this release.
>>
>> Q: What fix version should I use for patches merging into branch-2.0 from
>> now on?
>> A: Please mark the fix version as 2.0.2, rather than 2.0.1. If a new RC
>> (i.e. RC5) is cut, I will change the fix version of those patches to 2.0.1.
>>
>>
>>
>

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

Posted by Mark Hamstra <ma...@clearstorydata.com>.
Thanks for doing the investigation.  What I found out yesterday is that my
other macOS 10.12 machine ran into the same issue, while various Linux
machines did not, so there may well be an OS-specific component to this
particular OOM-in-tests problem.  Unfortunately, increasing the heap as you
suggest doesn't resolve the issue for me -- even if I increase it all the
way to 6g.  This does appear to be environment-specific (and not an
environment that I would expect to see in Spark deployments), so I agree
that this is not a blocker.

I looked a bit into the other annoying issue that I've been seeing for a
while now with the shell terminating when YarnClusterSuite is run on an
Ubuntu 16.04 box.  Both Sean Owen and I have run into this problem when
running the tests over an ssh connection, and we each assumed that it was
an ssh-specific problem.  Yesterday, though, I spent some time logged
directly into both a normal graphical session and a console session, and I
am seeing similar problems there.  Running the tests from the graphical
session actually ends up failing and kicking me all the way out to the
login screen when YarnClusterSuite is run, while doing the same from the
console ends up terminating the shell.  All very strange, and I don't have
much of a clue what is going on yet, but it also seems to be quite specific
to this environment, so I wouldn't consider this issue to be a blocker,
either.
