Posted to dev@spark.apache.org by Reynold Xin <rx...@databricks.com> on 2016/06/22 01:26:31 UTC

[VOTE] Release Apache Spark 2.0.0 (RC1)

Please vote on releasing the following candidate as Apache Spark version
2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 2.0.0
[ ] -1 Do not release this package because ...


The tag to be voted on is v2.0.0-rc1
(0c66ca41afade6db73c9aeddd5aed6e5dcea90df).

This release candidate resolves ~2400 issues:
https://s.apache.org/spark-2.0.0-rc1-jira

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1187/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/


=======================================
== How can I help test this release? ==
=======================================
If you are a Spark user, you can help us test this release by taking an
existing Spark workload, running it on this release candidate, and
reporting any regressions from 1.x.
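
For example, a minimal smoke test against the RC could look like the
following (a sketch only; the input path, column name, and app name are
hypothetical):

from pyspark.sql import SparkSession

# Launch against the extracted 2.0.0-rc1 binaries (e.g. via its bin/spark-submit).
spark = SparkSession.builder.appName("rc1-smoke-test").getOrCreate()

# Re-run an existing workload and compare the output with a 1.x run.
df = spark.read.json("/data/events.json")  # hypothetical input
counts = df.groupBy("status").count().collect()
print(sorted((r["status"], r["count"]) for r in counts))

spark.stop()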

================================================
== What justifies a -1 vote for this release? ==
================================================
Critical bugs impacting major functionality.

Bugs already present in 1.x, missing features, or bugs related to new
features will not necessarily block this release. Note that, historically,
Spark documentation has been published on the website separately from the
main release, so we do not need to block the release due to documentation
errors either.

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Nick Pentreath <ni...@gmail.com>.
Hey everyone,

Is there an updated timeline for cutting the next RC? Do we have a
clear picture of outstanding issues? I see 21 issues marked Blocker or
Critical targeted at 2.0.0.

The only blockers I see on JIRA are related to MLlib doc updates etc. (I
will go through a few of these to clean them up and see where they stand).
If there are other blockers, we should mark them as such to help track
progress.



Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Nick Pentreath <ni...@gmail.com>.
I take it there will be another RC, given the blockers and the fact that
there were no +1 votes anyway.

FWIW, I cannot run python tests using "./python/run-tests".

I'd be -1 for this reason (see https://github.com/apache/spark/pull/13737 /
http://issues.apache.org/jira/browse/SPARK-15954) - does anyone else
encounter this?

./python/run-tests --python-executables=python2.7
Running PySpark tests. Output is in
/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/unit-tests.log
Will test against the following Python executables: ['python2.7']
Will test the following Python modules: ['pyspark-core', 'pyspark-ml',
'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
....Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
======================================================================
ERROR: setUpClass (pyspark.sql.tests.HiveContextSQLTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/pyspark/sql/tests.py",
line 1620, in setUpClass
    cls.spark = HiveContext._createForTesting(cls.sc)
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/pyspark/sql/context.py",
line 490, in _createForTesting
    jtestHive =
sparkContext._jvm.org.apache.spark.sql.hive.test.TestHiveContext(jsc)
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py",
line 1183, in __call__
    answer, self._gateway_client, None, self._fqn)
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py",
line 312, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling
None.org.apache.spark.sql.hive.test.TestHiveContext.
: java.lang.NullPointerException
at
org.apache.spark.sql.hive.test.TestHiveSparkSession.getHiveFile(TestHive.scala:183)
at
org.apache.spark.sql.hive.test.TestHiveSparkSession.<init>(TestHive.scala:214)
at
org.apache.spark.sql.hive.test.TestHiveSparkSession.<init>(TestHive.scala:122)
at org.apache.spark.sql.hive.test.TestHiveContext.<init>(TestHive.scala:77)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:211)
at java.lang.Thread.run(Thread.java:745)


======================================================================
ERROR: setUpClass (pyspark.sql.tests.SQLTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/pyspark/sql/tests.py",
line 189, in setUpClass
    ReusedPySparkTestCase.setUpClass()
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/pyspark/tests.py",
line 344, in setUpClass
    cls.sc = SparkContext('local[4]', cls.__name__)
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/pyspark/context.py",
line 112, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File
"/Users/nick/workspace/scala/spark-rcs/spark-2.0.0/python/pyspark/context.py",
line 261, in _ensure_initialized
    callsite.function, callsite.file, callsite.linenum))
ValueError: Cannot run multiple SparkContexts at once; existing
SparkContext(app=ReusedPySparkTestCase, master=local[4]) created by
<module> at /Users/nick/miniconda2/lib/python2.7/runpy.py:72

----------------------------------------------------------------------
Ran 4 tests in 4.800s

FAILED (errors=2)

Had test failures in pyspark.sql.tests with python2.7; see logs.
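
Looking at the two errors together, the second looks like fallout from
the first: the failed HiveContextSQLTests setup appears to leave its
SparkContext running, so the next suite cannot create one. A minimal
sketch of that cascade (illustrative only, not the actual test code):

from pyspark import SparkContext

sc = SparkContext('local[4]', 'first')  # a context left behind by a failed setUpClass
try:
    SparkContext('local[4]', 'second')  # raises ValueError: cannot run multiple SparkContexts
except ValueError as e:
    print(e)
finally:
    sc.stop()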



Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Egor Pahomov <pa...@gmail.com>.
-1 : SPARK-16228 [SQL] - "Percentile" needs an explicit cast to double;
otherwise it throws an error. I cannot move my existing 100500 queries to
2.0 transparently.
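
To illustrate the kind of rewrite this forces on existing queries (the
table and column names here are hypothetical):

# Throws on 2.0.0-rc1 as reported in SPARK-16228:
spark.sql("SELECT percentile(latency, 0.5) FROM events").show()

# Works once the argument is cast explicitly:
spark.sql("SELECT percentile(CAST(latency AS DOUBLE), 0.5) FROM events").show()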


-- 

Sincerely yours,
Egor Pakhomov

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Matt Cheah <mc...@palantir.com>.
-1 because of SPARK-16181, which is a correctness regression from 1.6. Looks like the patch is ready, though: https://github.com/apache/spark/pull/13884 – it would be ideal for this patch to make it into the release.

-Matt Cheah

From: Nick Pentreath <ni...@gmail.com>
Date: Friday, June 24, 2016 at 4:37 AM
To: "dev@spark.apache.org" <de...@spark.apache.org>
Subject: Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

I'm getting the following when trying to run ./dev/run-tests (not happening on master) from the extracted source tar. Anyone else seeing this?

error: Could not access 'fc0a1475ef'
**********************************************************************
File "./dev/run-tests.py", line 69, in __main__.identify_changed_files_from_git_commits
Failed example:
    [x.name for x in determine_modules_for_files(             identify_changed_files_from_git_commits("fc0a1475ef", target_ref="5da21f07"))]
Exception raised:
    Traceback (most recent call last):
      File "/Users/nick/miniconda2/lib/python2.7/doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.identify_changed_files_from_git_commits[0]>", line 1, in <module>
        [x.name for x in determine_modules_for_files(             identify_changed_files_from_git_commits("fc0a1475ef", target_ref="5da21f07"))]
      File "./dev/run-tests.py", line 86, in identify_changed_files_from_git_commits
        universal_newlines=True)
      File "/Users/nick/miniconda2/lib/python2.7/subprocess.py", line 573, in check_output
        raise CalledProcessError(retcode, cmd, output=output)
    CalledProcessError: Command '['git', 'diff', '--name-only', 'fc0a1475ef', '5da21f07']' returned non-zero exit status 1
error: Could not access '50a0496a43'
**********************************************************************
File "./dev/run-tests.py", line 71, in __main__.identify_changed_files_from_git_commits
Failed example:
    'root' in [x.name for x in determine_modules_for_files(          identify_changed_files_from_git_commits("50a0496a43", target_ref="6765ef9"))]
Exception raised:
    Traceback (most recent call last):
      File "/Users/nick/miniconda2/lib/python2.7/doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.identify_changed_files_from_git_commits[1]>", line 1, in <module>
        'root' in [x.name for x in determine_modules_for_files(          identify_changed_files_from_git_commits("50a0496a43", target_ref="6765ef9"))]
      File "./dev/run-tests.py", line 86, in identify_changed_files_from_git_commits
        universal_newlines=True)
      File "/Users/nick/miniconda2/lib/python2.7/subprocess.py", line 573, in check_output
        raise CalledProcessError(retcode, cmd, output=output)
    CalledProcessError: Command '['git', 'diff', '--name-only', '50a0496a43', '6765ef9']' returned non-zero exit status 1
**********************************************************************
1 items had failures:
   2 of   2 in __main__.identify_changed_files_from_git_commits
***Test Failed*** 2 failures.
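
The failing doctests shell out to git, and an extracted source tar has no
repository containing those commits, so the command fails. A simplified
sketch of the call from the traceback (running it outside a full Spark
git clone reproduces the error):

import subprocess

# Roughly what identify_changed_files_from_git_commits runs; it raises
# CalledProcessError when the commits are absent from the local repo.
subprocess.check_output(
    ['git', 'diff', '--name-only', 'fc0a1475ef', '5da21f07'],
    universal_newlines=True)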



On Fri, 24 Jun 2016 at 06:59 Yin Huai <yh...@databricks.com> wrote:
-1 because of https://issues.apache.org/jira/browse/SPARK-16121.

This JIRA was resolved after 2.0.0-RC1 was cut. Without the fix, Spark SQL effectively uses only the driver to list files when loading datasets, and driver-side file listing is very slow for datasets with many files and partitions. Since this bug causes a serious performance regression, I am giving -1.
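
As a hypothetical illustration (path and layout made up), a read like the
following spends its time in driver-side file listing on rc1 when the
dataset has many thousands of files and partition directories, before any
job runs; the fix parallelizes the listing:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("listing-demo").getOrCreate()
# Heavily partitioned, hypothetical dataset: the slowdown shows up during
# the initial file listing, not in the job itself.
df = spark.read.parquet("/warehouse/events")
df.limit(1).collect()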

On Thu, Jun 23, 2016 at 1:25 AM, Pete Robbins <ro...@gmail.com> wrote:
I'm also seeing some of these same failures:

- spilling with compression *** FAILED ***
I have seen this occasionally

- to UTC timestamp *** FAILED ***
This was fixed yesterday in branch-2.0 (https://issues.apache.org/jira/browse/SPARK-16078)

- offset recovery *** FAILED ***
Haven't seen this for a while and thought the flaky test was fixed but it popped up again in one of our builds.

StateStoreSuite:
- maintenance *** FAILED ***
Just seen this has been failing for the last 2 days on one build machine (linux amd64)


On 23 June 2016 at 08:51, Sean Owen <so...@cloudera.com> wrote:
First pass of feedback on the RC: all the sigs, hashes, etc are fine.
Licensing is up to date to the best of my knowledge.

I'm hitting test failures, some of which may be spurious. Just putting
them out there to see if they ring bells. This is Java 8 on Ubuntu 16.


- spilling with compression *** FAILED ***
  java.lang.Exception: Test failed with compression using codec
org.apache.spark.io.SnappyCompressionCodec:
assertion failed: expected cogroup to spill, but did not
  at scala.Predef$.assert(Predef.scala:170)
  at org.apache.spark.TestUtils$.assertSpilled(TestUtils.scala:170)
  at org.apache.spark.util.collection.ExternalAppendOnlyMapSuite.org$apache$spark$util$collection$ExternalAppendOnlyMapSuite$$testSimpleSpilling(ExternalAppendOnlyMapSuite.scala:263)
...

I feel like I've seen this before, and see some possibly relevant
fixes, but they're in 2.0.0 already:
https://github.com/apache/spark/pull/10990
Is this something where a native library needs to be installed or something?


- to UTC timestamp *** FAILED ***
  "2016-03-13 [02]:00:00.0" did not equal "2016-03-13 [10]:00:00.0"
(DateTimeUtilsSuite.scala:506)

I know, we talked about this for the 1.6.2 RC, but I reproduced this
locally too. I will investigate, could still be spurious.


StateStoreSuite:
- maintenance *** FAILED ***
  The code passed to eventually never returned normally. Attempted 627
times over 10.000180116 seconds. Last failure message:
StateStoreSuite.this.fileExists(provider, 1L, false) was true earliest
file not deleted. (StateStoreSuite.scala:395)

No idea.


- offset recovery *** FAILED ***
  The code passed to eventually never returned normally. Attempted 197
times over 10.040864806 seconds. Last failure message:
strings.forall({
    ((x$1: Any) => DirectKafkaStreamSuite.collectedData.contains(x$1))
  }) was false. (DirectKafkaStreamSuite.scala:250)

Also something that was possibly fixed already for 2.0.0 and that I
just back-ported into 1.6. Could be just a very similar failure.

On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.0.0
> [ ] -1 Do not release this package because ...
>
>
> The tag to be voted on is v2.0.0-rc1
> (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>
> This release candidate resolves ~2400 issues:
> https://s.apache.org/spark-2.0.0-rc1-jira
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1187/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>
>
> =======================================
> == How can I help test this release? ==
> =======================================
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions from 1.x.
>
> ================================================
> == What justifies a -1 vote for this release? ==
> ================================================
> Critical bugs impacting major functionalities.
>
> Bugs already present in 1.x, missing features, or bugs related to new
> features will not necessarily block this release. Note that historically
> Spark documentation has been published on the website separately from the
> main release so we do not need to block the release due to documentation
> errors either.
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org




Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Nick Pentreath <ni...@gmail.com>.
I'm getting the following when trying to run ./dev/run-tests (not happening
on master) from the extracted source tar. Anyone else seeing this?

error: Could not access 'fc0a1475ef'
**********************************************************************
File "./dev/run-tests.py", line 69, in
__main__.identify_changed_files_from_git_commits
Failed example:
    [x.name for x in determine_modules_for_files(
identify_changed_files_from_git_commits("fc0a1475ef",
target_ref="5da21f07"))]
Exception raised:
    Traceback (most recent call last):
      File "/Users/nick/miniconda2/lib/python2.7/doctest.py", line 1315, in
__run
        compileflags, 1) in test.globs
      File "<doctest __main__.identify_changed_files_from_git_commits[0]>",
line 1, in <module>
        [x.name for x in determine_modules_for_files(
identify_changed_files_from_git_commits("fc0a1475ef",
target_ref="5da21f07"))]
      File "./dev/run-tests.py", line 86, in
identify_changed_files_from_git_commits
        universal_newlines=True)
      File "/Users/nick/miniconda2/lib/python2.7/subprocess.py", line 573,
in check_output
        raise CalledProcessError(retcode, cmd, output=output)
    CalledProcessError: Command '['git', 'diff', '--name-only',
'fc0a1475ef', '5da21f07']' returned non-zero exit status 1
error: Could not access '50a0496a43'
**********************************************************************
File "./dev/run-tests.py", line 71, in
__main__.identify_changed_files_from_git_commits
Failed example:
    'root' in [x.name for x in determine_modules_for_files(
 identify_changed_files_from_git_commits("50a0496a43",
target_ref="6765ef9"))]
Exception raised:
    Traceback (most recent call last):
      File "/Users/nick/miniconda2/lib/python2.7/doctest.py", line 1315, in
__run
        compileflags, 1) in test.globs
      File "<doctest __main__.identify_changed_files_from_git_commits[1]>",
line 1, in <module>
        'root' in [x.name for x in determine_modules_for_files(
 identify_changed_files_from_git_commits("50a0496a43",
target_ref="6765ef9"))]
      File "./dev/run-tests.py", line 86, in
identify_changed_files_from_git_commits
        universal_newlines=True)
      File "/Users/nick/miniconda2/lib/python2.7/subprocess.py", line 573,
in check_output
        raise CalledProcessError(retcode, cmd, output=output)
    CalledProcessError: Command '['git', 'diff', '--name-only',
'50a0496a43', '6765ef9']' returned non-zero exit status 1
**********************************************************************
1 items had failures:
   2 of   2 in __main__.identify_changed_files_from_git_commits
***Test Failed*** 2 failures.
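
For what it's worth, those doctests shell out to git, so they can only pass
inside a full git checkout that actually contains the referenced commits; an
extracted source tarball has no .git directory, which is why git exits
non-zero and check_output raises. A minimal sketch of the failing step,
transliterated to Scala (scala.sys.process is standard library; the hashes
are the ones from the output above):

    import scala.sys.process._

    // Outside a real checkout (or when the commits are absent), git prints
    // "error: Could not access ..." and exits non-zero, exactly as above.
    val exitCode = Seq("git", "diff", "--name-only", "fc0a1475ef", "5da21f07").!
    println(s"git exited with $exitCode")  // 0 only in a full git clone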



On Fri, 24 Jun 2016 at 06:59 Yin Huai <yh...@databricks.com> wrote:

> -1 because of https://issues.apache.org/jira/browse/SPARK-16121.
>
> This jira was resolved after 2.0.0-RC1 was cut. Without the fix, Spark
> SQL effectively only uses the driver to list files when loading datasets
> and the driver-side file listing is very slow for datasets having many
> files and partitions. Since this bug causes a serious performance
> regression, I am giving -1.
>
> On Thu, Jun 23, 2016 at 1:25 AM, Pete Robbins <ro...@gmail.com> wrote:
>
>> I'm also seeing some of these same failures:
>>
>> - spilling with compression *** FAILED ***
>> I have seen this occasionally
>>
>> - to UTC timestamp *** FAILED ***
>> This was fixed yesterday in branch-2.0 (
>> https://issues.apache.org/jira/browse/SPARK-16078)
>>
>> - offset recovery *** FAILED ***
>> Haven't seen this for a while and thought the flaky test was fixed but it
>> popped up again in one of our builds.
>>
>> StateStoreSuite:
>> - maintenance *** FAILED ***
>> Just seen this has been failing for the last 2 days on one build machine
>> (linux amd64)
>>
>>
>> On 23 June 2016 at 08:51, Sean Owen <so...@cloudera.com> wrote:
>>
>>> First pass of feedback on the RC: all the sigs, hashes, etc are fine.
>>> Licensing is up to date to the best of my knowledge.
>>>
>>> I'm hitting test failures, some of which may be spurious. Just putting
>>> them out there to see if they ring bells. This is Java 8 on Ubuntu 16.
>>>
>>>
>>> - spilling with compression *** FAILED ***
>>>   java.lang.Exception: Test failed with compression using codec
>>> org.apache.spark.io.SnappyCompressionCodec:
>>> assertion failed: expected cogroup to spill, but did not
>>>   at scala.Predef$.assert(Predef.scala:170)
>>>   at org.apache.spark.TestUtils$.assertSpilled(TestUtils.scala:170)
>>>   at org.apache.spark.util.collection.ExternalAppendOnlyMapSuite.org
>>> $apache$spark$util$collection$ExternalAppendOnlyMapSuite$$testSimpleSpilling(ExternalAppendOnlyMapSuite.scala:263)
>>> ...
>>>
>>> I feel like I've seen this before, and see some possibly relevant
>>> fixes, but they're in 2.0.0 already:
>>> https://github.com/apache/spark/pull/10990
>>> Is this something where a native library needs to be installed or
>>> something?
>>>
>>>
>>> - to UTC timestamp *** FAILED ***
>>>   "2016-03-13 [02]:00:00.0" did not equal "2016-03-13 [10]:00:00.0"
>>> (DateTimeUtilsSuite.scala:506)
>>>
>>> I know, we talked about this for the 1.6.2 RC, but I reproduced this
>>> locally too. I will investigate, could still be spurious.
>>>
>>>
>>> StateStoreSuite:
>>> - maintenance *** FAILED ***
>>>   The code passed to eventually never returned normally. Attempted 627
>>> times over 10.000180116 seconds. Last failure message:
>>> StateStoreSuite.this.fileExists(provider, 1L, false) was true earliest
>>> file not deleted. (StateStoreSuite.scala:395)
>>>
>>> No idea.
>>>
>>>
>>> - offset recovery *** FAILED ***
>>>   The code passed to eventually never returned normally. Attempted 197
>>> times over 10.040864806 seconds. Last failure message:
>>> strings.forall({
>>>     ((x$1: Any) => DirectKafkaStreamSuite.collectedData.contains(x$1))
>>>   }) was false. (DirectKafkaStreamSuite.scala:250)
>>>
>>> Also something that was possibly fixed already for 2.0.0 and that I
>>> just back-ported into 1.6. Could be just a very similar failure.
>>>
>>> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com>
>>> wrote:
>>> > Please vote on releasing the following candidate as Apache Spark
>>> version
>>> > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and
>>> passes
>>> > if a majority of at least 3+1 PMC votes are cast.
>>> >
>>> > [ ] +1 Release this package as Apache Spark 2.0.0
>>> > [ ] -1 Do not release this package because ...
>>> >
>>> >
>>> > The tag to be voted on is v2.0.0-rc1
>>> > (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>>> >
>>> > This release candidate resolves ~2400 issues:
>>> > https://s.apache.org/spark-2.0.0-rc1-jira
>>> >
>>> > The release files, including signatures, digests, etc. can be found at:
>>> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>>> >
>>> > Release artifacts are signed with the following key:
>>> > https://people.apache.org/keys/committer/pwendell.asc
>>> >
>>> > The staging repository for this release can be found at:
>>> >
>>> https://repository.apache.org/content/repositories/orgapachespark-1187/
>>> >
>>> > The documentation corresponding to this release can be found at:
>>> >
>>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>>> >
>>> >
>>> > =======================================
>>> > == How can I help test this release? ==
>>> > =======================================
>>> > If you are a Spark user, you can help us test this release by taking an
>>> > existing Spark workload and running on this release candidate, then
>>> > reporting any regressions from 1.x.
>>> >
>>> > ================================================
>>> > == What justifies a -1 vote for this release? ==
>>> > ================================================
>>> > Critical bugs impacting major functionalities.
>>> >
>>> > Bugs already present in 1.x, missing features, or bugs related to new
>>> > features will not necessarily block this release. Note that
>>> historically
>>> > Spark documentation has been published on the website separately from
>>> the
>>> > main release so we do not need to block the release due to
>>> documentation
>>> > errors either.
>>> >
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>
>>>
>>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Yin Huai <yh...@databricks.com>.
-1 because of https://issues.apache.org/jira/browse/SPARK-16121.

This jira was resolved after 2.0.0-RC1 was cut. Without the fix, Spark
SQL effectively only uses the driver to list files when loading datasets
and the driver-side file listing is very slow for datasets having many
files and partitions. Since this bug causes a serious performance
regression, I am giving -1.
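
To make that concrete: the idea behind the fix is to fan the listing out to
the cluster instead of walking every partition directory serially on the
driver. A rough sketch of the pattern only -- the helper below is a plain
local-filesystem walk, not Spark's actual internal API, and sc is assumed to
be a live SparkContext (e.g. in spark-shell):

    import java.io.File

    // Recursively collect leaf files under a directory (illustrative only).
    def listLeafFiles(dir: String): Seq[String] = {
      val children = Option(new File(dir).listFiles).map(_.toSeq).getOrElse(Seq.empty)
      children.flatMap(c => if (c.isDirectory) listLeafFiles(c.getPath) else Seq(c.getPath))
    }

    val rootPaths = Seq("/data/table/part=1", "/data/table/part=2")  // illustrative
    val allFiles = sc
      .parallelize(rootPaths, rootPaths.length)  // one listing task per root
      .flatMap(listLeafFiles)                    // listing runs on executors, in parallel
      .collect()                                 // gather just the paths on the driver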

On Thu, Jun 23, 2016 at 1:25 AM, Pete Robbins <ro...@gmail.com> wrote:

> I'm also seeing some of these same failures:
>
> - spilling with compression *** FAILED ***
> I have seen this occasionally
>
> - to UTC timestamp *** FAILED ***
> This was fixed yesterday in branch-2.0 (
> https://issues.apache.org/jira/browse/SPARK-16078)
>
> - offset recovery *** FAILED ***
> Haven't seen this for a while and thought the flaky test was fixed but it
> popped up again in one of our builds.
>
> StateStoreSuite:
> - maintenance *** FAILED ***
> Just seen this has been failing for the last 2 days on one build machine
> (linux amd64)
>
>
> On 23 June 2016 at 08:51, Sean Owen <so...@cloudera.com> wrote:
>
>> First pass of feedback on the RC: all the sigs, hashes, etc are fine.
>> Licensing is up to date to the best of my knowledge.
>>
>> I'm hitting test failures, some of which may be spurious. Just putting
>> them out there to see if they ring bells. This is Java 8 on Ubuntu 16.
>>
>>
>> - spilling with compression *** FAILED ***
>>   java.lang.Exception: Test failed with compression using codec
>> org.apache.spark.io.SnappyCompressionCodec:
>> assertion failed: expected cogroup to spill, but did not
>>   at scala.Predef$.assert(Predef.scala:170)
>>   at org.apache.spark.TestUtils$.assertSpilled(TestUtils.scala:170)
>>   at org.apache.spark.util.collection.ExternalAppendOnlyMapSuite.org
>> $apache$spark$util$collection$ExternalAppendOnlyMapSuite$$testSimpleSpilling(ExternalAppendOnlyMapSuite.scala:263)
>> ...
>>
>> I feel like I've seen this before, and see some possibly relevant
>> fixes, but they're in 2.0.0 already:
>> https://github.com/apache/spark/pull/10990
>> Is this something where a native library needs to be installed or
>> something?
>>
>>
>> - to UTC timestamp *** FAILED ***
>>   "2016-03-13 [02]:00:00.0" did not equal "2016-03-13 [10]:00:00.0"
>> (DateTimeUtilsSuite.scala:506)
>>
>> I know, we talked about this for the 1.6.2 RC, but I reproduced this
>> locally too. I will investigate, could still be spurious.
>>
>>
>> StateStoreSuite:
>> - maintenance *** FAILED ***
>>   The code passed to eventually never returned normally. Attempted 627
>> times over 10.000180116 seconds. Last failure message:
>> StateStoreSuite.this.fileExists(provider, 1L, false) was true earliest
>> file not deleted. (StateStoreSuite.scala:395)
>>
>> No idea.
>>
>>
>> - offset recovery *** FAILED ***
>>   The code passed to eventually never returned normally. Attempted 197
>> times over 10.040864806 seconds. Last failure message:
>> strings.forall({
>>     ((x$1: Any) => DirectKafkaStreamSuite.collectedData.contains(x$1))
>>   }) was false. (DirectKafkaStreamSuite.scala:250)
>>
>> Also something that was possibly fixed already for 2.0.0 and that I
>> just back-ported into 1.6. Could be just a very similar failure.
>>
>> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
>> > Please vote on releasing the following candidate as Apache Spark version
>> > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and
>> passes
>> > if a majority of at least 3+1 PMC votes are cast.
>> >
>> > [ ] +1 Release this package as Apache Spark 2.0.0
>> > [ ] -1 Do not release this package because ...
>> >
>> >
>> > The tag to be voted on is v2.0.0-rc1
>> > (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>> >
>> > This release candidate resolves ~2400 issues:
>> > https://s.apache.org/spark-2.0.0-rc1-jira
>> >
>> > The release files, including signatures, digests, etc. can be found at:
>> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>> >
>> > Release artifacts are signed with the following key:
>> > https://people.apache.org/keys/committer/pwendell.asc
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapachespark-1187/
>> >
>> > The documentation corresponding to this release can be found at:
>> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>> >
>> >
>> > =======================================
>> > == How can I help test this release? ==
>> > =======================================
>> > If you are a Spark user, you can help us test this release by taking an
>> > existing Spark workload and running on this release candidate, then
>> > reporting any regressions from 1.x.
>> >
>> > ================================================
>> > == What justifies a -1 vote for this release? ==
>> > ================================================
>> > Critical bugs impacting major functionalities.
>> >
>> > Bugs already present in 1.x, missing features, or bugs related to new
>> > features will not necessarily block this release. Note that historically
>> > Spark documentation has been published on the website separately from
>> the
>> > main release so we do not need to block the release due to documentation
>> > errors either.
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>
>>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Pete Robbins <ro...@gmail.com>.
I'm also seeing some of these same failures:

- spilling with compression *** FAILED ***
I have seen this occasionally

- to UTC timestamp *** FAILED ***
This was fixed yesterday in branch-2.0 (
https://issues.apache.org/jira/browse/SPARK-16078)

- offset recovery *** FAILED ***
Haven't seen this for a while and thought the flaky test was fixed but it
popped up again in one of our builds.

StateStoreSuite:
- maintenance *** FAILED ***
Just seen this has been failing for the last 2 days on one build machine (linux
amd64)

On 23 June 2016 at 08:51, Sean Owen <so...@cloudera.com> wrote:

> First pass of feedback on the RC: all the sigs, hashes, etc are fine.
> Licensing is up to date to the best of my knowledge.
>
> I'm hitting test failures, some of which may be spurious. Just putting
> them out there to see if they ring bells. This is Java 8 on Ubuntu 16.
>
>
> - spilling with compression *** FAILED ***
>   java.lang.Exception: Test failed with compression using codec
> org.apache.spark.io.SnappyCompressionCodec:
> assertion failed: expected cogroup to spill, but did not
>   at scala.Predef$.assert(Predef.scala:170)
>   at org.apache.spark.TestUtils$.assertSpilled(TestUtils.scala:170)
>   at org.apache.spark.util.collection.ExternalAppendOnlyMapSuite.org
> $apache$spark$util$collection$ExternalAppendOnlyMapSuite$$testSimpleSpilling(ExternalAppendOnlyMapSuite.scala:263)
> ...
>
> I feel like I've seen this before, and see some possibly relevant
> fixes, but they're in 2.0.0 already:
> https://github.com/apache/spark/pull/10990
> Is this something where a native library needs to be installed or
> something?
>
>
> - to UTC timestamp *** FAILED ***
>   "2016-03-13 [02]:00:00.0" did not equal "2016-03-13 [10]:00:00.0"
> (DateTimeUtilsSuite.scala:506)
>
> I know, we talked about this for the 1.6.2 RC, but I reproduced this
> locally too. I will investigate, could still be spurious.
>
>
> StateStoreSuite:
> - maintenance *** FAILED ***
>   The code passed to eventually never returned normally. Attempted 627
> times over 10.000180116 seconds. Last failure message:
> StateStoreSuite.this.fileExists(provider, 1L, false) was true earliest
> file not deleted. (StateStoreSuite.scala:395)
>
> No idea.
>
>
> - offset recovery *** FAILED ***
>   The code passed to eventually never returned normally. Attempted 197
> times over 10.040864806 seconds. Last failure message:
> strings.forall({
>     ((x$1: Any) => DirectKafkaStreamSuite.collectedData.contains(x$1))
>   }) was false. (DirectKafkaStreamSuite.scala:250)
>
> Also something that was possibly fixed already for 2.0.0 and that I
> just back-ported into 1.6. Could be just a very similar failure.
>
> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
> > Please vote on releasing the following candidate as Apache Spark version
> > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and
> passes
> > if a majority of at least 3+1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Spark 2.0.0
> > [ ] -1 Do not release this package because ...
> >
> >
> > The tag to be voted on is v2.0.0-rc1
> > (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
> >
> > This release candidate resolves ~2400 issues:
> > https://s.apache.org/spark-2.0.0-rc1-jira
> >
> > The release files, including signatures, digests, etc. can be found at:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1187/
> >
> > The documentation corresponding to this release can be found at:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
> >
> >
> > =======================================
> > == How can I help test this release? ==
> > =======================================
> > If you are a Spark user, you can help us test this release by taking an
> > existing Spark workload and running on this release candidate, then
> > reporting any regressions from 1.x.
> >
> > ================================================
> > == What justifies a -1 vote for this release? ==
> > ================================================
> > Critical bugs impacting major functionalities.
> >
> > Bugs already present in 1.x, missing features, or bugs related to new
> > features will not necessarily block this release. Note that historically
> > Spark documentation has been published on the website separately from the
> > main release so we do not need to block the release due to documentation
> > errors either.
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Sean Owen <so...@cloudera.com>.
First pass of feedback on the RC: all the sigs, hashes, etc are fine.
Licensing is up to date to the best of my knowledge.

I'm hitting test failures, some of which may be spurious. Just putting
them out there to see if they ring bells. This is Java 8 on Ubuntu 16.


- spilling with compression *** FAILED ***
  java.lang.Exception: Test failed with compression using codec
org.apache.spark.io.SnappyCompressionCodec:
assertion failed: expected cogroup to spill, but did not
  at scala.Predef$.assert(Predef.scala:170)
  at org.apache.spark.TestUtils$.assertSpilled(TestUtils.scala:170)
  at org.apache.spark.util.collection.ExternalAppendOnlyMapSuite.org$apache$spark$util$collection$ExternalAppendOnlyMapSuite$$testSimpleSpilling(ExternalAppendOnlyMapSuite.scala:263)
...

I feel like I've seen this before, and see some possibly relevant
fixes, but they're in 2.0.0 already:
https://github.com/apache/spark/pull/10990
Is this something where a native library needs to be installed or something?
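
One quick way to rule the native side in or out is to exercise snappy-java
directly on the same machine; a minimal sketch, assuming the snappy-java jar
that ships with Spark is on the classpath:

    import org.xerial.snappy.Snappy

    // Forces snappy-java to load its native library: if that fails, this
    // throws (e.g. UnsatisfiedLinkError / SnappyError) instead of returning.
    val input = "expected cogroup to spill".getBytes("UTF-8")
    val roundTrip = Snappy.uncompress(Snappy.compress(input))
    println(new String(roundTrip, "UTF-8"))  // should echo the input string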


- to UTC timestamp *** FAILED ***
  "2016-03-13 [02]:00:00.0" did not equal "2016-03-13 [10]:00:00.0"
(DateTimeUtilsSuite.scala:506)

I know, we talked about this for the 1.6.2 RC, but I reproduced this
locally too. I will investigate, could still be spurious.
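
The [02] vs [10] mismatch smells like the test interpreting wall-clock
strings in the JVM default time zone (that date is also the US DST
changeover, which complicates the offset). A minimal sketch of the effect
with the plain JDK, no Spark involved, for anyone reproducing locally:

    import java.text.SimpleDateFormat
    import java.util.TimeZone

    val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")

    // The same wall-clock string denotes different instants depending on
    // the zone used to parse it, so a test that bakes in one zone fails on
    // a machine configured for another.
    fmt.setTimeZone(TimeZone.getTimeZone("America/Los_Angeles"))
    val asLa = fmt.parse("2016-03-13 10:00:00").getTime
    fmt.setTimeZone(TimeZone.getTimeZone("UTC"))
    val asUtc = fmt.parse("2016-03-13 10:00:00").getTime
    println((asLa - asUtc) / 3600000L)  // 7 -- LA is UTC-7 (PDT) on that date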


StateStoreSuite:
- maintenance *** FAILED ***
  The code passed to eventually never returned normally. Attempted 627
times over 10.000180116 seconds. Last failure message:
StateStoreSuite.this.fileExists(provider, 1L, false) was true earliest
file not deleted. (StateStoreSuite.scala:395)

No idea.


- offset recovery *** FAILED ***
  The code passed to eventually never returned normally. Attempted 197
times over 10.040864806 seconds. Last failure message:
strings.forall({
    ((x$1: Any) => DirectKafkaStreamSuite.collectedData.contains(x$1))
  }) was false. (DirectKafkaStreamSuite.scala:250)

Also something that was possibly fixed already for 2.0.0 and that I
just back-ported into 1.6. Could be just a very similar failure.
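
Both the maintenance and the offset recovery failures are ScalaTest
eventually timeouts, i.e. the asserted condition never held at any retry
during the 10-second window, rather than a single hard assertion failure.
A minimal sketch of the pattern, for anyone unfamiliar with it (the
condition here is just a stand-in):

    import org.scalatest.Assertions._
    import org.scalatest.concurrent.Eventually._
    import org.scalatest.time.SpanSugar._

    // eventually retries the block until it passes or the timeout elapses;
    // "never returned normally ... over 10.0 seconds" means every retry failed.
    val ready = System.currentTimeMillis() + 500  // stand-in condition
    eventually(timeout(10.seconds), interval(50.milliseconds)) {
      assert(System.currentTimeMillis() > ready, "condition not yet true")
    }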

On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.0.0
> [ ] -1 Do not release this package because ...
>
>
> The tag to be voted on is v2.0.0-rc1
> (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>
> This release candidate resolves ~2400 issues:
> https://s.apache.org/spark-2.0.0-rc1-jira
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1187/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>
>
> =======================================
> == How can I help test this release? ==
> =======================================
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions from 1.x.
>
> ================================================
> == What justifies a -1 vote for this release? ==
> ================================================
> Critical bugs impacting major functionalities.
>
> Bugs already present in 1.x, missing features, or bugs related to new
> features will not necessarily block this release. Note that historically
> Spark documentation has been published on the website separately from the
> main release so we do not need to block the release due to documentation
> errors either.
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by WangTaoTheTonic <ba...@aliyun.com>.
Do we have a feature list or release notes for 2.0 like before?



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-2-0-0-RC1-tp18019p18182.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Reynold Xin <rx...@databricks.com>.
Please consider this vote canceled and I will work on another RC soon.

On Tue, Jun 21, 2016 at 6:26 PM, Reynold Xin <rx...@databricks.com> wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.0.0
> [ ] -1 Do not release this package because ...
>
>
> The tag to be voted on is v2.0.0-rc1
> (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>
> This release candidate resolves ~2400 issues:
> https://s.apache.org/spark-2.0.0-rc1-jira
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1187/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>
>
> =======================================
> == How can I help test this release? ==
> =======================================
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions from 1.x.
>
> ================================================
> == What justifies a -1 vote for this release? ==
> ================================================
> Critical bugs impacting major functionalities.
>
> Bugs already present in 1.x, missing features, or bugs related to new
> features will not necessarily block this release. Note that historically
> Spark documentation has been published on the website separately from the
> main release so we do not need to block the release due to documentation
> errors either.
>
>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Mark Hamstra <ma...@clearstorydata.com>.
SPARK-15893 is resolved as a duplicate of SPARK-15899.  SPARK-15899 is
Unresolved.

On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander <alexander.ulanov@hpe.com
> wrote:

> -1
>
> Spark Unit tests fail on Windows. Still not resolved, though marked as
> resolved.
>
> https://issues.apache.org/jira/browse/SPARK-15893
>
> *From:* Reynold Xin [mailto:rxin@databricks.com]
> *Sent:* Tuesday, June 21, 2016 6:27 PM
> *To:* dev@spark.apache.org
> *Subject:* [VOTE] Release Apache Spark 2.0.0 (RC1)
>
>
>
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
>
>
> [ ] +1 Release this package as Apache Spark 2.0.0
>
> [ ] -1 Do not release this package because ...
>
>
>
>
>
> The tag to be voted on is v2.0.0-rc1
> (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>
>
>
> This release candidate resolves ~2400 issues:
> https://s.apache.org/spark-2.0.0-rc1-jira
>
>
>
> The release files, including signatures, digests, etc. can be found at:
>
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>
>
>
> Release artifacts are signed with the following key:
>
> https://people.apache.org/keys/committer/pwendell.asc
>
>
>
> The staging repository for this release can be found at:
>
> https://repository.apache.org/content/repositories/orgapachespark-1187/
>
>
>
> The documentation corresponding to this release can be found at:
>
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>
>
>
>
>
> =======================================
>
> == How can I help test this release? ==
>
> =======================================
>
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions from 1.x.
>
>
>
> ================================================
>
> == What justifies a -1 vote for this release? ==
>
> ================================================
>
> Critical bugs impacting major functionalities.
>
>
>
> Bugs already present in 1.x, missing features, or bugs related to new
> features will not necessarily block this release. Note that historically
> Spark documentation has been published on the website separately from the
> main release so we do not need to block the release due to documentation
> errors either.
>
>
>
>
>

RE: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by "Ulanov, Alexander" <al...@hpe.com>.
Here is the fix https://github.com/apache/spark/pull/13868
From: Reynold Xin [mailto:rxin@databricks.com]
Sent: Wednesday, June 22, 2016 6:43 PM
To: Ulanov, Alexander <al...@hpe.com>
Cc: Mark Hamstra <ma...@clearstorydata.com>; Marcelo Vanzin <va...@cloudera.com>; dev@spark.apache.org
Subject: Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Alex - if you have access to a Windows box, can you fix the issue? I'm not sure how many Spark contributors have Windows boxes.


On Wed, Jun 22, 2016 at 5:56 PM, Ulanov, Alexander <al...@hpe.com> wrote:
Spark Unit tests fail on Windows in Spark 2.0. It can be considered a blocker since there are people who develop for Spark on Windows. The referenced issue is indeed Minor and has nothing to do with unit tests.

From: Mark Hamstra [mailto:mark@clearstorydata.com]
Sent: Wednesday, June 22, 2016 4:09 PM
To: Marcelo Vanzin <va...@cloudera.com>
Cc: Ulanov, Alexander <al...@hpe.com>; Reynold Xin <rx...@databricks.com>; dev@spark.apache.org
Subject: Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

It's also marked as Minor, not Blocker.

On Wed, Jun 22, 2016 at 4:07 PM, Marcelo Vanzin <va...@cloudera.com> wrote:
On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander
<al...@hpe.com> wrote:
> -1
>
> Spark Unit tests fail on Windows. Still not resolved, though marked as
> resolved.

To be pedantic, it's marked as a duplicate
(https://issues.apache.org/jira/browse/SPARK-15899), which doesn't
necessarily mean that it's fixed.



> https://issues.apache.org/jira/browse/SPARK-15893
>
> From: Reynold Xin [mailto:rxin@databricks.com]
> Sent: Tuesday, June 21, 2016 6:27 PM
> To: dev@spark.apache.org
> Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)
>
>
>
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
>
>
> [ ] +1 Release this package as Apache Spark 2.0.0
>
> [ ] -1 Do not release this package because ...
>
>
>
>
>
> The tag to be voted on is v2.0.0-rc1
> (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>
>
>
> This release candidate resolves ~2400 issues:
> https://s.apache.org/spark-2.0.0-rc1-jira
>
>
>
> The release files, including signatures, digests, etc. can be found at:
>
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>
>
>
> Release artifacts are signed with the following key:
>
> https://people.apache.org/keys/committer/pwendell.asc
>
>
>
> The staging repository for this release can be found at:
>
> https://repository.apache.org/content/repositories/orgapachespark-1187/
>
>
>
> The documentation corresponding to this release can be found at:
>
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>
>
>
>
>
> =======================================
>
> == How can I help test this release? ==
>
> =======================================
>
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions from 1.x.
>
>
>
> ================================================
>
> == What justifies a -1 vote for this release? ==
>
> ================================================
>
> Critical bugs impacting major functionalities.
>
>
>
> Bugs already present in 1.x, missing features, or bugs related to new
> features will not necessarily block this release. Note that historically
> Spark documentation has been published on the website separately from the
> main release so we do not need to block the release due to documentation
> errors either.
>
>
>
>

--
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org



Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Reynold Xin <rx...@databricks.com>.
Alex - if you have access to a Windows box, can you fix the issue? I'm not
sure how many Spark contributors have Windows boxes.


On Wed, Jun 22, 2016 at 5:56 PM, Ulanov, Alexander <alexander.ulanov@hpe.com
> wrote:

> Spark Unit tests fail on Windows in Spark 2.0. It can be considered a
> blocker since there are people who develop for Spark on Windows. The
> referenced issue is indeed Minor and has nothing to do with unit tests.
>
>
>
> *From:* Mark Hamstra [mailto:mark@clearstorydata.com]
> *Sent:* Wednesday, June 22, 2016 4:09 PM
> *To:* Marcelo Vanzin <va...@cloudera.com>
> *Cc:* Ulanov, Alexander <al...@hpe.com>; Reynold Xin <
> rxin@databricks.com>; dev@spark.apache.org
> *Subject:* Re: [VOTE] Release Apache Spark 2.0.0 (RC1)
>
>
>
> It's also marked as Minor, not Blocker.
>
>
>
> On Wed, Jun 22, 2016 at 4:07 PM, Marcelo Vanzin <va...@cloudera.com>
> wrote:
>
> On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander
> <al...@hpe.com> wrote:
> > -1
> >
> > Spark Unit tests fail on Windows. Still not resolved, though marked as
> > resolved.
>
> To be pedantic, it's marked as a duplicate
> (https://issues.apache.org/jira/browse/SPARK-15899), which doesn't
> necessarily mean that it's fixed.
>
>
>
>
> > https://issues.apache.org/jira/browse/SPARK-15893
> >
> > From: Reynold Xin [mailto:rxin@databricks.com]
> > Sent: Tuesday, June 21, 2016 6:27 PM
> > To: dev@spark.apache.org
> > Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)
> >
> >
> >
> > Please vote on releasing the following candidate as Apache Spark version
> > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and
> passes
> > if a majority of at least 3+1 PMC votes are cast.
> >
> >
> >
> > [ ] +1 Release this package as Apache Spark 2.0.0
> >
> > [ ] -1 Do not release this package because ...
> >
> >
> >
> >
> >
> > The tag to be voted on is v2.0.0-rc1
> > (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
> >
> >
> >
> > This release candidate resolves ~2400 issues:
> > https://s.apache.org/spark-2.0.0-rc1-jira
> >
> >
> >
> > The release files, including signatures, digests, etc. can be found at:
> >
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
> >
> >
> >
> > Release artifacts are signed with the following key:
> >
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> >
> >
> > The staging repository for this release can be found at:
> >
> > https://repository.apache.org/content/repositories/orgapachespark-1187/
> >
> >
> >
> > The documentation corresponding to this release can be found at:
> >
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
> >
> >
> >
> >
> >
> > =======================================
> >
> > == How can I help test this release? ==
> >
> > =======================================
> >
> > If you are a Spark user, you can help us test this release by taking an
> > existing Spark workload and running on this release candidate, then
> > reporting any regressions from 1.x.
> >
> >
> >
> > ================================================
> >
> > == What justifies a -1 vote for this release? ==
> >
> > ================================================
> >
> > Critical bugs impacting major functionalities.
> >
> >
> >
> > Bugs already present in 1.x, missing features, or bugs related to new
> > features will not necessarily block this release. Note that historically
> > Spark documentation has been published on the website separately from the
> > main release so we do not need to block the release due to documentation
> > errors either.
> >
> >
> >
> >
>
>
> --
> Marcelo
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Mark Hamstra <ma...@clearstorydata.com>.
No, that isn't necessarily enough to be considered a blocker.  A blocker
would be something that would have large negative effects on a significant
number of people trying to run Spark.  Arguably, something that prevents a
minority of Spark developers from running unit tests on one OS does not
qualify.  That's not to say that we shouldn't fix this, but only that it
needn't block a 2.0.0 release.

On Wed, Jun 22, 2016 at 5:56 PM, Ulanov, Alexander <alexander.ulanov@hpe.com
> wrote:

> Spark Unit tests fail on Windows in Spark 2.0. It can be considered a
> blocker since there are people who develop for Spark on Windows. The
> referenced issue is indeed Minor and has nothing to do with unit tests.
>
>
>
> *From:* Mark Hamstra [mailto:mark@clearstorydata.com]
> *Sent:* Wednesday, June 22, 2016 4:09 PM
> *To:* Marcelo Vanzin <va...@cloudera.com>
> *Cc:* Ulanov, Alexander <al...@hpe.com>; Reynold Xin <
> rxin@databricks.com>; dev@spark.apache.org
> *Subject:* Re: [VOTE] Release Apache Spark 2.0.0 (RC1)
>
>
>
> It's also marked as Minor, not Blocker.
>
>
>
> On Wed, Jun 22, 2016 at 4:07 PM, Marcelo Vanzin <va...@cloudera.com>
> wrote:
>
> On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander
> <al...@hpe.com> wrote:
> > -1
> >
> > Spark Unit tests fail on Windows. Still not resolved, though marked as
> > resolved.
>
> To be pedantic, it's marked as a duplicate
> (https://issues.apache.org/jira/browse/SPARK-15899), which doesn't
> necessarily mean that it's fixed.
>
>
>
>
> > https://issues.apache.org/jira/browse/SPARK-15893
> >
> > From: Reynold Xin [mailto:rxin@databricks.com]
> > Sent: Tuesday, June 21, 2016 6:27 PM
> > To: dev@spark.apache.org
> > Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)
> >
> >
> >
> > Please vote on releasing the following candidate as Apache Spark version
> > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and
> passes
> > if a majority of at least 3+1 PMC votes are cast.
> >
> >
> >
> > [ ] +1 Release this package as Apache Spark 2.0.0
> >
> > [ ] -1 Do not release this package because ...
> >
> >
> >
> >
> >
> > The tag to be voted on is v2.0.0-rc1
> > (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
> >
> >
> >
> > This release candidate resolves ~2400 issues:
> > https://s.apache.org/spark-2.0.0-rc1-jira
> >
> >
> >
> > The release files, including signatures, digests, etc. can be found at:
> >
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
> >
> >
> >
> > Release artifacts are signed with the following key:
> >
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> >
> >
> > The staging repository for this release can be found at:
> >
> > https://repository.apache.org/content/repositories/orgapachespark-1187/
> >
> >
> >
> > The documentation corresponding to this release can be found at:
> >
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
> >
> >
> >
> >
> >
> > =======================================
> >
> > == How can I help test this release? ==
> >
> > =======================================
> >
> > If you are a Spark user, you can help us test this release by taking an
> > existing Spark workload and running on this release candidate, then
> > reporting any regressions from 1.x.
> >
> >
> >
> > ================================================
> >
> > == What justifies a -1 vote for this release? ==
> >
> > ================================================
> >
> > Critical bugs impacting major functionalities.
> >
> >
> >
> > Bugs already present in 1.x, missing features, or bugs related to new
> > features will not necessarily block this release. Note that historically
> > Spark documentation has been published on the website separately from the
> > main release so we do not need to block the release due to documentation
> > errors either.
> >
> >
> >
> >
>
>
> --
> Marcelo
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>
>

RE: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by "Ulanov, Alexander" <al...@hpe.com>.
Spark Unit tests fail on Windows in Spark 2.0. It can be considered a blocker since there are people who develop for Spark on Windows. The referenced issue is indeed Minor and has nothing to do with unit tests.

From: Mark Hamstra [mailto:mark@clearstorydata.com]
Sent: Wednesday, June 22, 2016 4:09 PM
To: Marcelo Vanzin <va...@cloudera.com>
Cc: Ulanov, Alexander <al...@hpe.com>; Reynold Xin <rx...@databricks.com>; dev@spark.apache.org
Subject: Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

It's also marked as Minor, not Blocker.

On Wed, Jun 22, 2016 at 4:07 PM, Marcelo Vanzin <va...@cloudera.com> wrote:
On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander
<al...@hpe.com> wrote:
> -1
>
> Spark Unit tests fail on Windows. Still not resolved, though marked as
> resolved.

To be pedantic, it's marked as a duplicate
(https://issues.apache.org/jira/browse/SPARK-15899), which doesn't
necessarily mean that it's fixed.



> https://issues.apache.org/jira/browse/SPARK-15893
>
> From: Reynold Xin [mailto:rxin@databricks.com]
> Sent: Tuesday, June 21, 2016 6:27 PM
> To: dev@spark.apache.org
> Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)
>
>
>
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
>
>
> [ ] +1 Release this package as Apache Spark 2.0.0
>
> [ ] -1 Do not release this package because ...
>
>
>
>
>
> The tag to be voted on is v2.0.0-rc1
> (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
>
>
>
> This release candidate resolves ~2400 issues:
> https://s.apache.org/spark-2.0.0-rc1-jira
>
>
>
> The release files, including signatures, digests, etc. can be found at:
>
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
>
>
>
> Release artifacts are signed with the following key:
>
> https://people.apache.org/keys/committer/pwendell.asc
>
>
>
> The staging repository for this release can be found at:
>
> https://repository.apache.org/content/repositories/orgapachespark-1187/
>
>
>
> The documentation corresponding to this release can be found at:
>
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
>
>
>
>
>
> =======================================
>
> == How can I help test this release? ==
>
> =======================================
>
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions from 1.x.
>
>
>
> ================================================
>
> == What justifies a -1 vote for this release? ==
>
> ================================================
>
> Critical bugs impacting major functionalities.
>
>
>
> Bugs already present in 1.x, missing features, or bugs related to new
> features will not necessarily block this release. Note that historically
> Spark documentation has been published on the website separately from the
> main release so we do not need to block the release due to documentation
> errors either.
>
>
>
>


--
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Mark Hamstra <ma...@clearstorydata.com>.
It's also marked as Minor, not Blocker.

On Wed, Jun 22, 2016 at 4:07 PM, Marcelo Vanzin <va...@cloudera.com> wrote:

> On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander
> <al...@hpe.com> wrote:
> > -1
> >
> > Spark Unit tests fail on Windows. Still not resolved, though marked as
> > resolved.
>
> To be pedantic, it's marked as a duplicate
> (https://issues.apache.org/jira/browse/SPARK-15899), which doesn't
> necessarily mean that it's fixed.
>
>
>
> > https://issues.apache.org/jira/browse/SPARK-15893
> >
> > From: Reynold Xin [mailto:rxin@databricks.com]
> > Sent: Tuesday, June 21, 2016 6:27 PM
> > To: dev@spark.apache.org
> > Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)
> >
> >
> >
> > Please vote on releasing the following candidate as Apache Spark version
> > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and
> passes
> > if a majority of at least 3+1 PMC votes are cast.
> >
> >
> >
> > [ ] +1 Release this package as Apache Spark 2.0.0
> >
> > [ ] -1 Do not release this package because ...
> >
> >
> >
> >
> >
> > The tag to be voted on is v2.0.0-rc1
> > (0c66ca41afade6db73c9aeddd5aed6e5dcea90df).
> >
> >
> >
> > This release candidate resolves ~2400 issues:
> > https://s.apache.org/spark-2.0.0-rc1-jira
> >
> >
> >
> > The release files, including signatures, digests, etc. can be found at:
> >
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-bin/
> >
> >
> >
> > Release artifacts are signed with the following key:
> >
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> >
> >
> > The staging repository for this release can be found at:
> >
> > https://repository.apache.org/content/repositories/orgapachespark-1187/
> >
> >
> >
> > The documentation corresponding to this release can be found at:
> >
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/
> >
> >
> >
> >
> >
> > =======================================
> >
> > == How can I help test this release? ==
> >
> > =======================================
> >
> > If you are a Spark user, you can help us test this release by taking an
> > existing Spark workload and running on this release candidate, then
> > reporting any regressions from 1.x.
> >
> >
> >
> > ================================================
> >
> > == What justifies a -1 vote for this release? ==
> >
> > ================================================
> >
> > Critical bugs impacting major functionalities.
> >
> >
> >
> > Bugs already present in 1.x, missing features, or bugs related to new
> > features will not necessarily block this release. Note that historically
> > Spark documentation has been published on the website separately from the
> > main release so we do not need to block the release due to documentation
> > errors either.
> >
> >
> >
> >
>
>
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Marcelo Vanzin <va...@cloudera.com>.
On Wed, Jun 22, 2016 at 4:04 PM, Ulanov, Alexander
<al...@hpe.com> wrote:
> -1
>
> Spark Unit tests fail on Windows. Still not resolved, though marked as
> resolved.

To be pedantic, it's marked as a duplicate
(https://issues.apache.org/jira/browse/SPARK-15899), which doesn't
necessarily mean that it's fixed.



> https://issues.apache.org/jira/browse/SPARK-15893
>
> From: Reynold Xin [mailto:rxin@databricks.com]
> Sent: Tuesday, June 21, 2016 6:27 PM
> To: dev@spark.apache.org
> Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)
>
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
> [...]



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


RE: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by "Ulanov, Alexander" <al...@hpe.com>.
-1
Spark unit tests fail on Windows. The issue is still not resolved, though it is marked as resolved.
https://issues.apache.org/jira/browse/SPARK-15893
From: Reynold Xin [mailto:rxin@databricks.com]
Sent: Tuesday, June 21, 2016 6:27 PM
To: dev@spark.apache.org
Subject: [VOTE] Release Apache Spark 2.0.0 (RC1)

Please vote on releasing the following candidate as Apache Spark version 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes if a majority of at least 3+1 PMC votes are cast.

[...]



Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Sean Owen <so...@cloudera.com>.
Hm, I thought that was to be added for 2.0. Imran, I know you may have
been working alongside Mark on it; what do you think?

TD / Reynold would you object to it for 2.0?

On Wed, Jun 22, 2016 at 3:46 PM, Cody Koeninger <co...@koeninger.org> wrote:
> As far as I know the only thing blocking it at this point is lack of
> committer review / approval.
>
> It's technically adding a new feature after spark code-freeze, but it
> doesn't change existing code, and the kafka project didn't release
> 0.10 until the end of may.
>
>
> On Wed, Jun 22, 2016 at 9:39 AM, Sean Owen <so...@cloudera.com> wrote:
>> I profess ignorance again though I really should know by now, but,
>> what's opposing that? I personally thought this was going to be in 2.0
>> and didn't kind of notice it wasn't ...
>>
>> On Wed, Jun 22, 2016 at 3:29 PM, Cody Koeninger <co...@koeninger.org> wrote:
>>> I don't have a vote, but I'd just like to reiterate that I think kafka
>>> 0.10 support should be added to a 2.0 release candidate; if not now,
>>> then well before release.
>>>
>>> - it's a completely standalone jar, so shouldn't break anyone who's
>>> using the existing 0.8 support
>>> - it's like the 5th highest voted open ticket, and has been open for months
>>> - Luciano has said multiple times that he wants to merge that PR into
>>> Bahir if it isn't in a RC for spark 2.0, which I think would confuse
>>> users and cause maintenance problems
>>>
>>> On Wed, Jun 22, 2016 at 12:38 AM, Sean Owen <so...@cloudera.com> wrote:
>>>> While I'd officially -1 this while there are still many blockers, this
>>>> should certainly be tested as usual, because they're mostly doc and
>>>> "audit" type issues.
>>>>
>>>> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
>>>>> Please vote on releasing the following candidate as Apache Spark version
>>>>> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
>>>>> if a majority of at least 3+1 PMC votes are cast.
>>>>>
>>>>> [...]
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Mark Grover <ma...@apache.org>.
Yeah, I am +1 for including Kafka 0.10 integration as well. We had to wait
for Kafka 0.10 because there were incompatibilities between the Kafka 0.9
and 0.10 APIs. And, yes, the code for the 0.8 connector remains unchanged, so
there shouldn't be any regression for existing users. It's only new code for 0.10.
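
To make the incompatibility concrete, here is a hedged sketch of how it
surfaces in user code; stream08 and stream10 are illustrative names for
streams created with the existing 0.8 connector and the proposed 0.10
connector:

// the 0.8 direct stream yields (key, value) tuples, while a 0.10-based stream
// yields org.apache.kafka.clients.consumer.ConsumerRecord objects, so user
// code needs a small change when migrating between the two
val values08 = stream08.map { case (_, value) => value }
val values10 = stream10.map(record => record.value)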

The comments about the lack of python support are correct, but I do think it's
unfair to block this particular PR on that without a wider policy of blocking
every PR on the same grounds.

On Wed, Jun 22, 2016 at 9:01 AM, Chris Fregly <ch...@fregly.com> wrote:

> +1 for 0.10 support.  this is huge.
>
> On Wed, Jun 22, 2016 at 8:17 AM, Cody Koeninger <co...@koeninger.org>
> wrote:
>
>> Luciano knows there are publicly available examples of how to use the
>> 0.10 connector, including TLS support, because he asked me about it
>> and I gave him a link
>>
>>
>> https://github.com/koeninger/kafka-exactly-once/blob/kafka-0.9/src/main/scala/example/TlsStream.scala
>>
>> If any committer at any time had said "I'd accept this PR, if only it
>> included X", I'd be happy to provide X.  Documentation updates and
>> python support for the 0.8 direct stream connector were done after the
>> original PR.
>>
>>
>>
>> On Wed, Jun 22, 2016 at 9:55 AM, Luciano Resende <lu...@gmail.com>
>> wrote:
>> >
>> >
>> > On Wed, Jun 22, 2016 at 7:46 AM, Cody Koeninger <co...@koeninger.org>
>> wrote:
>> >>
>> >> As far as I know the only thing blocking it at this point is lack of
>> >> committer review / approval.
>> >>
>> >> It's technically adding a new feature after spark code-freeze, but it
>> >> doesn't change existing code, and the kafka project didn't release
>> >> 0.10 until the end of may.
>> >>
>> >
>> >
>> > To be fair with the Kafka 0.10 PR assessment:
>> >
>> > I was expecting a somewhat easy transition for customers moving from the
>> > 0.8 to the 0.10 connector, but the 0.10 connector seems to have been
>> > treated as a completely new extension. Also, there is no python support,
>> > no samples in the PR demonstrating how to use the security capabilities,
>> > and no documentation updates.
>> >
>> > Thanks
>> >
>> > --
>> > Luciano Resende
>> > http://twitter.com/lresende1975
>> > http://lresende.blogspot.com/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>
>>
>
>
> --
> *Chris Fregly*
> Research Scientist @ PipelineIO
> San Francisco, CA
> pipeline.io
> advancedspark.com
>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Chris Fregly <ch...@fregly.com>.
+1 for 0.10 support.  this is huge.

On Wed, Jun 22, 2016 at 8:17 AM, Cody Koeninger <co...@koeninger.org> wrote:

> Luciano knows there are publicly available examples of how to use the
> 0.10 connector, including TLS support, because he asked me about it
> and I gave him a link
>
>
> https://github.com/koeninger/kafka-exactly-once/blob/kafka-0.9/src/main/scala/example/TlsStream.scala
>
> If any committer at any time had said "I'd accept this PR, if only it
> included X", I'd be happy to provide X.  Documentation updates and
> python support for the 0.8 direct stream connector were done after the
> original PR.
>
>
>
> On Wed, Jun 22, 2016 at 9:55 AM, Luciano Resende <lu...@gmail.com>
> wrote:
> >
> >
> > On Wed, Jun 22, 2016 at 7:46 AM, Cody Koeninger <co...@koeninger.org>
> wrote:
> >>
> >> As far as I know the only thing blocking it at this point is lack of
> >> committer review / approval.
> >>
> >> It's technically adding a new feature after spark code-freeze, but it
> >> doesn't change existing code, and the kafka project didn't release
> >> 0.10 until the end of may.
> >>
> >
> >
> > To be fair with the Kafka 0.10 PR assessment:
> >
> > I was expecting a somewhat easy transition for customers moving from the
> > 0.8 to the 0.10 connector, but the 0.10 connector seems to have been
> > treated as a completely new extension. Also, there is no python support,
> > no samples in the PR demonstrating how to use the security capabilities,
> > and no documentation updates.
> >
> > Thanks
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>


-- 
*Chris Fregly*
Research Scientist @ PipelineIO
San Francisco, CA
pipeline.io
advancedspark.com

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Cody Koeninger <co...@koeninger.org>.
Luciano knows there are publicly available examples of how to use the
0.10 connector, including TLS support, because he asked me about it
and I gave him a link

https://github.com/koeninger/kafka-exactly-once/blob/kafka-0.9/src/main/scala/example/TlsStream.scala

If any committer at any time had said "I'd accept this PR, if only it
included X", I'd be happy to provide X.  Documentation updates and
python support for the 0.8 direct stream connector were done after the
original PR.
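
For reference, a minimal sketch of what the linked TLS example boils down to;
this follows the 0.10 PR under review rather than a released API, ssc is an
existing StreamingContext, and the broker address, paths, and password are
placeholders:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9093",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "rc-smoke-test",
  // TLS settings are ordinary new-consumer properties, passed through as-is
  "security.protocol" -> "SSL",
  "ssl.truststore.location" -> "/path/to/truststore.jks",
  "ssl.truststore.password" -> "changeit"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("topic1"), kafkaParams))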



On Wed, Jun 22, 2016 at 9:55 AM, Luciano Resende <lu...@gmail.com> wrote:
>
>
> On Wed, Jun 22, 2016 at 7:46 AM, Cody Koeninger <co...@koeninger.org> wrote:
>>
>> As far as I know the only thing blocking it at this point is lack of
>> committer review / approval.
>>
>> It's technically adding a new feature after spark code-freeze, but it
>> doesn't change existing code, and the kafka project didn't release
>> 0.10 until the end of may.
>>
>
>
> To be fair with the Kafka 0.10 PR assessment:
>
> I was expecting a somewhat easy transition for customers moving from the 0.8
> to the 0.10 connector, but the 0.10 connector seems to have been treated as a
> completely new extension. Also, there is no python support, no samples in the
> PR demonstrating how to use the security capabilities, and no documentation
> updates.
>
> Thanks
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Luciano Resende <lu...@gmail.com>.
On Wed, Jun 22, 2016 at 7:46 AM, Cody Koeninger <co...@koeninger.org> wrote:

> As far as I know the only thing blocking it at this point is lack of
> committer review / approval.
>
> It's technically adding a new feature after spark code-freeze, but it
> doesn't change existing code, and the kafka project didn't release
> 0.10 until the end of may.
>
>

To be fair with the Kafka 0.10 PR assessment:

I was expecting a somewhat easy transition for customers moving from the 0.8
to the 0.10 connector, but the 0.10 connector seems to have been treated as a
completely new extension. Also, there is no python support, no samples in the
PR demonstrating how to use the security capabilities, and no documentation
updates.

Thanks

-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Cody Koeninger <co...@koeninger.org>.
As far as I know the only thing blocking it at this point is lack of
committer review / approval.

It's technically adding a new feature after the Spark code freeze, but it
doesn't change existing code, and the Kafka project didn't release
0.10 until the end of May.
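
Since the 0.10 support lives in its own module, opting in would be an explicit
new dependency while the existing artifact stays untouched. A hypothetical sbt
fragment follows; artifact names track the PR's module naming and may change:

// sketch only: coordinates follow the PR under review, not a released artifact
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.0"
// existing jobs keep the unchanged 0.8 connector
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.0"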


On Wed, Jun 22, 2016 at 9:39 AM, Sean Owen <so...@cloudera.com> wrote:
> I profess ignorance again though I really should know by now, but,
> what's opposing that? I personally thought this was going to be in 2.0
> and didn't kind of notice it wasn't ...
>
> On Wed, Jun 22, 2016 at 3:29 PM, Cody Koeninger <co...@koeninger.org> wrote:
>> I don't have a vote, but I'd just like to reiterate that I think kafka
>> 0.10 support should be added to a 2.0 release candidate; if not now,
>> then well before release.
>>
>> - it's a completely standalone jar, so shouldn't break anyone who's
>> using the existing 0.8 support
>> - it's like the 5th highest voted open ticket, and has been open for months
>> - Luciano has said multiple times that he wants to merge that PR into
>> Bahir if it isn't in a RC for spark 2.0, which I think would confuse
>> users and cause maintenance problems
>>
>> On Wed, Jun 22, 2016 at 12:38 AM, Sean Owen <so...@cloudera.com> wrote:
>>> While I'd officially -1 this while there are still many blockers, this
>>> should certainly be tested as usual, because they're mostly doc and
>>> "audit" type issues.
>>>
>>> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
>>>> Please vote on releasing the following candidate as Apache Spark version
>>>> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
>>>> if a majority of at least 3+1 PMC votes are cast.
>>>>
>>>> [...]
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Nicholas Chammas <ni...@gmail.com>.
For the clueless (like me):

https://bahir.apache.org/#home

Apache Bahir provides extensions to distributed analytic platforms such as
Apache Spark.

Initially Apache Bahir will contain streaming connectors that were a part
of Apache Spark prior to version 2.0:

   - streaming-akka
   - streaming-mqtt
   - streaming-twitter
   - streaming-zeromq

The Apache Bahir community welcomes the proposal of new extensions.
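
(As a hypothetical illustration: once Bahir publishes artifacts, depending on
one of these connectors from a Spark 2.0 application would look something like
the sbt line below. The coordinates are guesses, since Bahir has not released
anything at the time of writing.)

// illustrative only: group id, artifact name, and version are assumptions
libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.0"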

Nick

On Wed, Jun 22, 2016 at 10:40 AM Sean Owen <so...@cloudera.com> wrote:

> I profess ignorance again though I really should know by now, but,
> what's opposing that? I personally thought this was going to be in 2.0
> and didn't kind of notice it wasn't ...
>
> On Wed, Jun 22, 2016 at 3:29 PM, Cody Koeninger <co...@koeninger.org>
> wrote:
> > I don't have a vote, but I'd just like to reiterate that I think kafka
> > 0.10 support should be added to a 2.0 release candidate; if not now,
> > then well before release.
> >
> > - it's a completely standalone jar, so shouldn't break anyone who's
> > using the existing 0.8 support
> > - it's like the 5th highest voted open ticket, and has been open for months
> > - Luciano has said multiple times that he wants to merge that PR into
> > Bahir if it isn't in a RC for spark 2.0, which I think would confuse
> > users and cause maintenance problems
> >
> > On Wed, Jun 22, 2016 at 12:38 AM, Sean Owen <so...@cloudera.com> wrote:
> >> While I'd officially -1 this while there are still many blockers, this
> >> should certainly be tested as usual, because they're mostly doc and
> >> "audit" type issues.
> >>
> >> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
> >>> Please vote on releasing the following candidate as Apache Spark version
> >>> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> >>> if a majority of at least 3+1 PMC votes are cast.
> >>>
> >>> [...]
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> >> For additional commands, e-mail: dev-help@spark.apache.org
> >>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Sean Owen <so...@cloudera.com>.
I profess ignorance again, though I really should know by now: what's
opposing that? I personally thought this was going to be in 2.0 and somehow
didn't notice it wasn't ...

On Wed, Jun 22, 2016 at 3:29 PM, Cody Koeninger <co...@koeninger.org> wrote:
> I don't have a vote, but I'd just like to reiterate that I think kafka
> 0.10 support should be added to a 2.0 release candidate; if not now,
> then well before release.
>
> - it's a completely standalone jar, so shouldn't break anyone who's
> using the existing 0.8 support
> - it's like the 5th highest voted open ticket, and has been open for months
> - Luciano has said multiple times that he wants to merge that PR into
> Bahir if it isn't in a RC for spark 2.0, which I think would confuse
> users and cause maintenance problems
>
> On Wed, Jun 22, 2016 at 12:38 AM, Sean Owen <so...@cloudera.com> wrote:
>> While I'd officially -1 this while there are still many blockers, this
>> should certainly be tested as usual, because they're mostly doc and
>> "audit" type issues.
>>
>> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
>>> if a majority of at least 3+1 PMC votes are cast.
>>>
>>> [...]
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Cody Koeninger <co...@koeninger.org>.
I don't have a vote, but I'd just like to reiterate that I think Kafka
0.10 support should be added to a 2.0 release candidate; if not now,
then well before release.

- it's a completely standalone jar, so shouldn't break anyone who's
using the existing 0.8 support
- it's like the 5th highest voted open ticket, and has been open for months
- Luciano has said multiple times that he wants to merge that PR into
Bahir if it isn't in an RC for Spark 2.0, which I think would confuse
users and cause maintenance problems

On Wed, Jun 22, 2016 at 12:38 AM, Sean Owen <so...@cloudera.com> wrote:
> While I'd officially -1 this while there are still many blockers, this
> should certainly be tested as usual, because they're mostly doc and
> "audit" type issues.
>
> On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
>> if a majority of at least 3+1 PMC votes are cast.
>>
>> [...]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

Posted by Sean Owen <so...@cloudera.com>.
While I'd officially -1 this as long as there are still many blockers, it
should certainly be tested as usual, because the blockers are mostly doc and
"audit" type issues.

On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin <rx...@databricks.com> wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3+1 PMC votes are cast.
>
> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org