Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/20 06:29:26 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

HyukjinKwon opened a new pull request #30098:
URL: https://github.com/apache/spark/pull/30098


   ### What changes were proposed in this pull request?
   
   Some tests fail with PyArrow 2.0.0+:
   
   ```
   ======================================================================
   ERROR [0.774s]: test_grouped_over_window_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/__w/spark/spark/python/pyspark/sql/tests/test_pandas_grouped_map.py", line 595, in test_grouped_over_window_with_key
       .select('id', 'result').collect()
     File "/__w/spark/spark/python/pyspark/sql/dataframe.py", line 588, in collect
       sock_info = self._jdf.collectToPython()
     File "/__w/spark/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
       answer, self.gateway_client, self.target_id, self.name)
     File "/__w/spark/spark/python/pyspark/sql/utils.py", line 117, in deco
       raise converted from None
   pyspark.sql.utils.PythonException: 
     An exception was thrown from the Python worker. Please see the stack trace below.
   Traceback (most recent call last):
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 601, in main
       process()
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 593, in process
       serializer.dump_stream(out_iter, outfile)
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 255, in dump_stream
       return ArrowStreamSerializer.dump_stream(self, init_stream_yield_batches(), stream)
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 81, in dump_stream
       for batch in iterator:
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 248, in init_stream_yield_batches
       for series in iterator:
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 426, in mapper
       return f(keys, vals)
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 170, in <lambda>
       return lambda k, v: [(wrapped(k, v), to_arrow_type(return_type))]
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 158, in wrapped
       result = f(key, pd.concat(value_series, axis=1))
     File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/util.py", line 68, in wrapper
       return f(*args, **kwargs)
     File "/__w/spark/spark/python/pyspark/sql/tests/test_pandas_grouped_map.py", line 590, in f
       "{} != {}".format(expected_key[i][1], window_range)
   AssertionError: {'start': datetime.datetime(2018, 3, 15, 0, 0), 'end': datetime.datetime(2018, 3, 20, 0, 0)} != {'start': datetime.datetime(2018, 3, 15, 0, 0, tzinfo=<StaticTzInfo 'Etc/UTC'>), 'end': datetime.datetime(2018, 3, 20, 0, 0, tzinfo=<StaticTzInfo 'Etc/UTC'>)}
   ```
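
The assertion at the bottom of the traceback compares a window range holding naive datetimes with one holding timezone-aware datetimes. A minimal sketch of why that comparison fails, using only the standard library: `timezone.utc` stands in for the `Etc/UTC` tzinfo shown in the traceback, and `strip_tz` is a hypothetical helper for illustration, not code from the Spark test suite.

```python
from datetime import datetime, timezone

# The expected window range, built from naive datetimes.
expected = {"start": datetime(2018, 3, 15), "end": datetime(2018, 3, 20)}

# The same instants, but carrying a UTC tzinfo
# (assumption: timezone.utc stands in for the traceback's Etc/UTC).
actual = {k: v.replace(tzinfo=timezone.utc) for k, v in expected.items()}

# Python treats a naive and an aware datetime as unequal,
# even when the wall-clock fields are identical.
mismatch = expected != actual


def strip_tz(window):
    """Hypothetical helper: drop tzinfo so only wall-clock values are compared."""
    return {k: v.replace(tzinfo=None) for k, v in window.items()}


normalized_equal = strip_tz(actual) == expected
```

Here `mismatch` is `True` and `normalized_equal` is `True`, so the two sides differ only in the tzinfo attached to the values, which matches the two dictionaries printed in the `AssertionError`.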
   
   https://github.com/apache/spark/runs/1278917457
   
   This PR proposes to set an upper bound on the PyArrow version in the GitHub Actions build. The bound should be removed once we properly support PyArrow 2.0.0+ (SPARK-33189).
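
As a rough illustration of what an upper bound means here, the sketch below checks a version string against an exclusive bound of 2.0.0. It assumes the bound would be equivalent to a pip requirement such as `pyarrow<2.0.0`; the thread does not show the exact workflow change, and `parse_version` and `within_upper_bound` are hypothetical names, not part of Spark or PyArrow.

```python
# Assumed exclusive upper bound, taken from "PyArrow 2.0.0+" in the description.
UPPER_BOUND = (2, 0, 0)


def parse_version(version):
    """Turn '1.0.1' into (1, 0, 1), padding short versions with zeros."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev1'
        parts.append(int(digits))
    # Pad so that '2.0' compares the same as '2.0.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def within_upper_bound(version):
    """Return True when the version is strictly below the assumed bound."""
    return parse_version(version) < UPPER_BOUND
```

With this sketch, `within_upper_bound("1.0.1")` is `True` and `within_upper_bound("2.0.0")` is `False`. A real build would normally leave this comparison to pip's own specifier handling rather than hand-rolled parsing.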
   
   ### Why are the changes needed?
   
   To make the build pass.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, dev-only.
   
   ### How was this patch tested?
   
   GitHub Actions in this build will test it out.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ (SPARK-33189), we can remove this change.
   - We should backport this to the other branches so that their builds pass, in case PyArrow 2.0.0+ support is not backported.
   - The PyPy3 and Python 3.8 builds will pass because the packages are pre-installed in the Docker image (see SPARK-33162). The build fails with Python 3.6 because that build installs the packages from scratch.
   - I haven't updated the documentation and `setup.py` yet. For now, this PR only aims to make the build pass.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712664179








[GitHub] [spark] AmplabJenkins commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712641429








[GitHub] [spark] SparkQA commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712659653


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34639/
   




[GitHub] [spark] HyukjinKwon commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-713221535


   Thank you guys.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712659671








[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234








[GitHub] [spark] HyukjinKwon commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712677236


   cc @BryanCutler and @dongjoon-hyun 




[GitHub] [spark] SparkQA removed a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712628698


   **[Test build #130033 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130033/testReport)** for PR 30098 at commit [`5b71752`](https://github.com/apache/spark/commit/5b717527319cc0bd5307ad762160a892daf2c8c5).




[GitHub] [spark] SparkQA commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712649039


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34640/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712664179








[GitHub] [spark] SparkQA commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712647757


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34639/
   




[GitHub] [spark] HyukjinKwon commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712689050


   Merged to master, branch-3.0 and branch-2.4.




[GitHub] [spark] HyukjinKwon closed pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #30098:
URL: https://github.com/apache/spark/pull/30098


   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712641429








[GitHub] [spark] SparkQA commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712641321


   **[Test build #130033 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130033/testReport)** for PR 30098 at commit [`5b71752`](https://github.com/apache/spark/commit/5b717527319cc0bd5307ad762160a892daf2c8c5).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] BryanCutler commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
BryanCutler commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-713008734


   +1, best to do this until we can get all tests passing for new releases.




[GitHub] [spark] HyukjinKwon commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712688959


   The tests passed. I am going to merge this to unblock other PRs.




[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ (SPARK-33189), we can remove this change.
   - We should backport this to the other branches so that their builds pass, in case PyArrow 2.0.0+ support is not backported.
   - The PyPy3 and Python 3.8 builds will pass because the packages are pre-installed in the Docker image (see SPARK-33162). The build fails with Python 3.6 because that build installs the packages from scratch and picks up the latest PyArrow, 2.0.0.
   - I haven't updated the documentation and `setup.py` yet. For now, this PR only aims to make the build pass.




[GitHub] [spark] SparkQA commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712628698


   **[Test build #130033 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130033/testReport)** for PR 30098 at commit [`5b71752`](https://github.com/apache/spark/commit/5b717527319cc0bd5307ad762160a892daf2c8c5).




[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ (SPARK-33189), we can remove this change.
   - We should backport this to the other branches so that their builds pass, in case PyArrow 2.0.0+ support is not backported.
   - The PyPy3 and Python 3.8 builds will pass because the packages are pre-installed in the Docker image (see SPARK-33162). The build fails with Python 3.6 because that build installs the packages from scratch and picks up the latest PyArrow, 2.0.0.
   - I haven't updated the documentation and `setup.py` yet. For now, this PR only aims to make the build pass.




[GitHub] [spark] HyukjinKwon commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+, we can remove this change.
   - We should backport this to the other branches so that their builds pass, in case PyArrow 2.0.0+ support is not backported.
   - The PyPy3 and Python 3.8 builds will pass (not tested) because the packages are pre-installed in the Docker image. The build fails with Python 3.6 because that build installs the packages from scratch.




[GitHub] [spark] dongjoon-hyun commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712963278


   +1, LGTM. Thanks, @HyukjinKwon .




[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ (SPARK-33189), we can remove this change.
   - We should backport this to the other branches so that their builds pass, in case PyArrow 2.0.0+ support is not backported.
   - The Python 3.8 build will pass because the packages are pre-installed in the Docker image (see SPARK-33162), and the PyPy3 build does not have PyArrow. The build fails with Python 3.6 because that build installs the packages from scratch and picks up the latest PyArrow, 2.0.0.
   - I haven't updated the documentation and `setup.py` yet. For now, this PR only aims to make the build pass.




[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ (SPARK-33189), we can remove this change.
   - We should backport this to the other branches so that their builds pass, in case PyArrow 2.0.0+ support is not backported.
   - The PyPy3 and Python 3.8 builds will pass (not tested) because the packages are pre-installed in the Docker image (see SPARK-33162). The build fails with Python 3.6 because that build installs the packages from scratch.
   - I haven't updated the documentation and `setup.py` yet. For now, this PR only aims to make the build pass.




[GitHub] [spark] SparkQA commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712664159


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34640/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712659671





