
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/08/27 06:25:13 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request #25594: [SPARK-27992][PYTHON][TESTS] Add a test to make sure toPandas with Arrow optimization throws an exception per maxResultSize

HyukjinKwon opened a new pull request #25594: [SPARK-27992][PYTHON][TESTS] Add a test to make sure toPandas with Arrow optimization throws an exception per maxResultSize
URL: https://github.com/apache/spark/pull/25594
 
 
   ### What changes were proposed in this pull request?
   This PR proposes to add a test case for:
   
   ```bash
   ./bin/pyspark --conf spark.driver.maxResultSize=1m
   ```
   
   ```python
   spark.conf.set("spark.sql.execution.arrow.enabled", True)
   spark.range(10000000).toPandas()
   ```
   
   ```
   Empty DataFrame
   Columns: [id]
   Index: []
   ```
   
   which can silently return partial results (here, an empty DataFrame) instead of throwing an exception (see https://github.com/apache/spark/pull/25593#issuecomment-525153808). This regression appeared between Spark 2.3 and Spark 2.4 and was later fixed by accident.
   
   
   ### Why are the changes needed?
   To prevent the same regression in the future.
   
   ### Does this PR introduce any user-facing change?
   No.
   
   
   ### How was this patch tested?
   A test was added.
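
For illustration only, here is a minimal sketch of how such a check could be written. It is not the test added by this PR. The helper names `build_session` and `check_to_pandas_raises` are hypothetical, and disabling `spark.sql.execution.arrow.fallback.enabled` is an assumption made so that a non-Arrow code path cannot mask the behavior.

```python
def build_session():
    """Create a local SparkSession with a small driver result limit.

    PySpark is imported here so the module loads without it installed.
    """
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[2]")
        # Same limit as in the reproduction steps of this PR description.
        .config("spark.driver.maxResultSize", "1m")
        .config("spark.sql.execution.arrow.enabled", "true")
        # Assumption: turn off fallback so only the Arrow path is exercised.
        .config("spark.sql.execution.arrow.fallback.enabled", "false")
        .getOrCreate()
    )


def check_to_pandas_raises(spark, num_rows=10000000):
    """Return True if toPandas() raises, False if it returns a result."""
    try:
        spark.range(num_rows).toPandas()
    except Exception:
        # Any exception counts; the exact message is not checked here.
        return True
    return False
```

In a test, one would assert that `check_to_pandas_raises(build_session())` is true. The helper accepts any object exposing `range(...).toPandas()`, so the check itself can be exercised without a running cluster.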
   
