Posted to reviews@spark.apache.org by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/07/01 03:08:41 UTC

[GitHub] [spark] panbingkun opened a new pull request, #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

panbingkun opened a new pull request, #41812:
URL: https://github.com/apache/spark/pull/41812

   ### What changes were proposed in this pull request?
   This PR upgrades `pandas` from 2.0.2 to 2.0.3.
   
   ### Why are the changes needed?
   1. The new version brings some bug fixes, e.g.:
   - Bug in DataFrame.convert_dtypes() and Series.convert_dtypes() when trying to convert [ArrowDtype](https://pandas.pydata.org/docs/reference/api/pandas.ArrowDtype.html#pandas.ArrowDtype) with dtype_backend="numpy_nullable" ([GH53648](https://github.com/pandas-dev/pandas/issues/53648))
   
   - Bug in [read_csv()](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html#pandas.read_csv) when defining dtype with bool[pyarrow] for the "c" and "python" engines ([GH53390](https://github.com/pandas-dev/pandas/issues/53390))
   
   2. Release notes:
   https://pandas.pydata.org/docs/whatsnew/v2.0.3.html
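
One way to confirm that an environment actually resolved to the release this PR targets is to read the installed package metadata. The following is a minimal sketch using only the Python standard library; the helper names (`parse_release`, `installed_version`, `meets_target`) and the `TARGET_PANDAS` constant are made up for illustration and are not part of Spark or pandas.

```python
from importlib import metadata
from typing import Optional, Tuple

# Version this PR upgrades to (taken from the PR description).
TARGET_PANDAS = "2.0.3"


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '2.0.3' into (2, 0, 3); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_target(found: Optional[str], target: str = TARGET_PANDAS) -> bool:
    """True when `found` is at least `target`, comparing numeric parts only."""
    return found is not None and parse_release(found) >= parse_release(target)


if __name__ == "__main__":
    found = installed_version("pandas")
    print(f"pandas installed: {found}, meets {TARGET_PANDAS}: {meets_target(found)}")
```

The comparison deliberately ignores pre-release suffixes; a real check would more likely rely on a dedicated version-parsing library.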
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Pass GitHub Actions (GA).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1628546152

   > Also, we don't need to test the CI with pandas 1.5.3, because we basically support the latest pandas.
   
   The API layer follows 2.0.3, but I think it's good to test with pandas 1.5.3 anyway. We support the same API layer as the latest pandas version, but it should also work with lower pandas versions where it can.




[GitHub] [spark] HyukjinKwon commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1628336970

   Merged to master.




[GitHub] [spark] panbingkun commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1616630994

   <img width="902" alt="image" src="https://github.com/apache/spark/assets/15246973/1b422531-eaf5-403b-b9b6-a503430c1cca">
   
   But in my local env:
   <img width="1421" alt="image" src="https://github.com/apache/spark/assets/15246973/3d975a76-3cb9-4a2e-9205-5d4ae9fc0a91">




[GitHub] [spark] panbingkun commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619365485

   > Oh, yes, this PR seems fine because it only skips the failing test and adds a corresponding JIRA ticket instead of fixing the test.




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619356741

   Oh, yes, this PR seems fine because it only skips the failing test and adds a corresponding JIRA ticket instead of fixing the test.




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1628534317

   I apologize for the confusion. I will summarize the decision here to clarify the situation:
   
   By default, we should support all PySpark functionalities based on the latest version of pandas (currently 2.0.3), so this PR is an appropriate change toward that goal. Also, we don't need to test the CI with pandas 1.5.3, because we basically support the latest pandas.
   
   However, although we support PySpark based on the latest pandas, **we will not introduce any breaking changes from pandas 2.x in Apache Spark 3.5.0**, since pandas 2.x introduces so many breaking changes that users could get confused if we introduced all of them in the next minor release (Spark 3.5.0).
   
   For example, we should support new APIs or bug fixes introduced in pandas 2.x from Spark 3.5.0, but breaking changes such as API removals or behavior changes will be supported starting from Spark 4.0.0.
   
   The reason for aligning the CI with the latest version of pandas, even though we cannot immediately support all behaviors introduced in pandas 2.x, is that many users frequently install and use the latest versions of PySpark and pandas together.
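
The policy summarized above can be sketched as a small decision function. This is only an illustration of the rule as stated in this comment (additive changes from Spark 3.5.0, breaking changes from Spark 4.0.0); `PandasChange`, `is_supported`, and both constants are hypothetical names, not anything that exists in PySpark.

```python
from dataclasses import dataclass
from typing import Tuple

SparkVersion = Tuple[int, int, int]

# Assumptions taken from the comment above, encoded as constants for illustration.
ADDITIVE_CHANGES_FROM: SparkVersion = (3, 5, 0)   # new APIs and bug fixes
BREAKING_CHANGES_FROM: SparkVersion = (4, 0, 0)   # API removals, behavior changes


@dataclass(frozen=True)
class PandasChange:
    """A hypothetical record describing one pandas 2.x change."""
    name: str
    breaking: bool


def is_supported(change: PandasChange, spark_version: SparkVersion) -> bool:
    """Apply the stated policy: breaking changes wait for the next major release."""
    threshold = BREAKING_CHANGES_FROM if change.breaking else ADDITIVE_CHANGES_FROM
    return spark_version >= threshold


if __name__ == "__main__":
    changes = [
        PandasChange("bug fix", breaking=False),
        PandasChange("API removal", breaking=True),
    ]
    for version in [(3, 5, 0), (4, 0, 0)]:
        for change in changes:
            print(version, change.name, is_supported(change, version))
```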




[GitHub] [spark] holdenk commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "holdenk (via GitHub)" <gi...@apache.org>.
holdenk commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1627540672

   So are we running the tests anywhere on pandas 1.5.3? That seems important.




[GitHub] [spark] bjornjorgensen commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "bjornjorgensen (via GitHub)" <gi...@apache.org>.
bjornjorgensen commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1629670914

   Ehh... this gets more confusing each time I read it.
   
   "My point was that we should focus on supporting the latest version of pandas rather than pandas 1.5.3 anyway."
   No, we are soon going to release `Apache Spark 3.5`. And we need to make that a great release.
   
   Some `pandas API on spark` functions, like `.info()`, rely on `pandas`.
   `.info()` has changed a lot from `pandas` version 1.5.3 to 2.0.3. We don't have any tests for this.
   
   Users need to install a `pandas` version to use `pandas API on spark`.
   If we are going to have users install `pandas` version 2.0.3 while we only support `pandas` version 1.5.3 in `pandas API on spark`, then users of functions like `.to_pandas()` will end up running on `pandas` version 2.0.3.
   
   What if we make one PR that reverts this PR, the one that updated to `pandas` 2.0.2, and some others, so we are back to `pandas` 1.5.3? Then, right after we release Apache Spark 3.5.0, we revert that PR. Could that be a solution?
   
   
       




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619343934

   Yes, it may seem a bit strange, but due to a significant number of breaking changes introduced in pandas 2.0.0, we have decided to support pandas 2.0.0 starting from the next major release (4.0.0) in order to minimize user confusion.
   
   The reason for testing with 2.0.x is to quickly carry out all the necessary tasks as soon as preparations for the 4.0.0 release begin; think of it as a sort of preliminary labeling process. :-)




[GitHub] [spark] bjornjorgensen commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "bjornjorgensen (via GitHub)" <gi...@apache.org>.
bjornjorgensen commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1627714144

   @holdenk That is a good question. 
   https://github.com/apache/spark/pull/41908




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1627985575

   Yeah, good point. Let's manually check the CI with pandas 1.5.3 and fix them all, as mentioned in https://github.com/apache/spark/pull/41812#issuecomment-1617061339.
   
   > Let's fix them all instead of downgrading (unless it breaks CI)
   
   Thanks, @bjornjorgensen for testing. Will create tickets to address the test failures with pandas 1.5.3.




[GitHub] [spark] panbingkun commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619353472

   Based on the reason above (`testing with 2.0.x is to quickly carry out all the necessary tasks as soon as preparations for the 4.0.0 release begin`), is this PR OK? Also, should I file a Jira for the incompatibility as a record of the issues discovered?




[GitHub] [spark] panbingkun commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619371646

   > Oh, yes, this PR seems fine because it only skips the failing test and adds a corresponding JIRA ticket instead of fixing the test.
   
   Okay, I have filed a new Jira (https://issues.apache.org/jira/browse/SPARK-44289) to record this issue and updated the Jira ID in the `test_aggregate.py` file.
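
Skipping a failing test while pointing at its tracking ticket can look roughly like the sketch below, written with the standard `unittest` module. It is not the actual change made to `test_aggregate.py`: the class name, test names, and the `PANDAS_VERSION` constant are hypothetical stand-ins, and only the ticket ID comes from this thread.

```python
import unittest

# Hypothetical stand-in for the installed pandas version; a real suite
# would read it from pandas itself.
PANDAS_VERSION = "2.0.3"


def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string."""
    return int(version.split(".")[0])


class AggregateTests(unittest.TestCase):  # hypothetical test class
    @unittest.skipIf(
        major_version(PANDAS_VERSION) >= 2,
        "SPARK-44289: skipped under pandas 2.x until the tracked issue is fixed",
    )
    def test_known_incompatible_case(self):
        # Placeholder for a test body that currently fails under pandas 2.x.
        self.fail("would fail under pandas 2.x")

    def test_unaffected_case(self):
        self.assertEqual(sum([1, 2, 3]), 6)


def run_suite() -> unittest.TestResult:
    """Run the example tests and return the result object for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AggregateTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("skipped:", [reason for _, reason in result.skipped])
```

Keeping the ticket ID in the skip reason means the skipped test shows up in test reports with a pointer to where the fix is tracked.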




[GitHub] [spark] bjornjorgensen commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "bjornjorgensen (via GitHub)" <gi...@apache.org>.
bjornjorgensen commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1620005649

   OK, when we release Spark 3.5 we need to add a note that pandas API on Spark only supports pandas 1.5.3.




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1616879381

   > I think we need to downgrade to pandas 1.5.3 for spark 3.5.
   
   Sorry, I didn't get the point. Why should we downgrade?
   
   If the test failure is caused by a bug, I believe we should fix it instead of downgrading the pandas version.




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1628562064

   My point was that we should focus on supporting the latest version of pandas rather than pandas 1.5.3. @panbingkun @bjornjorgensen Would you happen to be interested in addressing the failing tests in the CI for pandas 1.5.3?




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1629982676

   @bjornjorgensen 
   
   > Some pandas API on spark functions, like .info(), rely on pandas.
   > .info() has changed a lot from pandas version 1.5.3 to 2.0.3. We don't have any tests for this.
   
   Thanks, I got the point.
   
   > What if we make one PR where we reverse this PR and the one that updated to pandas 2.0.2 and some others, so we are back to pandas 1.5.3
   
   Could you open a PR for this so we can keep discussing there?




[GitHub] [spark] panbingkun closed pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun closed pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3
URL: https://github.com/apache/spark/pull/41812




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619385732

   Great!




[GitHub] [spark] bjornjorgensen commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "bjornjorgensen (via GitHub)" <gi...@apache.org>.
bjornjorgensen commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1619066772

   It feels strange that we are going to release Spark 3.5 with support for pandas 1.5.3 while we are testing it with pandas 2.0.x.
   
   But if we have good control, then it's OK :)




[GitHub] [spark] HyukjinKwon commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1617061339

   Let's fix them all (unless it breaks CI)




[GitHub] [spark] HyukjinKwon closed pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3
URL: https://github.com/apache/spark/pull/41812




[GitHub] [spark] itholic commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1628557959

   Sure, I agree that it's good to test with pandas 1.5.3 anyway, as long as the changes that fix the CI with pandas 1.5.3 do not break the CI with the latest pandas.




[GitHub] [spark] HyukjinKwon commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1628546937

   We can also land some minimal fixes to make the lower version work.




[GitHub] [spark] bjornjorgensen commented on pull request #41812: [SPARK-44267][PS][INFRA] Upgrade `pandas` to 2.0.3

Posted by "bjornjorgensen (via GitHub)" <gi...@apache.org>.
bjornjorgensen commented on PR #41812:
URL: https://github.com/apache/spark/pull/41812#issuecomment-1615799342

   Have a look at https://issues.apache.org/jira/browse/SPARK-43291?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel 
   
   I think we need to downgrade to pandas 1.5.3 for spark 3.5. 
   CC @itholic

