Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/10 13:42:29 UTC

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

jorisvandenbossche opened a new pull request #8162:
URL: https://github.com/apache/arrow/pull/8162


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alippai commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
alippai commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-691047149


   `StringToTzinfo` always creates `pytz.FixedOffset`, while it could create `timezone(timedelta(hours=*,minutes=*))`. I assume technically they are interchangeable, but this way it doesn't survive the pandas roundtrip. Does the pyarrow roundtrip test compare the string values?





[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-706269990


   @github-actions crossbow submit -g integration





[GitHub] [arrow] alippai edited a comment on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
alippai edited a comment on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-691047149


   `StringToTzinfo` always creates `pytz.FixedOffset`, while it could create `timezone(timedelta(hours=*,minutes=*))`. I assume technically they are interchangeable, but this way it doesn't survive the pandas roundtrip. Does the pyarrow roundtrip test compare the string values? Should pyarrow record the source type info?
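
A minimal sketch of the "interchangeable, but not identical" point above, using only the standard library. `StandInFixedOffset` is a hypothetical stand-in for a third-party fixed-offset class such as `pytz.FixedOffset`; it is not pytz's or pyarrow's code. The instants and offsets match, but the timezone objects differ, so a roundtrip check that compares the objects or their types can fail.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class StandInFixedOffset(tzinfo):
    """Hypothetical stand-in for a third-party fixed-offset tzinfo class."""

    def __init__(self, minutes):
        self._offset = timedelta(minutes=minutes)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return None


third_party_tz = StandInFixedOffset(90)
stdlib_tz = timezone(timedelta(hours=1, minutes=30))

a = datetime(2020, 9, 10, 13, 42, tzinfo=third_party_tz)
b = datetime(2020, 9, 10, 13, 42, tzinfo=stdlib_tz)

# Aware datetimes are compared by the instant they represent.
same_instant = a == b
same_offset = a.utcoffset() == b.utcoffset()

# The tzinfo objects themselves are of different classes and do not compare equal.
same_tz_object = third_party_tz == stdlib_tz
same_tz_type = type(a.tzinfo) is type(b.tzinfo)

print(same_instant, same_offset, same_tz_object, same_tz_type)
```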





[GitHub] [arrow] kszucs commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-705456869


   @jorisvandenbossche it seems the Python 3.5 build is failing





[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-706026472


   Yes, it's an actual failure that I still needed to take a look at. But the problem is that I can't reproduce it locally (well, it might be specific to Python 3.5, but I don't have a py 3.5 development environment ... and conda not supporting py3.5 doesn't make it easier to quickly set that up).





[GitHub] [arrow] jorisvandenbossche closed pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche closed pull request #8162:
URL: https://github.com/apache/arrow/pull/8162


   





[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-706270759


   OK, I am just going to skip it for older pandas versions. On the older pandas version, it's the `assert_frame_equal` that is failing, but the actual result and expected result are the same with an old or a recent version of pandas. More precisely, the difference between expected and result (`pytz.UTC` vs `datetime.timezone.utc`) is the same in both cases, but recent pandas versions ignore it.





[GitHub] [arrow] github-actions[bot] commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-690299825


   https://issues.apache.org/jira/browse/ARROW-9962





[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-694178655


   @alippai I think it's mostly for historical reasons that we always use pytz to recreate a timezone object (e.g. pandas also historically always used pytz, but nowadays accepts multiple timezone types). It probably makes some sense to use the stdlib timezone where possible (e.g. for a fixed offset timezone), so we could switch to that, but I personally wouldn't start with tracking the exact package / class.
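
As a rough illustration of "use the stdlib timezone where possible", here is a sketch that turns a fixed-offset string into a `datetime.timezone`. The function name `fixed_offset_to_tzinfo` and the `+HH:MM` / `-HH:MM` string format are assumptions made for this example; this is not pyarrow's `StringToTzinfo`.

```python
import re
from datetime import timedelta, timezone

# Assumed spelling of a fixed offset: a sign, two hour digits, a colon, two minute digits.
_FIXED_OFFSET = re.compile(r"^([+-])(\d{2}):(\d{2})$")


def fixed_offset_to_tzinfo(name):
    """Return a datetime.timezone for '+HH:MM' or '-HH:MM', else None."""
    match = _FIXED_OFFSET.match(name)
    if match is None:
        # Not a fixed offset (for example a named zone); the caller would
        # have to resolve it some other way.
        return None
    sign, hours, minutes = match.groups()
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    # timezone() raises ValueError for offsets of 24 hours or more.
    return timezone(-offset if sign == "-" else offset)


print(fixed_offset_to_tzinfo("+01:00"))
print(fixed_offset_to_tzinfo("-05:30"))
print(fixed_offset_to_tzinfo("Europe/Brussels"))
```

Returning `None` for anything that is not a plain offset keeps the fallback decision (pytz or another tz database) with the caller.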





[GitHub] [arrow] github-actions[bot] commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-706278606


   Revision: 0e2a734c1b1ac7aca11e0c17f8251a645579e4fd
   
   Submitted crossbow builds: [ursa-labs/crossbow @ actions-632](https://github.com/ursa-labs/crossbow/branches/all?query=actions-632)
   
   |Task|Status|
   |----|------|
   |test-conda-python-3.6-pandas-0.23|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.6-pandas-0.23)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.6-pandas-0.23)|
   |test-conda-python-3.7-dask-latest|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-dask-latest)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-dask-latest)|
   |test-conda-python-3.7-hdfs-2.9.2|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-hdfs-2.9.2)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-hdfs-2.9.2)|
   |test-conda-python-3.7-kartothek-latest|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-kartothek-latest)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-kartothek-latest)|
   |test-conda-python-3.7-kartothek-master|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-kartothek-master)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-kartothek-master)|
   |test-conda-python-3.7-pandas-latest|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-pandas-latest)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-pandas-latest)|
   |test-conda-python-3.7-pandas-master|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-pandas-master)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-pandas-master)|
   |test-conda-python-3.7-spark-branch-3.0|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-spark-branch-3.0)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-spark-branch-3.0)|
   |test-conda-python-3.7-turbodbc-latest|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-turbodbc-latest)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-turbodbc-latest)|
   |test-conda-python-3.7-turbodbc-master|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.7-turbodbc-master)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.7-turbodbc-master)|
   |test-conda-python-3.8-dask-master|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.8-dask-master)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.8-dask-master)|
   |test-conda-python-3.8-jpype|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.8-jpype)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.8-jpype)|
   |test-conda-python-3.8-pandas-latest|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.8-pandas-latest)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.8-pandas-latest)|
   |test-conda-python-3.8-spark-master|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-632-github-test-conda-python-3.8-spark-master)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-632-github-test-conda-python-3.8-spark-master)|





[GitHub] [arrow] alippai commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

Posted by GitBox <gi...@apache.org>.
alippai commented on pull request #8162:
URL: https://github.com/apache/arrow/pull/8162#issuecomment-694184019


   @jorisvandenbossche makes sense, thanks. TIL one more "standard" class is coming: https://docs.python.org/3.9/whatsnew/3.9.html#zoneinfo. Looks like it's replacing `pytz.timezone`. Also this timezone handling looks like an endless rabbit hole: https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html. At least for fixed offsets the new standard doesn't make any difference. 




