Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/05 09:25:48 UTC

[GitHub] [arrow] AlenkaF opened a new pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

AlenkaF opened a new pull request #12081:
URL: https://github.com/apache/arrow/pull/12081


   This PR adds a check to `_reconstruct_index` in `pandas_compat.py` so that the roundtrip is correct for an empty `pandas.DataFrame` with an index.
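
As a rough illustration of the check described above, here is a minimal pure-Python sketch. It is not the pyarrow implementation. `reconstruct_range_index` is a hypothetical stand-in for the range branch of `_reconstruct_index`, the built-in `range` stands in for `pandas.RangeIndex`, `num_rows` stands in for `len(table)`, and returning `None` models the index level being skipped. It also assumes the failing case is a table that reports zero rows while the stored index metadata describes a non-empty range.

```python
def reconstruct_range_index(descr, num_rows):
    """Hypothetical stand-in for the range branch of `_reconstruct_index`.

    Uses the built-in `range` in place of `pandas.RangeIndex` and returns
    None where the index level would be skipped.
    """
    index_level = range(descr["start"], descr["stop"], descr["step"])
    # Guard from the PR: a length mismatch is only treated as munged
    # metadata when the table actually has rows.
    if len(index_level) != num_rows and num_rows != 0:
        return None
    return index_level


# Illustrative descriptor; the keys mirror those used in the diff
# (`start`, `stop`, `step`, `name`).
descr = {"kind": "range", "name": None, "start": 0, "stop": 3, "step": 1}

print(reconstruct_range_index(descr, 3))  # lengths agree: index kept
print(reconstruct_range_index(descr, 0))  # zero-row table: index kept
print(reconstruct_range_index(descr, 5))  # mismatch with rows present: skipped
```

Without the `num_rows != 0` part of the condition, the second call would also be skipped, which is the behaviour the PR sets out to change.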


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot commented on pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#issuecomment-1015185515


   Benchmark runs are scheduled for baseline = 8254615b9af90ea35583c1f2903bcc3f7f966968 and contender = cec5a178e101e101d678776021d4469ec5f4947c. cec5a178e101e101d678776021d4469ec5f4947c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/ac6c2b05bce247cab32d5d41a7bb3e64...3977f213efc44485badf35043651220d/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/17b1a24a33034e009466709a87fd2da9...f1749dc86368426c891734209a9f50b6/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/9b1896c4fe3142e0bf2d5fbb50f6be8c...9b3ee2250c7a4197b89ed785b2744a49/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] jorisvandenbossche closed pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche closed pull request #12081:
URL: https://github.com/apache/arrow/pull/12081


   





[GitHub] [arrow] github-actions[bot] commented on pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#issuecomment-1005516466


   https://issues.apache.org/jira/browse/ARROW-10643





[GitHub] [arrow] ursabot edited a comment on pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#issuecomment-1015185515


   Benchmark runs are scheduled for baseline = 8254615b9af90ea35583c1f2903bcc3f7f966968 and contender = cec5a178e101e101d678776021d4469ec5f4947c. cec5a178e101e101d678776021d4469ec5f4947c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/ac6c2b05bce247cab32d5d41a7bb3e64...3977f213efc44485badf35043651220d/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/17b1a24a33034e009466709a87fd2da9...f1749dc86368426c891734209a9f50b6/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/9b1896c4fe3142e0bf2d5fbb50f6be8c...9b3ee2250c7a4197b89ed785b2744a49/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] ursabot edited a comment on pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#issuecomment-1015185515


   Benchmark runs are scheduled for baseline = 8254615b9af90ea35583c1f2903bcc3f7f966968 and contender = cec5a178e101e101d678776021d4469ec5f4947c. cec5a178e101e101d678776021d4469ec5f4947c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/ac6c2b05bce247cab32d5d41a7bb3e64...3977f213efc44485badf35043651220d/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/17b1a24a33034e009466709a87fd2da9...f1749dc86368426c891734209a9f50b6/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/9b1896c4fe3142e0bf2d5fbb50f6be8c...9b3ee2250c7a4197b89ed785b2744a49/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] edponce commented on a change in pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
edponce commented on a change in pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#discussion_r780837727



##########
File path: python/pyarrow/pandas_compat.py
##########
@@ -934,7 +934,7 @@ def _reconstruct_index(table, index_descriptors, all_columns):
                                                     descr['stop'],
                                                     step=descr['step'],
                                                     name=index_name)
-            if len(index_level) != len(table):
+            if len(index_level) != len(table) and len(table) != 0:
                 # Possibly the result of munged metadata

Review comment:
       Nit: I would swap the checks to make it clearer when this code path applies; it should also resolve better for the interpreter/compiler.
   ```python
   if len(table) > 0 and len(index_level) != len(table):
      ...
   ```
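
The reordering suggested above can be checked with a small sketch. The two helper functions are hypothetical names that take plain integers in place of `len(index_level)` and `len(table)`. Python's `and` evaluates left to right and stops at the first falsy operand, so the swapped order skips the length comparison entirely for an empty table. The result is the same in both orders.

```python
def check_original(index_len, table_len):
    # Operand order used in the diff.
    return index_len != table_len and table_len != 0


def check_swapped(index_len, table_len):
    # Operand order suggested in the review. `table_len > 0` matches
    # `table_len != 0` because a length is never negative.
    return table_len > 0 and index_len != table_len


# Compare both orderings over a small grid of non-negative lengths.
mismatches = [
    (index_len, table_len)
    for index_len in range(5)
    for table_len in range(5)
    if check_original(index_len, table_len) != check_swapped(index_len, table_len)
]
print(mismatches)  # prints []
```

Since both forms agree for every pair of lengths, the choice between them comes down to readability and evaluation order, not behaviour.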







[GitHub] [arrow] AlenkaF commented on pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#issuecomment-1015171005


   @jorisvandenbossche could you give a final look at this PR please?





[GitHub] [arrow] ursabot edited a comment on pull request #12081: ARROW-10643: [Python] Pandas<->pyarrow roundtrip failing to recreate index for empty dataframe

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12081:
URL: https://github.com/apache/arrow/pull/12081#issuecomment-1015185515


   Benchmark runs are scheduled for baseline = 8254615b9af90ea35583c1f2903bcc3f7f966968 and contender = cec5a178e101e101d678776021d4469ec5f4947c. cec5a178e101e101d678776021d4469ec5f4947c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/ac6c2b05bce247cab32d5d41a7bb3e64...3977f213efc44485badf35043651220d/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/17b1a24a33034e009466709a87fd2da9...f1749dc86368426c891734209a9f50b6/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/9b1896c4fe3142e0bf2d5fbb50f6be8c...9b3ee2250c7a4197b89ed785b2744a49/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   

