Posted to issues@arrow.apache.org by "pitrou (via GitHub)" <gi...@apache.org> on 2023/07/26 15:00:17 UTC

[GitHub] [arrow] pitrou opened a new issue, #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods

pitrou opened a new issue, #36892:
URL: https://github.com/apache/arrow/issues/36892

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I did not pay attention initially, but it seems https://github.com/apache/arrow/pull/35197 introduced a large regression on the [wide-dataframe benchmark](https://github.com/voltrondata-labs/benchmarks/blob/main/benchmarks/wide_dataframe_benchmark.py).
   
   See benchmark results here:
   https://conbench.ursa.dev/compare/runs/9cf73ac83f0a44179e6538b2c1c7babd...3d76cb5ffb8849bf8c3ea9b32d08b3b7/
   
   Note the benchmark is creating a dataframe with 10000 columns.
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] raulcd commented on issue #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods

Posted by "raulcd (via GitHub)" <gi...@apache.org>.
raulcd commented on issue #36892:
URL: https://github.com/apache/arrow/issues/36892#issuecomment-1655445724

   @benibus @westonpace I was planning on waiting a couple of days to create the next RC (Tuesday?) to see if this could make it into 13.0.0. If this sounds totally unreasonable, let me know and I can create the new RC sooner. As discussed on Zulip, we might want to ship 13.0.0 with this known issue and apply the fix targeting 14.0.0.




[GitHub] [arrow] pitrou commented on issue #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #36892:
URL: https://github.com/apache/arrow/issues/36892#issuecomment-1651989634

   @benibus Would you have time to take a look at this?
   It would be nice to add benchmarks for the non-flattening methods. They will also help measure the slowdown before/after that PR.




[GitHub] [arrow] benibus commented on issue #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods

Posted by "benibus (via GitHub)" <gi...@apache.org>.
benibus commented on issue #36892:
URL: https://github.com/apache/arrow/issues/36892#issuecomment-1660212833

   Should have a fix for this by the end of the week, provided my diagnosis is actually complete.
   
   At the very least, the PR introduced substantial overhead for calling `FieldPath::Get` on record batches specifically. It seems this is due to a few instances where usages of `RecordBatch::column_data()` were replaced with `columns()`, which is far less trivial (it loads every column atomically on each invocation).
   
   That being said, the two aren't exactly interchangeable in the implementation, so some restructuring is in order.
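
To make the cost difference described above concrete, here is a minimal, self-contained sketch. It does not use Arrow itself: `ToyBatch`, `ToyArray`, `ToyArrayData`, `MakeToyBatch`, `GetViaColumns`, `GetViaColumnData`, and the `column_loads()` counter are all hypothetical names invented for illustration. The only assumption taken from the comment is that a `columns()`-style accessor visits every column on each call, while a `column_data()`-style accessor hands back storage that already exists. Here a plain counter stands in for the per-column atomic load.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Arrow's classes.
struct ToyArrayData { int value; };
struct ToyArray { std::shared_ptr<ToyArrayData> data; };

class ToyBatch {
 public:
  explicit ToyBatch(std::vector<std::shared_ptr<ToyArrayData>> data)
      : data_(std::move(data)), boxed_(data_.size()) {}

  // Cheap path: returns a reference to the vector that already exists.
  const std::vector<std::shared_ptr<ToyArrayData>>& column_data() const {
    return data_;
  }

  // Returns one boxed column, creating and caching it on first access.
  // The counter models the per-column load mentioned in the comment above.
  std::shared_ptr<ToyArray> column(std::size_t i) const {
    assert(i < data_.size());
    ++column_loads_;
    if (!boxed_[i]) boxed_[i] = std::make_shared<ToyArray>(ToyArray{data_[i]});
    return boxed_[i];
  }

  // Expensive path: visits every column and builds a fresh vector per call.
  std::vector<std::shared_ptr<ToyArray>> columns() const {
    std::vector<std::shared_ptr<ToyArray>> out;
    out.reserve(data_.size());
    for (std::size_t i = 0; i < data_.size(); ++i) out.push_back(column(i));
    return out;
  }

  std::size_t column_loads() const { return column_loads_; }

 private:
  std::vector<std::shared_ptr<ToyArrayData>> data_;
  mutable std::vector<std::shared_ptr<ToyArray>> boxed_;
  mutable std::size_t column_loads_ = 0;
};

// Builds a batch whose i-th column stores the value i.
ToyBatch MakeToyBatch(std::size_t num_columns) {
  std::vector<std::shared_ptr<ToyArrayData>> data;
  data.reserve(num_columns);
  for (std::size_t i = 0; i < num_columns; ++i) {
    data.push_back(std::make_shared<ToyArrayData>(ToyArrayData{static_cast<int>(i)}));
  }
  return ToyBatch(std::move(data));
}

// Single-field lookup through columns(): work proportional to the column count.
int GetViaColumns(const ToyBatch& batch, std::size_t i) {
  return batch.columns()[i]->data->value;
}

// Single-field lookup through column_data(): constant work.
int GetViaColumnData(const ToyBatch& batch, std::size_t i) {
  assert(i < batch.column_data().size());
  return batch.column_data()[i]->value;
}
```

In this toy model, one lookup through `GetViaColumns` on a 10000-column batch touches all 10000 columns, whereas `GetViaColumnData` touches none of the boxed columns. That is the shape of overhead a wide-dataframe workload would amplify.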




[GitHub] [arrow] raulcd commented on issue #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods

Posted by "raulcd (via GitHub)" <gi...@apache.org>.
raulcd commented on issue #36892:
URL: https://github.com/apache/arrow/issues/36892#issuecomment-1658798280

   @benibus any news here? I plan to create the new RC tomorrow unless there's a possibility of this being solved soon.




[GitHub] [arrow] pitrou closed issue #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou closed issue #36892: [C++] Potential regression on FieldRef/FieldPath non-flattening Get methods
URL: https://github.com/apache/arrow/issues/36892

