You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/13 16:00:01 UTC

[GitHub] [arrow-rs] e-dard opened a new pull request #542: refactor: remove lifetime from DynComparator

e-dard opened a new pull request #542:
URL: https://github.com/apache/arrow-rs/pull/542


   # Which issue does this PR close?
   
   N/A
   
   # Rationale for this change
   
   The cost of building a comparator (initialising a `DynComparator`) is often significantly higher than the actual cost of executing the comparator's closure on two row IDs. Therefore it makes sense to build the comparator once, and re-use the returned `DynComparator` for each row you are comparing the two arrays on.
   
   Due to the explicit lifetime of `DynComparator` I recently found it problematic to store the `DynComparator` on an object that was used across threads in an async environment.
   
   By refactoring the `build_compare` implementations I was able to remove the lifetime and it's much easier to use in the Datafusion operator I'm improving.
   
   As a nice side-effect, the sort kernel benchmarks show an improvement in performance of around 2–5%.
   
   ```
   $ critcmp master pr
   
   group                                          master                          pr
   -----                                          ------                          --
   bool sort 2^12                                 1.03    310.8±1.34µs            1.00    302.8±7.78µs
   bool sort nulls 2^12                           1.01    287.4±2.22µs            1.00    284.0±3.23µs
   sort 2^10                                      1.04     98.7±3.58µs            1.00     94.6±0.50µs
   sort 2^12                                      1.05    510.7±5.56µs            1.00    486.2±9.94µs
   sort 2^12 limit 10                             1.05     48.1±0.38µs            1.00     45.6±0.30µs
   sort 2^12 limit 100                            1.04     52.8±0.37µs            1.00     50.6±0.41µs
   sort 2^12 limit 1000                           1.06    141.1±0.94µs            1.00    132.7±0.95µs
   sort 2^12 limit 2^12                           1.03    501.2±4.01µs            1.00    486.5±4.87µs
   sort nulls 2^10                                1.02     70.9±0.72µs            1.00     69.4±0.51µs
   sort nulls 2^12                                1.02    369.7±3.51µs            1.00   363.0±18.52µs
   sort nulls 2^12 limit 10                       1.01     70.6±1.22µs            1.00     70.0±1.27µs
   sort nulls 2^12 limit 100                      1.00     71.7±0.82µs            1.00     71.8±1.60µs
   sort nulls 2^12 limit 1000                     1.01     80.5±1.55µs            1.00     79.4±1.41µs
   sort nulls 2^12 limit 2^12                     1.05    375.4±4.78µs            1.00    356.1±3.04µs
   ```
   
   # What changes are included in this PR?
   
   I have refactored `DynComparator` to remove the explicit lifetime. I did this by removing the need for the `DynComparator` instantiations to close over references. Instead we just move in owned arrays by cheaply cloning the underlying `ArrayData`. 
   
   Further, I took the liberty of making `DynComparator` `Send+Sync`. Again, this helps when you want to store it on types that need to be `Send + Sync`.
   
   Relative to how often the closure will be called I don't see this as an expensive change.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me merged pull request #542: refactor: remove lifetime from DynComparator

Posted by GitBox <gi...@apache.org>.
nevi-me merged pull request #542:
URL: https://github.com/apache/arrow-rs/pull/542


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter commented on pull request #542: refactor: remove lifetime from DynComparator

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #542:
URL: https://github.com/apache/arrow-rs/pull/542#issuecomment-879237351


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#542](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5d21d29) into [master](https://codecov.io/gh/apache/arrow-rs/commit/80a57be3215ed7210971a501b10664cadfd59bea?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (80a57be) will **increase** coverage by `0.06%`.
   > The diff coverage is `81.25%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/542/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #542      +/-   ##
   ==========================================
   + Coverage   82.54%   82.60%   +0.06%     
   ==========================================
     Files         167      167              
     Lines       45957    45984      +27     
   ==========================================
   + Hits        37934    37985      +51     
   + Misses       8023     7999      -24     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/compute/kernels/sort.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9zb3J0LnJz) | `94.15% <ø> (ø)` | |
   | [arrow/src/array/ord.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L29yZC5ycw==) | `63.33% <81.25%> (ø)` | |
   | [arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X3N0cmluZy5ycw==) | `97.85% <0.00%> (+0.08%)` | :arrow_up: |
   | [arrow/src/compute/kernels/take.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy90YWtlLnJz) | `94.72% <0.00%> (+0.28%)` | :arrow_up: |
   | [arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2J1aWxkZXIucnM=) | `86.11% <0.00%> (+0.33%)` | :arrow_up: |
   | [arrow/src/buffer/immutable.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2J1ZmZlci9pbW11dGFibGUucnM=) | `97.84% <0.00%> (+0.55%)` | :arrow_up: |
   | [arrow/src/buffer/mutable.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2J1ZmZlci9tdXRhYmxlLnJz) | `90.10% <0.00%> (+5.76%)` | :arrow_up: |
   | [arrow/src/array/transform/boolean.rs](https://codecov.io/gh/apache/arrow-rs/pull/542/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9ib29sZWFuLnJz) | `84.61% <0.00%> (+7.69%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [80a57be...5d21d29](https://codecov.io/gh/apache/arrow-rs/pull/542?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #542: refactor: remove lifetime from DynComparator

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #542:
URL: https://github.com/apache/arrow-rs/pull/542#issuecomment-879228326


   I have filed #543 issue for this change


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org