Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/22 16:44:46 UTC
[GitHub] [arrow] wesm opened a new pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
wesm opened a new pull request #7516:
URL: https://github.com/apache/arrow/pull/7516
This uses pandas to generate a sorted text table when using `archery benchmark diff`. Example:
https://github.com/apache/arrow/pull/7506#issuecomment-647633470
There are some other incidental changes:
* pandas is required for `archery benchmark diff`. I don't think there's value in reimplementing the stuff that pandas can do in a few lines of code (read JSON, create a sorted table and print it nicely for us).
* The default number of benchmark repetitions has been changed from 10 to 1 (see ARROW-9155 for context). IMHO more interactive benchmark results are more useful than higher precision. If you need higher precision, you can pass `--repetitions=10` on the command line.
* `archery benchmark` was building the unit tests unnecessarily. This also occluded a bug, ARROW-9209, which is fixed here.
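As a rough illustration of the pandas approach described above (not the actual Archery code), the sketch below loads hypothetical JSON diff records, computes the percent change, and prints a sorted text table. The field names `benchmark`, `baseline`, and `contender`, and the sample values, are assumptions made for this example.

```python
import json

import pandas as pd

# Hypothetical diff records; the field names and values are assumed for
# illustration and are not taken from Archery's real JSON output.
RAW = json.dumps([
    {"benchmark": "UniqueInt64/0", "baseline": 100.0, "contender": 90.0},
    {"benchmark": "UniqueInt64/1", "baseline": 100.0, "contender": 150.0},
    {"benchmark": "UniqueUInt8/0", "baseline": 200.0, "contender": 210.0},
])


def diff_table(raw_json: str) -> pd.DataFrame:
    """Load diff records and sort them from biggest gain to biggest loss."""
    df = pd.DataFrame(json.loads(raw_json))
    # Relative change of the contender against the baseline, in percent.
    df["change %"] = (df["contender"] - df["baseline"]) / df["baseline"] * 100.0
    return df.sort_values("change %", ascending=False)


table = diff_table(RAW)
# to_string() renders the whole frame as an aligned plain-text table.
print(table.to_string())
```

Sorting in descending order puts the largest speedups first and the regressions last, so both ends of the table are easy to scan.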
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kszucs commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-648473447
I’m going to update the bot tomorrow.
----------------------------------------------------------------
[GitHub] [arrow] wesm commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-648162680
+1. The bot changes can't be done here, so I'm going to go ahead and merge this so that I can use it without having to switch to this branch before running benchmarks.
----------------------------------------------------------------
[GitHub] [arrow] kszucs edited a comment on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
kszucs edited a comment on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647798092
Using pandas is not a problem, but the results cannot be improved much other than sorting the table.
----------------------------------------------------------------
[GitHub] [arrow] pitrou commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647673667
Just a small question: why are `m` and `b` used for millions and billions, respectively? (I would probably expect `M` and `G`)
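For context on the suffix question: in SI notation a lowercase `m` denotes milli, while `M` and `G` denote mega and giga. A minimal sketch of a formatter using the SI prefixes is shown below; the function name `si_format` and the three-decimal precision are hypothetical choices for this example and are not taken from Archery.

```python
def si_format(value: float, unit: str = "") -> str:
    """Format a number with SI prefixes (k, M, G) instead of 'm' and 'b'."""
    for factor, prefix in ((1e9, "G"), (1e6, "M"), (1e3, "k")):
        if abs(value) >= factor:
            # Scale the value down and attach the matching prefix.
            return f"{value / factor:.3f}{prefix} {unit}".strip()
    # Values below one thousand are printed without a prefix.
    return f"{value:.3f} {unit}".strip()


print(si_format(2_500_000, "items/sec"))
print(si_format(3e9))
```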
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647677894
> @kszucs can you assist me with adapting ursabot for these changes?
Sure.
> I think we can use pandas's `DataFrame.to_html` to create a colorized table for GitHub, too https://pandas.pydata.org/pandas-docs/stable/user_guide/style.html
I'm afraid this is not going to work because we can't embed any CSS into the comment; this is why we generate the ursabot responses as diffs.
>
> Changes that would be good to have in `ursabot benchmark`:
>
> * Pass through `--cc` and `--cxx` options
> * Pass through `--repetitions`
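
To illustrate the diff-based workaround mentioned above: GitHub colors lines inside a `diff` code fence that start with `+` or `-`, so a comment can mark improvements and regressions without any CSS. The sketch below is only an assumption about how such a comment body could be built; the function name `render_as_diff`, the 5% threshold, and the line layout are hypothetical and do not reflect ursabot's actual implementation.

```python
FENCE = "`" * 3  # built programmatically to avoid nesting a literal fence


def render_as_diff(rows, threshold=5.0):
    """Render (name, change_percent) pairs as a diff-fenced comment body."""
    lines = [FENCE + "diff"]
    for name, change in rows:
        if change >= threshold:
            prefix = "+"  # improvement: shown in green by GitHub
        elif change <= -threshold:
            prefix = "-"  # regression: shown in red by GitHub
        else:
            prefix = " "  # within the threshold: left uncolored
        lines.append(f"{prefix} {name:<24} {change:>8.3f}%")
    lines.append(FENCE)
    return "\n".join(lines)


comment = render_as_diff([
    ("UniqueInt64/5", 184.782),
    ("UniqueString100bytes/9", -0.469),
    ("UniqueString10bytes/0", -5.227),
])
print(comment)
```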
----------------------------------------------------------------
[GitHub] [arrow] github-actions[bot] commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647641059
https://issues.apache.org/jira/browse/ARROW-9201
----------------------------------------------------------------
[GitHub] [arrow] wesm commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647883524
I improved the output to show the `state.counters` stuff
```
    benchmark                       baseline        contender  change %  counters
40  UniqueInt64/5              6.442 GiB/sec   18.346 GiB/sec   184.782  {'iterations': 145, 'null_percent': 100.0}
 0  UniqueInt64/11             6.500 GiB/sec   18.364 GiB/sec   182.522  {'iterations': 145, 'null_percent': 100.0}
11  UniqueUInt8/5            812.047 MiB/sec    1.755 GiB/sec   121.298  {'iterations': 142, 'null_percent': 100.0}
 7  UniqueUInt8/1            683.943 MiB/sec    1.253 GiB/sec    87.593  {'iterations': 117, 'null_percent': 0.1}
38  UniqueUInt8/4            762.983 MiB/sec  950.521 MiB/sec    24.580  {'iterations': 133, 'null_percent': 99.0}
29  UniqueUInt8/2            659.082 MiB/sec  820.410 MiB/sec    24.478  {'iterations': 114, 'null_percent': 1.0}
 5  UniqueInt64/1              2.656 GiB/sec    3.300 GiB/sec    24.223  {'iterations': 60, 'null_percent': 0.1}
32  UniqueInt64/4              5.627 GiB/sec    6.772 GiB/sec    20.349  {'iterations': 119, 'null_percent': 99.0}
25  UniqueInt64/10             5.234 GiB/sec    6.294 GiB/sec    20.254  {'iterations': 110, 'null_percent': 99.0}
39  UniqueString100bytes/11   26.815 GiB/sec   31.122 GiB/sec    16.061  {'iterations': 48, 'null_percent': 100.0}
23  UniqueString10bytes/5      2.691 GiB/sec    3.113 GiB/sec    15.667  {'iterations': 48, 'null_percent': 100.0}
34  UniqueString100bytes/5    26.944 GiB/sec   31.015 GiB/sec    15.108  {'iterations': 48, 'null_percent': 100.0}
 6  UniqueString10bytes/11     2.699 GiB/sec    3.096 GiB/sec    14.721  {'iterations': 49, 'null_percent': 100.0}
21  UniqueString100bytes/7     1.947 GiB/sec    2.217 GiB/sec    13.866  {'iterations': 3, 'null_percent': 0.1}
28  UniqueInt64/2              2.622 GiB/sec    2.904 GiB/sec    10.770  {'iterations': 59, 'null_percent': 1.0}
13  UniqueInt64/3              2.157 GiB/sec    2.343 GiB/sec     8.644  {'iterations': 48, 'null_percent': 10.0}
33  UniqueString100bytes/4    24.286 GiB/sec   26.030 GiB/sec     7.181  {'iterations': 43, 'null_percent': 99.0}
22  UniqueInt64/7              2.542 GiB/sec    2.707 GiB/sec     6.497  {'iterations': 56, 'null_percent': 0.1}
20  UniqueString100bytes/10   22.536 GiB/sec   23.985 GiB/sec     6.432  {'iterations': 40, 'null_percent': 99.0}
35  UniqueString10bytes/1    788.817 MiB/sec  836.008 MiB/sec     5.983  {'iterations': 14, 'null_percent': 0.1}
17  UniqueString10bytes/7    592.671 MiB/sec  628.054 MiB/sec     5.970  {'iterations': 10, 'null_percent': 0.1}
 3  UniqueString10bytes/4      2.515 GiB/sec    2.658 GiB/sec     5.687  {'iterations': 45, 'null_percent': 99.0}
19  UniqueString10bytes/10     2.402 GiB/sec    2.529 GiB/sec     5.269  {'iterations': 42, 'null_percent': 99.0}
 9  UniqueString100bytes/1     3.929 GiB/sec    4.077 GiB/sec     3.762  {'iterations': 7, 'null_percent': 0.1}
30  UniqueString10bytes/8    593.560 MiB/sec  610.253 MiB/sec     2.812  {'iterations': 10, 'null_percent': 1.0}
12  UniqueString10bytes/2    788.505 MiB/sec  808.396 MiB/sec     2.523  {'iterations': 14, 'null_percent': 1.0}
37  UniqueString100bytes/8     1.965 GiB/sec    1.998 GiB/sec     1.697  {'iterations': 3, 'null_percent': 1.0}
 1  UniqueString100bytes/2     3.984 GiB/sec    4.025 GiB/sec     1.028  {'iterations': 7, 'null_percent': 1.0}
36  UniqueString100bytes/3     4.262 GiB/sec    4.293 GiB/sec     0.725  {'iterations': 8, 'null_percent': 10.0}
 8  BuildStringDictionary     85.507 MiB/sec   85.687 MiB/sec     0.211  {'iterations': 198}
16  UniqueString100bytes/9     2.121 GiB/sec    2.111 GiB/sec    -0.469  {'iterations': 4, 'null_percent': 10.0}
 4  UniqueString100bytes/6     2.056 GiB/sec    2.043 GiB/sec    -0.626  {'iterations': 4, 'null_percent': 0.0}
10  UniqueUInt8/3            453.281 MiB/sec  448.407 MiB/sec    -1.075  {'iterations': 79, 'null_percent': 10.0}
14  UniqueString100bytes/0     4.100 GiB/sec    4.055 GiB/sec    -1.089  {'iterations': 7, 'null_percent': 0.0}
24  UniqueInt64/8              2.473 GiB/sec    2.443 GiB/sec    -1.202  {'iterations': 55, 'null_percent': 1.0}
26  UniqueString10bytes/9    615.880 MiB/sec  608.453 MiB/sec    -1.206  {'iterations': 11, 'null_percent': 10.0}
42  UniqueString10bytes/6    651.430 MiB/sec  640.128 MiB/sec    -1.735  {'iterations': 11, 'null_percent': 0.0}
27  UniqueUInt8/0              1.775 GiB/sec    1.738 GiB/sec    -2.063  {'iterations': 318, 'null_percent': 0.0}
31  UniqueInt64/9              2.076 GiB/sec    2.033 GiB/sec    -2.067  {'iterations': 46, 'null_percent': 10.0}
15  BuildDictionary            1.535 GiB/sec    1.503 GiB/sec    -2.079  {'iterations': 277}
41  UniqueInt64/0              3.915 GiB/sec    3.827 GiB/sec    -2.262  {'iterations': 87, 'null_percent': 0.0}
43  UniqueString10bytes/3    802.729 MiB/sec  784.279 MiB/sec    -2.298  {'iterations': 14, 'null_percent': 10.0}
18  UniqueInt64/6              3.284 GiB/sec    3.178 GiB/sec    -3.229  {'iterations': 72, 'null_percent': 0.0}
 2  UniqueString10bytes/0    895.983 MiB/sec  849.150 MiB/sec    -5.227  {'iterations': 16, 'null_percent': 0.0}
```
----------------------------------------------------------------
[GitHub] [arrow] wesm commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647758158
I’m sort of -1 on using anything but pandas for data munging and data presentation in our tooling. It’s not a very large dependency and has everything we need.
----------------------------------------------------------------
[GitHub] [arrow] wesm edited a comment on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
wesm edited a comment on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647758158
I’m sort of -1 on using anything but pandas for data munging and data presentation in our tooling. It’s not a very large dependency and has everything we need. FWIW, the current Ursabot output doesn't even sort the results, which is really needed to easily make sense of what got faster or slower at a glance.
----------------------------------------------------------------
[GitHub] [arrow] fsaintjacques commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647698721
ursabot uses `tabulate`, which I think is a smaller dependency.
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647798092
Using pandas is not a problem, but other than sorting the results we cannot really improve the look and feel.
----------------------------------------------------------------
[GitHub] [arrow] wesm closed pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
wesm closed pull request #7516:
URL: https://github.com/apache/arrow/pull/7516
----------------------------------------------------------------
[GitHub] [arrow] wesm commented on pull request #7516: ARROW-9201: [Archery] More user-friendly console output for benchmark diffs, add repetitions argument, don't build unit tests
Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-647641007
@kszucs can you assist me with adapting ursabot for these changes? I think we can use pandas's `DataFrame.to_html` to create a colorized table for GitHub, too https://pandas.pydata.org/pandas-docs/stable/user_guide/style.html
Changes that would be good to have in `ursabot benchmark`:
* Pass through `--cc` and `--cxx` options
* Pass through `--repetitions`
----------------------------------------------------------------