Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/08 16:06:03 UTC

[GitHub] [arrow] bkietz opened a new pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

bkietz opened a new pull request #9140:
URL: https://github.com/apache/arrow/pull/9140


   Enables diffing two cached JSON benchmark results. Also groups regressions and non-regressions for easier inspection:
   
   ```shell-session
   $ export FILTERS="--benchmark-filter=value-parsing --suite-filter=IntegerFormatting"
   $ archery benchmark run $FILTERS --output=baseline.json
   $ git checkout $BRANCH
   $ archery benchmark run $FILTERS --output=contender.json
   $ archery benchmark diff contender.json baseline.json
   ---------------------------------------------------------------------------------------
   Non-regressions: (1)
   ---------------------------------------------------------------------------------------
                      benchmark            baseline           contender  change % counters
    IntegerFormatting<Int8Type>  106.163m items/sec  108.091m items/sec     1.816       {}
   
   -----------------------------------------------------------------------------------------
   Regressions: (9)
   -----------------------------------------------------------------------------------------
                        benchmark            baseline           contender  change % counters
     IntegerFormatting<UInt8Type>  112.739m items/sec  102.576m items/sec    -9.015       {}
    IntegerFormatting<UInt32Type>   61.029m items/sec   54.603m items/sec   -10.530       {}
     IntegerFormatting<Int16Type>   86.396m items/sec   74.601m items/sec   -13.653       {}
     IntegerFormatting<Int32Type>   61.305m items/sec   51.841m items/sec   -15.437       {}
    IntegerFormatting<UInt16Type>   88.665m items/sec   74.442m items/sec   -16.041       {}
    IntegerFormatting<UInt64Type>   34.248m items/sec   27.239m items/sec   -20.464       {}
     IntegerFormatting<Int64Type>   38.401m items/sec   27.475m items/sec   -28.451       {}
      FloatFormatting<DoubleType>    5.642m items/sec    3.614m items/sec   -35.939       {}
       FloatFormatting<FloatType>    5.823m items/sec    3.608m items/sec   -38.038       {}
   ```
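
The `change %` column and the two groups above can be reproduced by hand. The following is a minimal sketch, not archery's actual implementation: the `Comparison` class, the `group_comparisons` helper, and the 5% threshold are assumptions made for illustration, and it only covers the "more is better" case shown in the table (items/sec).

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """Hypothetical record for one benchmark present in both JSON files."""
    name: str
    baseline: float   # e.g. millions of items/sec, higher is better
    contender: float

    @property
    def change_pct(self) -> float:
        # Relative change of the contender with respect to the baseline.
        return (self.contender - self.baseline) / abs(self.baseline) * 100.0


def group_comparisons(comparisons, threshold_pct=5.0):
    """Split comparisons into (non_regressions, regressions).

    threshold_pct is an assumed cutoff: a benchmark counts as a regression
    only if it got slower by more than this percentage.
    """
    regressions = [c for c in comparisons if c.change_pct < -threshold_pct]
    non_regressions = [c for c in comparisons if c.change_pct >= -threshold_pct]
    # Order each group from best to worst change, as in the table above.
    regressions.sort(key=lambda c: c.change_pct, reverse=True)
    non_regressions.sort(key=lambda c: c.change_pct, reverse=True)
    return non_regressions, regressions


rows = [
    Comparison("IntegerFormatting<Int8Type>", 106.163, 108.091),
    Comparison("IntegerFormatting<UInt8Type>", 112.739, 102.576),
    Comparison("FloatFormatting<FloatType>", 5.823, 3.608),
]
ok, regressed = group_comparisons(rows)
for label, group in (("Non-regressions", ok), ("Regressions", regressed)):
    print(f"{label}: ({len(group)})")
    for c in group:
        print(f"  {c.name}  {c.change_pct:.3f}")
```

Because the inputs here are the rounded figures printed in the table, the computed percentages can differ from the table in the last digit.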


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#discussion_r554038453



##########
File path: dev/archery/archery/cli.py
##########
@@ -586,14 +585,29 @@ def _get_comparisons_as_json(comparisons):
     return buf.getvalue()
 
 
-def _format_comparisons_with_pandas(comparisons_json):
+def _format_comparisons_with_pandas(comparisons_json, no_counters):
     import pandas as pd

Review comment:
       @kszucs I don't see `pandas` in dev/archery/requirements.txt or ci/conda_env_archery.yml. Should it be added?
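
For context, a function-local `import pandas as pd` only fails when that code path is actually reached, which is why a missing entry in the requirements files can go unnoticed. One common way to handle such an optional dependency is sketched below; `require_optional` is a hypothetical helper, not something that exists in archery.

```python
import importlib


def require_optional(module_name, purpose):
    """Import an optional dependency, failing with an actionable message.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"'{module_name}' is required for {purpose}; please install it"
        ) from exc


# Usage sketch: a stdlib module stands in for pandas so the example runs anywhere.
json_mod = require_optional("json", "formatting comparisons")
print(json_mod.dumps({"ok": True}))
```

Listing the package in the requirements files, as the comment asks, avoids the need for such a guard altogether.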







[GitHub] [arrow] pitrou commented on pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#issuecomment-757818838


   I took the liberty of pushing some unrelated improvements to the benchmark build options.





[GitHub] [arrow] pitrou commented on a change in pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#discussion_r555286272



##########
File path: ci/conda_env_archery.yml
##########
@@ -15,8 +15,28 @@
 # specific language governing permissions and limitations
 # under the License.
 
+# cli
 click
-gitpython
+
+# bot, crossbow
+github3.py
+jinja2
+jira
+pygit2
 pygithub
 ruamel.yaml
+setuptools_scm
+toolz

Review comment:
       Are you sure about all these? We should only list direct dependencies here.







[GitHub] [arrow] xhochy commented on a change in pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
xhochy commented on a change in pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#discussion_r554190275



##########
File path: dev/archery/archery/benchmark/core.py
##########
@@ -27,12 +27,13 @@ def median(values):
 
 
 class Benchmark:
-    def __init__(self, name, unit, less_is_better, values, stats=None):
+    def __init__(self, name, unit, less_is_better, values, counters={}):

Review comment:
       ```suggestion
       def __init__(self, name, unit, less_is_better, values, counters=None):
   ```

##########
File path: dev/archery/archery/benchmark/core.py
##########
@@ -27,12 +27,13 @@ def median(values):
 
 
 class Benchmark:
-    def __init__(self, name, unit, less_is_better, values, stats=None):
+    def __init__(self, name, unit, less_is_better, values, counters={}):
         self.name = name
         self.unit = unit
         self.less_is_better = less_is_better
         self.values = sorted(values)
         self.median = median(self.values)
+        self.counters = counters

Review comment:
       ```suggestion
           self.counters = counters or {}
   ```
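
The reason behind both suggestions: a default value such as `counters={}` is evaluated once, when the function is defined, so every instance created without an explicit argument shares the same dict. A minimal sketch with two hypothetical stand-in classes (not the real `Benchmark` class):

```python
class SharedDefault:
    def __init__(self, counters={}):  # one dict, created at definition time
        self.counters = counters


class FreshDefault:
    def __init__(self, counters=None):
        self.counters = counters or {}  # new dict whenever none is passed


a, b = SharedDefault(), SharedDefault()
a.counters["bytes_per_second"] = 1
print(b.counters)  # b sees a's mutation: both hold the same dict object

c, d = FreshDefault(), FreshDefault()
c.counters["bytes_per_second"] = 1
print(d.counters)  # d is unaffected
```

Note that `counters or {}` also swaps a caller-supplied empty dict for a new one; if preserving the caller's object matters, `{} if counters is None else counters` is the stricter form.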







[GitHub] [arrow] bkietz commented on a change in pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#discussion_r555778511



##########
File path: ci/conda_env_archery.yml
##########
@@ -15,8 +15,28 @@
 # specific language governing permissions and limitations
 # under the License.
 
+# cli
 click
-gitpython
+
+# bot, crossbow
+github3.py
+jinja2
+jira
+pygit2
 pygithub
 ruamel.yaml
+setuptools_scm
+toolz

Review comment:
       I derived this list from [setup.py](https://github.com/apache/arrow/blob/master/dev/archery/setup.py#L27-L34)







[GitHub] [arrow] github-actions[bot] commented on pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#issuecomment-756893154


   https://issues.apache.org/jira/browse/ARROW-11189











[GitHub] [arrow] pitrou closed pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #9140:
URL: https://github.com/apache/arrow/pull/9140


   





[GitHub] [arrow] pitrou commented on a change in pull request #9140: ARROW-11189: [Developer] support benchmark diff between JSONs

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9140:
URL: https://github.com/apache/arrow/pull/9140#discussion_r554895087



##########
File path: dev/archery/archery/cli.py
##########
@@ -586,14 +585,29 @@ def _get_comparisons_as_json(comparisons):
     return buf.getvalue()
 
 
-def _format_comparisons_with_pandas(comparisons_json):
+def _format_comparisons_with_pandas(comparisons_json, no_counters):
     import pandas as pd

Review comment:
       +1






