Posted to jira@arrow.apache.org by "Diana Clarke (Jira)" <ji...@apache.org> on 2021/07/06 19:04:00 UTC
[jira] [Updated] (ARROW-13266) [JS] Improve benchmark names
[ https://issues.apache.org/jira/browse/ARROW-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Diana Clarke updated ARROW-13266:
---------------------------------
Description:
1) I found the double usage of "name" confusing.
{code}"name": "name: 'lat', length: 1,000,000, type: Float32",{code}
Perhaps {{column}} instead?
{code}"name": "column: 'lat', length: 1,000,000, type: Float32",{code}
2) The names could be more informative (and there are currently duplicates). I see the following names in the JSON output.
{code}
"name": "Table.from",
"name": "readBatches",
"name": "serialize",
"name": "name: 'lat', length: 1,000,000, type: Float32",
"name": "name: 'lng', length: 1,000,000, type: Float32",
"name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'lat', length: 1,000,000, type: Float32",
"name": "name: 'lng', length: 1,000,000, type: Float32",
"name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'lat', length: 1,000,000, type: Float32",
"name": "name: 'lng', length: 1,000,000, type: Float32",
"name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'lat', length: 1,000,000, type: Float32",
"name": "name: 'lng', length: 1,000,000, type: Float32",
"name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
"name": "length: 1,000,000",
"name": "name: 'lat', length: 1,000,000, type: Float32, test: gt, value: 0",
"name": "name: 'lng', length: 1,000,000, type: Float32, test: gt, value: 0",
"name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle",
{code}
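To make the duplication concrete, here is a minimal sketch of how the repeated names could be counted. The helper findDuplicateNames and the sampleNames array are hypothetical, written only for illustration; they are not part of the Arrow codebase, and the sample is a short excerpt of the names listed above.

```typescript
// Hypothetical helper: count how often each benchmark name occurs
// and return only the names that appear more than once.
function findDuplicateNames(names: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const name of names) {
    counts.set(name, (counts.get(name) ?? 0) + 1);
  }
  // Keep only the entries whose count is greater than one.
  return new Map(Array.from(counts).filter(([, n]) => n > 1));
}

// A short excerpt of the names from the JSON output (illustrative only).
const sampleNames: string[] = [
  "Table.from",
  "name: 'lat', length: 1,000,000, type: Float32",
  "name: 'lng', length: 1,000,000, type: Float32",
  "name: 'lat', length: 1,000,000, type: Float32",
];

const duplicates = findDuplicateNames(sampleNames);
duplicates.forEach((count, name) => console.log(`${count}x ${name}`));
```

Because the suite name is missing from each entry, benchmarks from different suites that happen to share parameters collapse to the same name, which is what a check like this would surface.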
Yet I do see informative names in the code (like {{DataFrame Count By...}} and {{DataFrame Filter-Scan Count...}}):
- https://github.com/apache/arrow/blob/5ca16287a389afceabdd4b487d2e43e62745abcc/js/perf/index.ts#L124
- https://github.com/apache/arrow/blob/5ca16287a389afceabdd4b487d2e43e62745abcc/js/perf/index.ts#L114
Perhaps add the suite name? And make the values a JSON object rather than a single string of comma-separated values?
Something like this:
{code}
...
"name": "DataFrame Count By",
"values": {
  "column": "lng",
  "length": "1,000,000",
  "type": "Float32",
  "test": "gt",
  "value": "0"
}
...
{code}
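As a rough sketch of the proposal above, the current single-string name could be turned into a suite name plus a values object along these lines. The BenchmarkRecord interface and the toRecord helper are assumptions made for illustration and do not exist in the Arrow codebase; the sketch also assumes that parameter keys are lowercase words and that no value contains the sequence ", key: ".

```typescript
// Hypothetical shape for a benchmark entry: suite name plus structured values.
interface BenchmarkRecord {
  name: string;
  values: Record<string, string>;
}

// Hypothetical helper: parse the current comma-separated parameter string.
function toRecord(suite: string, params: string): BenchmarkRecord {
  const values: Record<string, string> = {};
  // Split only on ", " that is followed by "key: ", so the commas inside
  // "1,000,000" and "Dictionary<Int8, Utf8>" are left untouched.
  for (const part of params.split(/, (?=[a-z]+: )/)) {
    const sep = part.indexOf(": ");
    if (sep === -1) continue; // skip fragments that are not key/value pairs
    const key = part.slice(0, sep);
    // Drop the single quotes around column names such as 'lng'.
    const value = part.slice(sep + 2).replace(/^'(.*)'$/, "$1");
    // Rename the inner "name" key to "column" to avoid the double usage.
    values[key === "name" ? "column" : key] = value;
  }
  return { name: suite, values };
}

const record = toRecord(
  "DataFrame Count By",
  "name: 'lng', length: 1,000,000, type: Float32, test: gt, value: 0",
);
console.log(JSON.stringify(record, null, 2));
```

In practice it would likely be simpler to emit the structured object directly where the benchmark is defined rather than parse the string afterwards; the parsing step here only shows that the existing names carry enough information to fill the proposed fields.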
> [JS] Improve benchmark names
> ----------------------------
>
> Key: ARROW-13266
> URL: https://issues.apache.org/jira/browse/ARROW-13266
> Project: Apache Arrow
> Issue Type: Bug
> Components: JavaScript
> Reporter: Diana Clarke
> Assignee: Diana Clarke
> Priority: Minor
--
This message was sent by Atlassian Jira
(v8.3.4#803005)