Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/15 16:52:08 UTC
[GitHub] [arrow] pitrou commented on pull request #14100: ARROW-4709: [C++] Optimize for ordered JSON fields
pitrou commented on PR #14100:
URL: https://github.com/apache/arrow/pull/14100#issuecomment-1315595870
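Right, it seems the speedup is relatively minor: roughly 11-16% on the ordered, non-sparse (sparsity:0) cases with 100 or 1000 fields, and less than 8% in either direction everywhere else. I get these results: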
```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (39)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:1000 141.204 MiB/sec 163.772 MiB/sec 15.983 {'family_index': 5, 'per_family_instance_index': 24, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 23, 'json_size': 4544425.0}
ParseJSONFields/ordered:1/schema:0/sparsity:0/num_fields:1000 137.627 MiB/sec 158.607 MiB/sec 15.244 {'family_index': 5, 'per_family_instance_index': 26, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:0/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 22, 'json_size': 4544425.0}
ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:100 149.699 MiB/sec 167.002 MiB/sec 11.559 {'family_index': 5, 'per_family_instance_index': 12, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 259, 'json_size': 424102.0}
ParseJSONFields/ordered:1/schema:0/sparsity:0/num_fields:100 146.855 MiB/sec 163.307 MiB/sec 11.202 {'family_index': 5, 'per_family_instance_index': 14, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:0/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 253, 'json_size': 424102.0}
ChunkJSONLineDelimited 97.748902 104.282934 6.685 {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'ChunkJSONLineDelimited', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 7071295, 'json_size': 150361.0}
ParseJSONBlockWithSchema 136.226 MiB/sec 138.375 MiB/sec 1.577 {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'ParseJSONBlockWithSchema', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 664, 'json_size': 150361.0}
ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:10 193.421 MiB/sec 195.560 MiB/sec 1.106 {'family_index': 5, 'per_family_instance_index': 0, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 291, 'json_size': 483895.0}
ParseJSONFields/ordered:1/schema:0/sparsity:0/num_fields:10 192.316 MiB/sec 194.350 MiB/sec 1.058 {'family_index': 5, 'per_family_instance_index': 2, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:0/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 287, 'json_size': 483895.0}
ParseJSONFields/ordered:0/schema:1/sparsity:0/num_fields:10 179.680 MiB/sec 180.962 MiB/sec 0.714 {'family_index': 5, 'per_family_instance_index': 1, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:0/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 272, 'json_size': 484344.0}
ReadJSONBlockWithSchemaSingleThread 117.169 MiB/sec 117.609 MiB/sec 0.376 {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'ReadJSONBlockWithSchemaSingleThread', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 6, 'json_size': 15026882.0}
ParseJSONFields/ordered:0/schema:1/sparsity:0/num_fields:100 143.278 MiB/sec 141.908 MiB/sec -0.957 {'family_index': 5, 'per_family_instance_index': 13, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:0/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 248, 'json_size': 424088.0}
ParseJSONFields/ordered:1/schema:1/sparsity:10/num_fields:100 144.507 MiB/sec 142.969 MiB/sec -1.064 {'family_index': 5, 'per_family_instance_index': 16, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:10/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 248, 'json_size': 425955.0}
ParseJSONFields/ordered:0/schema:1/sparsity:10/num_fields:100 139.161 MiB/sec 137.485 MiB/sec -1.204 {'family_index': 5, 'per_family_instance_index': 17, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:10/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 241, 'json_size': 422790.0}
ParseJSONFields/ordered:1/schema:0/sparsity:10/num_fields:10 184.568 MiB/sec 182.084 MiB/sec -1.346 {'family_index': 5, 'per_family_instance_index': 6, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:10/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 281, 'json_size': 482610.0}
ParseJSONFields/ordered:0/schema:0/sparsity:10/num_fields:100 135.707 MiB/sec 133.874 MiB/sec -1.351 {'family_index': 5, 'per_family_instance_index': 19, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:10/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 236, 'json_size': 422790.0}
ParseJSONFields/ordered:1/schema:1/sparsity:10/num_fields:10 190.419 MiB/sec 187.804 MiB/sec -1.373 {'family_index': 5, 'per_family_instance_index': 4, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:10/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 289, 'json_size': 482610.0}
ParseJSONFields/ordered:0/schema:1/sparsity:90/num_fields:10 63.273 MiB/sec 62.386 MiB/sec -1.402 {'family_index': 5, 'per_family_instance_index': 9, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:90/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 87, 'json_size': 530883.0}
ParseJSONFields/ordered:1/schema:1/sparsity:90/num_fields:10 62.917 MiB/sec 62.032 MiB/sec -1.406 {'family_index': 5, 'per_family_instance_index': 8, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:90/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 88, 'json_size': 524228.0}
ParseJSONFields/ordered:1/schema:0/sparsity:10/num_fields:100 141.608 MiB/sec 139.480 MiB/sec -1.503 {'family_index': 5, 'per_family_instance_index': 18, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:10/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 244, 'json_size': 425955.0}
ParseJSONFields/ordered:0/schema:0/sparsity:0/num_fields:100 140.454 MiB/sec 138.317 MiB/sec -1.521 {'family_index': 5, 'per_family_instance_index': 15, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:0/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 244, 'json_size': 424088.0}
ParseJSONFields/ordered:0/schema:1/sparsity:10/num_fields:10 177.692 MiB/sec 174.088 MiB/sec -2.028 {'family_index': 5, 'per_family_instance_index': 5, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:10/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 268, 'json_size': 485740.0}
ParseJSONFields/ordered:1/schema:1/sparsity:10/num_fields:1000 135.271 MiB/sec 132.217 MiB/sec -2.258 {'family_index': 5, 'per_family_instance_index': 28, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:10/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 24, 'json_size': 4085536.0}
ParseJSONFields/ordered:0/schema:0/sparsity:0/num_fields:10 180.369 MiB/sec 176.281 MiB/sec -2.267 {'family_index': 5, 'per_family_instance_index': 3, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:0/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 274, 'json_size': 484344.0}
ParseJSONFields/ordered:0/schema:0/sparsity:90/num_fields:100 48.815 MiB/sec 47.630 MiB/sec -2.427 {'family_index': 5, 'per_family_instance_index': 23, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:90/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 84, 'json_size': 425534.0}
ParseJSONFields/ordered:0/schema:0/sparsity:10/num_fields:10 173.820 MiB/sec 169.515 MiB/sec -2.477 {'family_index': 5, 'per_family_instance_index': 7, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:10/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 262, 'json_size': 485740.0}
ChunkJSONPrettyPrinted 305.981 MiB/sec 298.223 MiB/sec -2.536 {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'ChunkJSONPrettyPrinted', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1033, 'json_size': 215361.0}
ParseJSONFields/ordered:1/schema:1/sparsity:90/num_fields:100 49.710 MiB/sec 48.375 MiB/sec -2.685 {'family_index': 5, 'per_family_instance_index': 20, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:90/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 84, 'json_size': 430278.0}
ParseJSONFields/ordered:0/schema:1/sparsity:90/num_fields:100 49.334 MiB/sec 47.915 MiB/sec -2.878 {'family_index': 5, 'per_family_instance_index': 21, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:90/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 85, 'json_size': 425534.0}
ParseJSONFields/ordered:0/schema:1/sparsity:10/num_fields:1000 130.316 MiB/sec 126.470 MiB/sec -2.951 {'family_index': 5, 'per_family_instance_index': 29, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:10/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 23, 'json_size': 4088946.0}
ParseJSONFields/ordered:0/schema:1/sparsity:0/num_fields:1000 134.779 MiB/sec 130.799 MiB/sec -2.953 {'family_index': 5, 'per_family_instance_index': 25, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:0/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 22, 'json_size': 4546025.0}
ParseJSONFields/ordered:1/schema:0/sparsity:90/num_fields:100 49.652 MiB/sec 48.147 MiB/sec -3.031 {'family_index': 5, 'per_family_instance_index': 22, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:90/num_fields:100', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 85, 'json_size': 430278.0}
ParseJSONFields/ordered:0/schema:0/sparsity:90/num_fields:10 62.658 MiB/sec 60.611 MiB/sec -3.267 {'family_index': 5, 'per_family_instance_index': 11, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:90/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 86, 'json_size': 530883.0}
ParseJSONFields/ordered:1/schema:0/sparsity:10/num_fields:1000 131.295 MiB/sec 126.847 MiB/sec -3.388 {'family_index': 5, 'per_family_instance_index': 30, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:10/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 24, 'json_size': 4085536.0}
ParseJSONFields/ordered:0/schema:0/sparsity:10/num_fields:1000 126.625 MiB/sec 121.335 MiB/sec -4.178 {'family_index': 5, 'per_family_instance_index': 31, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:10/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 23, 'json_size': 4088946.0}
ParseJSONFields/ordered:1/schema:0/sparsity:90/num_fields:1000 43.432 MiB/sec 41.614 MiB/sec -4.187 {'family_index': 5, 'per_family_instance_index': 34, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:90/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 70, 'json_size': 457089.0}
ParseJSONFields/ordered:0/schema:0/sparsity:0/num_fields:1000 131.544 MiB/sec 125.765 MiB/sec -4.393 {'family_index': 5, 'per_family_instance_index': 27, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:0/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 21, 'json_size': 4546025.0}
ParseJSONFields/ordered:0/schema:1/sparsity:90/num_fields:1000 43.140 MiB/sec 41.204 MiB/sec -4.488 {'family_index': 5, 'per_family_instance_index': 33, 'run_name': 'ParseJSONFields/ordered:0/schema:1/sparsity:90/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 70, 'json_size': 454665.0}
ParseJSONFields/ordered:1/schema:1/sparsity:90/num_fields:1000 43.505 MiB/sec 41.523 MiB/sec -4.556 {'family_index': 5, 'per_family_instance_index': 32, 'run_name': 'ParseJSONFields/ordered:1/schema:1/sparsity:90/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 70, 'json_size': 457089.0}
ParseJSONFields/ordered:0/schema:0/sparsity:90/num_fields:1000 43.536 MiB/sec 41.527 MiB/sec -4.615 {'family_index': 5, 'per_family_instance_index': 35, 'run_name': 'ParseJSONFields/ordered:0/schema:0/sparsity:90/num_fields:1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 70, 'json_size': 454665.0}
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (2)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
ParseJSONFields/ordered:1/schema:0/sparsity:90/num_fields:10 63.104 MiB/sec 60.157 MiB/sec -4.670 {'family_index': 5, 'per_family_instance_index': 10, 'run_name': 'ParseJSONFields/ordered:1/schema:0/sparsity:90/num_fields:10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 88, 'json_size': 524228.0}
ReadJSONBlockWithSchemaMultiThread/real_time 752.152 MiB/sec 695.624 MiB/sec -7.515 {'family_index': 4, 'per_family_instance_index': 0, 'run_name': 'ReadJSONBlockWithSchemaMultiThread/real_time', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 36, 'json_size': 15026882.0}
```
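For anyone reading the table, the "change %" column appears to be the relative difference between contender and baseline throughput. The sketch below is only an illustration under that assumption: the helper name `percent_change` is hypothetical and is not taken from the benchmark tooling, and the two rows are copied from the output above. Because the table shows rounded throughputs, the recomputed values only match the printed ones to within rounding.

```python
# Assumption: "change %" = (contender - baseline) / baseline * 100.
# `percent_change` is a hypothetical helper, not part of any Arrow tool.

def percent_change(baseline: float, contender: float) -> float:
    """Relative change of contender versus baseline, in percent."""
    return (contender - baseline) / baseline * 100.0


# (benchmark name, baseline MiB/sec, contender MiB/sec), copied from the table.
rows = [
    ("ParseJSONFields/ordered:1/schema:1/sparsity:0/num_fields:1000", 141.204, 163.772),
    ("ReadJSONBlockWithSchemaMultiThread/real_time", 752.152, 695.624),
]

for name, baseline, contender in rows:
    print(f"{name}: {percent_change(baseline, contender):+.3f} %")
```

Positive values mean the contender has higher throughput than the baseline; negative values mean it is slower.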