Posted to issues@arrow.apache.org by "pitrou (via GitHub)" <gi...@apache.org> on 2023/06/20 15:06:07 UTC

[GitHub] [arrow] pitrou opened a new issue, #36176: [C++] Large regression in sort performance

pitrou opened a new issue, #36176:
URL: https://github.com/apache/arrow/issues/36176

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Following GH-33206, a large performance regression appeared in single-key table sorting:
   ```
        ArraySortIndicesInt64Narrow/32768/10000      2.683 GiB/sec     2.165 GiB/sec   -19.296                                   {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'ArraySortIndicesInt64Narrow/32768/10000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 59563, 'null_percent': 0.01}
   TableSortIndicesInt64Narrow/1048576/100/1/32  16.175M items/sec 10.063M items/sec   -37.786 {'family_index': 12, 'per_family_instance_index': 9, 'run_name': 'TableSortIndicesInt64Narrow/1048576/100/1/32', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 11, 'chunks': 32.0, 'columns': 1.0, 'null_percent': 1.0}
     TableSortIndicesInt64Narrow/1048576/4/1/32  19.455M items/sec 11.938M items/sec   -38.639 {'family_index': 12, 'per_family_instance_index': 10, 'run_name': 'TableSortIndicesInt64Narrow/1048576/4/1/32', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 13, 'chunks': 32.0, 'columns': 1.0, 'null_percent': 25.0}
     TableSortIndicesInt64Narrow/1048576/0/1/32  18.017M items/sec 10.212M items/sec   -43.320  {'family_index': 12, 'per_family_instance_index': 11, 'run_name': 'TableSortIndicesInt64Narrow/1048576/0/1/32', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 12, 'chunks': 32.0, 'columns': 1.0, 'null_percent': 0.0}
      TableSortIndicesInt64Narrow/1048576/4/1/4  35.396M items/sec 15.609M items/sec   -55.901   {'family_index': 12, 'per_family_instance_index': 22, 'run_name': 'TableSortIndicesInt64Narrow/1048576/4/1/4', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 25, 'chunks': 4.0, 'columns': 1.0, 'null_percent': 25.0}
    TableSortIndicesInt64Narrow/1048576/100/1/4  34.966M items/sec 12.808M items/sec   -63.370  {'family_index': 12, 'per_family_instance_index': 21, 'run_name': 'TableSortIndicesInt64Narrow/1048576/100/1/4', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 23, 'chunks': 4.0, 'columns': 1.0, 'null_percent': 1.0}
      TableSortIndicesInt64Narrow/1048576/0/1/4  36.230M items/sec 13.005M items/sec   -64.103    {'family_index': 12, 'per_family_instance_index': 23, 'run_name': 'TableSortIndicesInt64Narrow/1048576/0/1/4', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 24, 'chunks': 4.0, 'columns': 1.0, 'null_percent': 0.0}
      TableSortIndicesInt64Narrow/1048576/4/1/1  87.635M items/sec 20.321M items/sec   -76.811   {'family_index': 12, 'per_family_instance_index': 34, 'run_name': 'TableSortIndicesInt64Narrow/1048576/4/1/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 58, 'chunks': 1.0, 'columns': 1.0, 'null_percent': 25.0}
    TableSortIndicesInt64Narrow/1048576/100/1/1 151.551M items/sec 15.919M items/sec   -89.496  {'family_index': 12, 'per_family_instance_index': 33, 'run_name': 'TableSortIndicesInt64Narrow/1048576/100/1/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 99, 'chunks': 1.0, 'columns': 1.0, 'null_percent': 1.0}
      TableSortIndicesInt64Narrow/1048576/0/1/1 177.335M items/sec 16.015M items/sec   -90.969   {'family_index': 12, 'per_family_instance_index': 35, 'run_name': 'TableSortIndicesInt64Narrow/1048576/0/1/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 117, 'chunks': 1.0, 'columns': 1.0, 'null_percent': 0.0}
   ```
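
The change column above can be cross-checked by hand. This is a minimal sketch, assuming the first throughput column is the baseline, the second is the contender, and the change is the relative difference in percent; `PercentChange` is a hypothetical helper, not part of the benchmark tooling. Because the printed throughputs are rounded, the result only matches the table approximately.

```cpp
// Relative change in percent between two throughput measurements.
// Assumption: `baseline` is the first numeric column of the table and
// `contender` the second; a negative result means a slowdown.
double PercentChange(double baseline, double contender) {
  return (contender - baseline) / baseline * 100.0;
}
```

For the last row, `PercentChange(177.335, 16.015)` gives roughly -90.97, in line with the reported -90.969: the single-chunk, no-null case runs at about one eleventh of its former throughput.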
   
   
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on issue #36176: [C++] Large regression in sort performance

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on issue #36176:
URL: https://github.com/apache/arrow/issues/36176#issuecomment-1598977300

   @benibus Can you take a look? I expect the cause to be quite simple actually.




[GitHub] [arrow] benibus commented on issue #36176: [C++] Large regression in single-key table sort performance

Posted by "benibus (via GitHub)" <gi...@apache.org>.
benibus commented on issue #36176:
URL: https://github.com/apache/arrow/issues/36176#issuecomment-1599013503

   Yeah... the single key path for tables (but not batches) was removed entirely - so that makes sense. I'll re-add it.
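
To illustrate why a missing single-key path can cost this much, here is a minimal sketch in plain C++. It does not use Arrow's actual classes: `Column`, `SortIndicesSingleKey`, `SortIndicesMultiKey` and `SortIndices` are hypothetical names, and the real kernels are considerably more involved (nulls, chunking, sort order). The point is only that a dispatcher can route one-key sorts to a tight comparison instead of the generic key-by-key loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one int64 column of a table.
using Column = std::vector<std::int64_t>;

// Fast path: a single key, so each comparison is one direct value compare.
std::vector<std::uint64_t> SortIndicesSingleKey(const Column& key) {
  std::vector<std::uint64_t> indices(key.size());
  std::iota(indices.begin(), indices.end(), std::uint64_t{0});
  std::stable_sort(indices.begin(), indices.end(),
                   [&key](std::uint64_t a, std::uint64_t b) {
                     return key[a] < key[b];
                   });
  return indices;
}

// General path: every comparison walks the list of keys until one differs.
std::vector<std::uint64_t> SortIndicesMultiKey(
    const std::vector<const Column*>& keys) {
  const std::size_t num_rows = keys.empty() ? 0 : keys[0]->size();
  std::vector<std::uint64_t> indices(num_rows);
  std::iota(indices.begin(), indices.end(), std::uint64_t{0});
  std::stable_sort(indices.begin(), indices.end(),
                   [&keys](std::uint64_t a, std::uint64_t b) {
                     for (const Column* key : keys) {
                       if ((*key)[a] != (*key)[b]) {
                         return (*key)[a] < (*key)[b];
                       }
                     }
                     return false;  // all keys equal: keep original order
                   });
  return indices;
}

// Dispatcher: without the size() == 1 branch, one-key sorts would pay for
// the generic machinery on every comparison.
std::vector<std::uint64_t> SortIndices(const std::vector<const Column*>& keys) {
  if (keys.size() == 1) {
    return SortIndicesSingleKey(*keys[0]);
  }
  return SortIndicesMultiKey(keys);
}
```

Both paths return the same indices for a one-key input; the only difference is how much work each comparison does.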




[GitHub] [arrow] pitrou closed issue #36176: [C++] Large regression in single-key table sort performance

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou closed issue #36176: [C++] Large regression in single-key table sort performance
URL: https://github.com/apache/arrow/issues/36176

