You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "felipecrv (via GitHub)" <gi...@apache.org> on 2023/06/29 23:08:52 UTC

[GitHub] [arrow] felipecrv opened a new pull request, #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

felipecrv opened a new pull request, #36401:
URL: https://github.com/apache/arrow/pull/36401

   ### Rationale for this change
   
   `arrow-compute-vector-test` is too big and takes a long time to run because of that.
   
   ### What changes are included in this PR?
   
   Extracting two tests.
   
   Timings on my machine (Debug builds with ASAN).
   
   ```
   debug/arrow-compute-vector-test > /dev/null  11.54s user 0.47s system 99% cpu 12.023 total
   debug/arrow-compute-vector-sort-test > /dev/null  13.30s user 0.26s system 99% cpu 13.579 total
   debug/arrow-compute-vector-selection-test > /dev/null  6.97s user 0.22s system 99% cpu 7.207 total
   ```
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1614316070

   Nice timings here too:
   ```
   /build/build-test ~/arrow/dev/cpp
   Test project /build/build-test
       Start 26: arrow-compute-vector-sort-test
       Start 25: arrow-compute-vector-test
       Start 27: arrow-compute-vector-selection-test
   1/3 Test #27: arrow-compute-vector-selection-test ...   Passed    2.94 sec
   2/3 Test #25: arrow-compute-vector-test .............   Passed    4.84 sec
   3/3 Test #26: arrow-compute-vector-sort-test ........   Passed    6.15 sec
   
   100% tests passed, 0 tests failed out of 3
   
   Label Time Summary:
   arrow_compute    =  13.94 sec*proc (3 tests)
   unittest         =  13.94 sec*proc (3 tests)
   
   Total Test time (real) =   6.15 sec
   ~/arrow/dev/cpp
   
   real	0m6,166s
   user	0m13,900s
   sys	0m0,092s
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on code in PR #36401:
URL: https://github.com/apache/arrow/pull/36401#discussion_r1251136740


##########
cpp/src/arrow/compute/kernels/CMakeLists.txt:
##########
@@ -72,12 +72,20 @@ add_arrow_compute_test(vector_test
                        vector_hash_test.cc
                        vector_nested_test.cc
                        vector_replace_test.cc
-                       vector_selection_test.cc
-                       vector_sort_test.cc
                        vector_run_end_encode_test.cc
                        select_k_test.cc
                        test_util.cc)
 
+add_arrow_compute_test(vector_sort_test
+                       SOURCES
+                       vector_sort_test.cc
+                       test_util.cc)
+
+add_arrow_compute_test(vector_selection_test
+                       SOURCES
+                       vector_selection_test.cc
+                       test_util.cc)

Review Comment:
   That's possible. We can revisit later :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou merged pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou merged PR #36401:
URL: https://github.com/apache/arrow/pull/36401


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] felipecrv commented on a diff in pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "felipecrv (via GitHub)" <gi...@apache.org>.
felipecrv commented on code in PR #36401:
URL: https://github.com/apache/arrow/pull/36401#discussion_r1247915334


##########
cpp/src/arrow/compute/kernels/CMakeLists.txt:
##########
@@ -72,12 +72,20 @@ add_arrow_compute_test(vector_test
                        vector_hash_test.cc
                        vector_nested_test.cc
                        vector_replace_test.cc
-                       vector_selection_test.cc
-                       vector_sort_test.cc
                        vector_run_end_encode_test.cc
                        select_k_test.cc
                        test_util.cc)
 
+add_arrow_compute_test(vector_sort_test
+                       SOURCES
+                       vector_sort_test.cc
+                       test_util.cc)
+
+add_arrow_compute_test(vector_selection_test
+                       SOURCES
+                       vector_selection_test.cc
+                       test_util.cc)

Review Comment:
   I was tempted to do it, but I was afraid it would make things worse because `test_util.cc` is so small.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1618913193

   The CI failures seem unrelated, I'll merge. Thanks @felipecrv !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1614311986

   Revision: 021fb90c83d3f40846cc3062e4e6162927ab07ef
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-ccd2ce168e](https://github.com/ursacomputing/crossbow/branches/all?query=actions-ccd2ce168e)
   
   |Task|Status|
   |----|------|
   |test-alpine-linux-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-alpine-linux-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/5420867351/jobs/9855562073)|
   |test-build-cpp-fuzz|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-build-cpp-fuzz)](https://github.com/ursacomputing/crossbow/actions/runs/5420868172/jobs/9855563973)|
   |test-conda-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-conda-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/5420866822/jobs/9855561052)|
   |test-conda-cpp-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-ccd2ce168e-azure-test-conda-cpp-valgrind)](https://github.com/ursacomputing/crossbow/runs/14680220879)|
   |test-cuda-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-cuda-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/5420869845/jobs/9855568022)|
   |test-debian-11-cpp-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-debian-11-cpp-amd64)](https://github.com/ursacomputing/crossbow/actions/runs/5420867543/jobs/9855562574)|
   |test-debian-11-cpp-i386|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-debian-11-cpp-i386)](https://github.com/ursacomputing/crossbow/actions/runs/5420869585/jobs/9855567472)|
   |test-fedora-35-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-fedora-35-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/5420867071/jobs/9855561546)|
   |test-ubuntu-20.04-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-ubuntu-20.04-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/5420869353/jobs/9855567062)|
   |test-ubuntu-20.04-cpp-20|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-ubuntu-20.04-cpp-20)](https://github.com/ursacomputing/crossbow/actions/runs/5420868541/jobs/9855564909)|
   |test-ubuntu-20.04-cpp-bundled|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-ubuntu-20.04-cpp-bundled)](https://github.com/ursacomputing/crossbow/actions/runs/5420868921/jobs/9855565855)|
   |test-ubuntu-20.04-cpp-minimal-with-formats|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-ubuntu-20.04-cpp-minimal-with-formats)](https://github.com/ursacomputing/crossbow/actions/runs/5420867995/jobs/9855563562)|
   |test-ubuntu-20.04-cpp-thread-sanitizer|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-ubuntu-20.04-cpp-thread-sanitizer)](https://github.com/ursacomputing/crossbow/actions/runs/5420868724/jobs/9855565392)|
   |test-ubuntu-22.04-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-ccd2ce168e-github-test-ubuntu-22.04-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/5420869149/jobs/9855566357)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on code in PR #36401:
URL: https://github.com/apache/arrow/pull/36401#discussion_r1247588520


##########
cpp/src/arrow/compute/kernels/CMakeLists.txt:
##########
@@ -72,12 +72,20 @@ add_arrow_compute_test(vector_test
                        vector_hash_test.cc
                        vector_nested_test.cc
                        vector_replace_test.cc
-                       vector_selection_test.cc
-                       vector_sort_test.cc
                        vector_run_end_encode_test.cc
                        select_k_test.cc
                        test_util.cc)
 
+add_arrow_compute_test(vector_sort_test
+                       SOURCES
+                       vector_sort_test.cc
+                       test_util.cc)
+
+add_arrow_compute_test(vector_selection_test
+                       SOURCES
+                       vector_selection_test.cc
+                       test_util.cc)

Review Comment:
   At some point we might want to create an object library for `test_util.cc` instead of recompiling it for each target. Can be left as a subsequent PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] felipecrv commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "felipecrv (via GitHub)" <gi...@apache.org>.
felipecrv commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1614716861

   > I think you need to run cmake-format for the lint step to pass... or apply the following patch:
   
   I ran the CMake linter locally and was so confused: it only says the checks doesn't pass.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] conbench-apache-arrow[bot] commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "conbench-apache-arrow[bot] (via GitHub)" <gi...@apache.org>.
conbench-apache-arrow[bot] commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1625070355

   Conbench analyzed the 6 benchmark runs on commit `1bfa241b`.
   
   There were 9 benchmark results indicating a performance regression:
   
   - Commit Run on `ursa-i9-9960x` at [2023-07-07 05:54:58Z](http://conbench.ursa.dev/compare/runs/947fafe129994956be7b8d9dae95fdfe...36494bee8cb447d28d4502dcd131a7ed/)
     - [engine=arrow, format=native, language=R, memory_map=False, query_id=TPCH-16, scale_factor=1](http://conbench.ursa.dev/compare/benchmarks/064a796486ac7b5d800088b779b1ccb3...064a7c6e84c370128000ef68b09f9e61)
   
   - Commit Run on `arm64-m6g-linux-compute` at [2023-07-03 19:09:54Z](http://conbench.ursa.dev/compare/runs/e40fb8ec8081434aa103d0520a361c26...f32fea2fa1354a8eb859809ae3db8820/)
     - [params=simple_expression/batch_size:10000/null_prob:0/bool_true_prob:50/real_time, source=cpp-micro, suite=arrow-acero-filter-benchmark](http://conbench.ursa.dev/compare/benchmarks/064a2db4dd7473198000c18e4ea4e914...064a31d68b747dc88000a588c49e57e6)
   - and 7 more (see the report linked below)
   
   The [full Conbench report](https://github.com/apache/arrow/runs/14852537434) has more details.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1614308649

   @github-actions crossbow submit -g cpp


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #36401: GH-35636: [C++] Extract two expensive test suites from compute-vector-test

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on PR #36401:
URL: https://github.com/apache/arrow/pull/36401#issuecomment-1614314865

   I think you need to run cmake-format for the lint step to pass... or apply the following patch:
   ```diff
   diff --git a/cpp/src/arrow/compute/kernels/CMakeLists.txt b/cpp/src/arrow/compute/kernels/CMakeLists.txt
   index a84b36a52..9084f279a 100644
   --- a/cpp/src/arrow/compute/kernels/CMakeLists.txt
   +++ b/cpp/src/arrow/compute/kernels/CMakeLists.txt
   @@ -76,14 +76,9 @@ add_arrow_compute_test(vector_test
                           select_k_test.cc
                           test_util.cc)
    
   -add_arrow_compute_test(vector_sort_test
   -                       SOURCES
   -                       vector_sort_test.cc
   -                       test_util.cc)
   +add_arrow_compute_test(vector_sort_test SOURCES vector_sort_test.cc test_util.cc)
    
   -add_arrow_compute_test(vector_selection_test
   -                       SOURCES
   -                       vector_selection_test.cc
   +add_arrow_compute_test(vector_selection_test SOURCES vector_selection_test.cc
                           test_util.cc)
    
    add_arrow_benchmark(vector_hash_benchmark PREFIX "arrow-compute")
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org