Posted to dev@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2019/10/03 19:20:00 UTC

[jira] [Created] (ARROW-6786) [C++] arrow-dataset-file-parquet-test is slow

Antoine Pitrou created ARROW-6786:
-------------------------------------

             Summary: [C++] arrow-dataset-file-parquet-test is slow
                 Key: ARROW-6786
                 URL: https://issues.apache.org/jira/browse/ARROW-6786
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Antoine Pitrou


It takes about 15 seconds in debug mode (probably more with ASAN / UBSAN / etc.) to run 2 tests that simply iterate through a generated in-memory dataset:
{code}
$ ./build-test/debug/arrow-dataset-file-parquet-test 
Running main() from /home/conda/feedstock_root/build_artifacts/gtest_1551008230529/work/googletest/src/gtest_main.cc
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from TestParquetFileFormat
[ RUN      ] TestParquetFileFormat.ScanRecordBatchReader
[       OK ] TestParquetFileFormat.ScanRecordBatchReader (7338 ms)
[ RUN      ] TestParquetFileFormat.Inspect
[       OK ] TestParquetFileFormat.Inspect (6222 ms)
[----------] 2 tests from TestParquetFileFormat (13560 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (13560 ms total)
[  PASSED  ] 2 tests.
{code}

Unless the test is deliberately stressing something in particular, the number of repetitions or the batch size can probably be reduced dramatically.
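
To illustrate why those two knobs matter, here is a rough, self-contained sketch (not the actual test code). It assumes the test builds one batch and iterates over it a fixed number of times, so the work grows with batch size times repetitions. `FakeBatch`, `ScanParams` and `ScanGeneratedDataset` are hypothetical names made up for this example; they are not Arrow APIs, and the real test's parameters may be named and sized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a generated in-memory batch; not an Arrow type.
struct FakeBatch {
  std::vector<int32_t> values;
};

// Hypothetical knobs; the real test's names and values may differ.
struct ScanParams {
  int64_t batch_size;
  int64_t repetitions;
};

// Builds one batch of `batch_size` rows and walks it `repetitions` times.
// Returns the number of rows visited, i.e. batch_size * repetitions.
int64_t ScanGeneratedDataset(const ScanParams& params) {
  assert(params.batch_size >= 0 && params.repetitions >= 0);
  FakeBatch batch{std::vector<int32_t>(
      static_cast<std::size_t>(params.batch_size), 1)};
  int64_t rows_seen = 0;
  for (int64_t i = 0; i < params.repetitions; ++i) {
    for (int32_t v : batch.values) {
      rows_seen += v;  // each row contributes exactly 1
    }
  }
  return rows_seen;
}
```

Under that assumption, shrinking either parameter cuts the rows visited proportionally while the same iteration code path is still exercised, which is why a much smaller configuration would likely be enough for a plain functional check.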


