Posted to commits@arrow.apache.org by we...@apache.org on 2021/07/24 05:04:18 UTC

[arrow] branch master updated (91b751b -> 03533fe)

This is an automated email from the ASF dual-hosted git repository.

westonpace pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


    from 91b751b  ARROW-13440: Added a basic mapping generator that does not queue incoming jobs.  This allows it to forward async-reentrant pressure to the source.  Fixed some issues in the CSV reader that were preventing it from running truly parallel.  Performance is now significantly better but still not quite the same as the threaded reader.  For the NY taxi dataset the streaming read time went from ~7 seconds to ~1.6 seconds.  However, the file reader is still at ~0.8 seconds.  I'll [...]
     add 03533fe  Revert "ARROW-13440: Added a basic mapping generator that does not queue incoming jobs.  This allows it to forward async-reentrant pressure to the source.  Fixed some issues in the CSV reader that were preventing it from running truly parallel.  Performance is now significantly better but still not quite the same as the threaded reader.  For the NY taxi dataset the streaming read time went from ~7 seconds to ~1.6 seconds.  However, the file reader is still at ~0.8 second [...]

No new revisions were added by this update.

Summary of changes:
 cpp/src/arrow/csv/reader.cc                |  17 +---
 cpp/src/arrow/testing/gtest_util.cc        |  27 ++---
 cpp/src/arrow/testing/gtest_util.h         |   3 -
 cpp/src/arrow/util/async_generator.h       |  90 ++---------------
 cpp/src/arrow/util/async_generator_test.cc | 157 +++++++----------------------
 5 files changed, 59 insertions(+), 235 deletions(-)