Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/02/10 10:41:00 UTC

[jira] [Commented] (IMPALA-6636) Use async IO in ORC scanner

    [ https://issues.apache.org/jira/browse/IMPALA-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490113#comment-17490113 ] 

ASF subversion and git services commented on IMPALA-6636:
---------------------------------------------------------

Commit 97dda2b27da99367f4d07699aa046b16cda16dd4 in impala's branch refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=97dda2b ]

IMPALA-6636: Use async IO in ORC scanner

This patch implements async IO in the ORC scanner. For each ORC stripe,
we begin by iterating over the column streams. If a column stream is
eligible for async IO, we create a ColumnRange, register a
ScannerContext::Stream for that ORC stream, and start the stream. We
modify HdfsOrcScanner::ScanRangeInputStream::read to check whether there
is a matching ColumnRange for the given offset and length. If so, the
read continues through HdfsOrcScanner::ColumnRange::read.
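
The read dispatch described above can be modelled roughly as follows.
This is a minimal Python sketch, not the Impala C++ code: the
ColumnRange fields, the "request lies fully inside the range" matching
rule, and the fallback to a direct read are assumptions made for
illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnRange:
    """Hypothetical stand-in for the prefetched byte range of one ORC stream."""
    offset: int   # file offset where the stream starts
    length: int   # number of bytes in the stream
    data: bytes   # bytes assumed to have been fetched asynchronously

    def contains(self, offset: int, length: int) -> bool:
        # Assumed matching rule: the request must lie fully inside this range.
        return self.offset <= offset and offset + length <= self.offset + self.length

    def read(self, offset: int, length: int) -> bytes:
        # Translate the file offset into an index into the prefetched data.
        start = offset - self.offset
        return self.data[start:start + length]


def find_column_range(ranges: List[ColumnRange], offset: int,
                      length: int) -> Optional[ColumnRange]:
    """Return the first range that covers the request, or None."""
    for column_range in ranges:
        if column_range.contains(offset, length):
            return column_range
    return None


def read(ranges: List[ColumnRange], file_bytes: bytes, offset: int,
         length: int) -> bytes:
    """Serve from a matching ColumnRange if one exists, else read directly."""
    column_range = find_column_range(ranges, offset, length)
    if column_range is not None:
        return column_range.read(offset, length)
    # Assumed fallback: a plain synchronous read of the requested bytes.
    return file_bytes[offset:offset + length]
```

A request that falls partly outside every registered range finds no
match in this model and takes the fallback path.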

We leverage the existing async IO methods from the HdfsParquetScanner
class for initial memory allocations. We moved related methods such as
DivideReservationBetweenColumns and ComputeIdealReservation up to the
HdfsColumnarScanner class.

The planner calculates the memory reservation differently for async
Parquet and async ORC. In async Parquet, the planner calculates the
column memory reservation and relies on the backend to divide it as
needed. In async ORC, the planner needs to split the column's memory
reservation based on the estimated number of streams for that column
type. For example, a string column with a 4MB memory estimate will need
to split that estimate into four 1MB reservations because it might use
dictionary encoding with four streams (PRESENT, DATA, DICTIONARY_DATA,
and LENGTH). This splitting is required because each async IO stream
needs to start with an 8KB (min_buffer_size) initial memory reservation.
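
The split in the string-column example can be worked through with a
small Python sketch. It is illustrative only and is not the planner
code: split_reservation is a hypothetical helper, and raising each
share to at least 8KB is an assumption about how a small estimate would
be handled.

```python
MIN_BUFFER_SIZE = 8 * 1024  # 8KB, the min_buffer_size named above

# Streams a dictionary-encoded string column might use, per the text above.
STRING_STREAMS = ["PRESENT", "DATA", "DICTIONARY_DATA", "LENGTH"]


def split_reservation(column_estimate_bytes, stream_names):
    """Divide a column's memory estimate evenly across its streams.

    Hypothetical helper: each stream gets an equal share, raised to
    MIN_BUFFER_SIZE if the share would be smaller (an assumption).
    """
    share = column_estimate_bytes // len(stream_names)
    share = max(share, MIN_BUFFER_SIZE)
    return {name: share for name in stream_names}


if __name__ == "__main__":
    # A 4MB string column estimate split across four streams.
    for name, size in split_reservation(4 * 1024 * 1024, STRING_STREAMS).items():
        print(name, size)
```

With a 4MB estimate and four streams, each stream receives 1MB, which
matches the example in the commit message.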

To show the improvement from ORC async IO, we contrast the total time
and geomean (in milliseconds) to run full TPC-DS 10 TB, 19 executors,
with varying ORC_ASYNC_READ and DISABLE_DATA_CACHE options as follows:

+----------------------+------------------+------------------+
| Total time           | ORC_ASYNC_READ=0 | ORC_ASYNC_READ=1 |
+----------------------+------------------+------------------+
| DISABLE_DATA_CACHE=0 |          3511075 |          3484736 |
| DISABLE_DATA_CACHE=1 |          5243337 |          4370095 |
+----------------------+------------------+------------------+

+----------------------+------------------+------------------+
| Geomean              | ORC_ASYNC_READ=0 | ORC_ASYNC_READ=1 |
+----------------------+------------------+------------------+
| DISABLE_DATA_CACHE=0 |      12786.58042 |      12454.80365 |
| DISABLE_DATA_CACHE=1 |      23081.10888 |      16692.31512 |
+----------------------+------------------+------------------+
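
The relative improvement implied by the two tables can be computed with
a short Python sketch. The timings are copied from the tables;
percent_reduction and async_read_gain are hypothetical helper names
introduced here for illustration.

```python
# Timings in milliseconds, copied from the two tables above.
# Keys are (DISABLE_DATA_CACHE, ORC_ASYNC_READ).
TOTAL_MS = {
    (0, 0): 3511075, (0, 1): 3484736,
    (1, 0): 5243337, (1, 1): 4370095,
}
GEOMEAN_MS = {
    (0, 0): 12786.58042, (0, 1): 12454.80365,
    (1, 0): 23081.10888, (1, 1): 16692.31512,
}


def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


def async_read_gain(table, disable_data_cache):
    """Reduction from ORC_ASYNC_READ=0 to ORC_ASYNC_READ=1 for one cache setting."""
    return percent_reduction(table[(disable_data_cache, 0)],
                             table[(disable_data_cache, 1)])


if __name__ == "__main__":
    for cache_flag in (0, 1):
        print("DISABLE_DATA_CACHE=%d total: %.2f%% geomean: %.2f%%" % (
            cache_flag,
            async_read_gain(TOTAL_MS, cache_flag),
            async_read_gain(GEOMEAN_MS, cache_flag)))
```

On these figures, ORC_ASYNC_READ=1 reduces total time by roughly 0.75%
with the data cache enabled and by about 16.7% with it disabled; the
geomean drops by about 2.6% and 27.7% respectively.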

Testing:
- Pass core tests.
- Pass core e2e tests with ORC_ASYNC_READ=1.

Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074
Reviewed-on: http://gerrit.cloudera.org:8080/15370
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Use async IO in ORC scanner
> ---------------------------
>
>                 Key: IMPALA-6636
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6636
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Riza Suminto
>            Priority: Critical
>
> Though ORC-262 has made no progress, we can still prefetch data and let the ORC lib read from an in-memory InputStream.


