Posted to issues@impala.apache.org by "Michael Smith (Jira)" <ji...@apache.org> on 2022/08/11 17:48:00 UTC

[jira] [Resolved] (IMPALA-11448) Always assign Ozone I/O to remote thread group to improve performance

     [ https://issues.apache.org/jira/browse/IMPALA-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Smith resolved IMPALA-11448.
------------------------------------
    Resolution: Fixed

This is fixed by IMPALA-11457.

> Always assign Ozone I/O to remote thread group to improve performance
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-11448
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11448
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.0.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: Screen Shot 2022-07-20 at 10.35.24 AM.png, Screen Shot 2022-07-20 at 11.21.52 AM.png
>
>
> IMPALA-9400 added initial support for Ozone (o3fs/ofs) by treating all Ozone I/O as remote, which is a valid assumption.
> However, Impala's internal logic assigns the I/O to a single local disk I/O thread, severely limiting I/O parallelism. This is evident when running a debug build, which fails at the following check:
> {noformat}
> Log file created at: 2022/07/18 18:15:02
> Running on machine: rhel05.ozone.cisco.local
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0718 18:15:02.269232 105827 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000030] Check failed: !IsOzonePath(file)
> F0718 18:15:02.269235 105832 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000003] Check failed: !IsOzonePath(file)
> F0718 18:15:02.269273 105834 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000014] Check failed: !IsOzonePath(file)
> {noformat}
> The is_remote parameter of a scan range is always false for Ozone:
> {noformat}
> TScanRangeParams {
>   01: scan_range (struct) = TScanRange {
>     01: hdfs_file_split (struct) = THdfsFileSplit {
>       01: relative_path (string) = "base_1/2b4595d335caddf5-4c9efd320000000e_1114488364_data.0.parq",
>       02: offset (i64) = 0,
>       03: length (i64) = 15982993,
>       04: partition_id (i64) = 172,
>       05: file_length (i64) = 15982993,
>       06: file_compression (i32) = 0,
>       07: mtime (i64) = 1657968669139,
>       08: is_erasure_coded (bool) = false,
>       09: partition_path_hash (i32) = -343315716,
>     },
>   },
>   02: volume_id (i32) = 65535,
>   03: try_hdfs_cache (bool) = false,
>   04: is_remote (bool) = false,
> {noformat}
> Because Ozone does not yet have short-circuit read support, I think a quick fix is to always force Ozone I/O onto the remote I/O thread group assigned to it.
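
The quick fix proposed above could look roughly like the sketch below. It is not the actual Impala code: the `IoQueue` enum, the `AssignQueue` signature, and the prefix-based `IsOzonePath` stand-in are all assumptions made for illustration. The only point it shows is that the Ozone check comes first, so the `is_remote` flag (always false for Ozone, as in the scan range above) can no longer route the read to a local disk queue.

```cpp
#include <cassert>
#include <string>

// Hypothetical queue identifiers. The real DiskIoMgr uses a different layout.
enum class IoQueue { kLocalDisk, kRemoteOzone, kRemoteOther };

// Returns true if `s` begins with `prefix`.
static bool StartsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Simplified stand-in for the IsOzonePath check named in the log above.
// Assumption: an Ozone path is recognized by its o3fs:// or ofs:// scheme.
bool IsOzonePath(const std::string& path) {
  return StartsWith(path, "o3fs://") || StartsWith(path, "ofs://");
}

// Picks the I/O queue for a scan range (hypothetical signature).
// Ozone paths go to the Ozone remote queue regardless of `is_remote`,
// because Ozone has no short-circuit (local) read path yet.
IoQueue AssignQueue(const std::string& path, bool is_remote) {
  if (IsOzonePath(path)) return IoQueue::kRemoteOzone;
  if (is_remote) return IoQueue::kRemoteOther;
  return IoQueue::kLocalDisk;
}
```

With this ordering, a call such as `AssignQueue("ofs://svc/vol/bucket/file.parq", false)` lands on the Ozone remote queue, while a plain local path with `is_remote == false` still uses the local disk queue.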



--
This message was sent by Atlassian Jira
(v8.20.10#820010)