Posted to dev@hive.apache.org by "Rajesh Balamohan (Jira)" <ji...@apache.org> on 2022/11/03 11:51:00 UTC
[jira] [Created] (HIVE-26699) Iceberg: S3 fadvise can hurt JSON parsing significantly in DWX
Rajesh Balamohan created HIVE-26699:
---------------------------------------
Summary: Iceberg: S3 fadvise can hurt JSON parsing significantly in DWX
Key: HIVE-26699
URL: https://issues.apache.org/jira/browse/HIVE-26699
Project: Hive
Issue Type: Improvement
Reporter: Rajesh Balamohan
Hive reads the Iceberg JSON metadata (TableMetadataParser::read()) multiple times, e.g., during query compilation, AM split computation, stats computation, and commits.
With large JSON files (due to multiple inserts), these reads take much longer on the S3 filesystem when "fs.s3a.experimental.input.fadvise" is set to "random" (e.g., on the order of 10x). To be on the safe side, it would be good to set this to "normal" mode in the configs when reading Iceberg tables.
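
A minimal sketch of the proposed override, not the actual Hive change: it uses a plain Map in place of a Hadoop Configuration so that it runs standalone. The class name FadviseOverride and the helper forIcebergMetadataRead are hypothetical; only the property key and the values "random" and "normal" come from this report. With a real Hadoop Configuration, the equivalent step would presumably be a conf.set(key, "normal") on a copy of the job configuration.

```java
import java.util.HashMap;
import java.util.Map;

public class FadviseOverride {
    // Property named in this issue.
    static final String FADVISE_KEY = "fs.s3a.experimental.input.fadvise";

    // Hypothetical helper: returns a copy of the given settings with the
    // fadvise policy forced to "normal", leaving the caller's map untouched
    // so other readers keep whatever policy they were configured with.
    static Map<String, String> forIcebergMetadataRead(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        copy.put(FADVISE_KEY, "normal");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> clusterConf = new HashMap<>();
        clusterConf.put(FADVISE_KEY, "random"); // setting described as slow for large JSON metadata

        Map<String, String> metadataConf = forIcebergMetadataRead(clusterConf);
        System.out.println("cluster:  " + clusterConf.get(FADVISE_KEY));
        System.out.println("metadata: " + metadataConf.get(FADVISE_KEY));
    }
}
```

Working on a copy keeps the override scoped to the metadata read path, so data file reads that benefit from "random" are not affected.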