Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/12/14 21:29:00 UTC

[jira] [Work logged] (HIVE-26699) Iceberg: S3 fadvise can hurt JSON parsing significantly in DWX

     [ https://issues.apache.org/jira/browse/HIVE-26699?focusedWorklogId=833578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-833578 ]

ASF GitHub Bot logged work on HIVE-26699:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Dec/22 21:28
            Start Date: 14/Dec/22 21:28
    Worklog Time Spent: 10m 
      Work Description: ayushtkn opened a new pull request, #3862:
URL: https://github.com/apache/hive/pull/3862

   ### What changes were proposed in this pull request?
   
   Set S3A fadvise ("fs.s3a.experimental.input.fadvise") to "normal" when reading Iceberg tables
   
   ### Why are the changes needed?
   
   Performance: reading large Iceberg JSON metadata files from S3 is much slower with fadvise set to "random"
   
   ### Does this PR introduce _any_ user-facing change?
   
   No




Issue Time Tracking
-------------------

            Worklog Id:     (was: 833578)
    Remaining Estimate: 0h
            Time Spent: 10m

> Iceberg: S3 fadvise can hurt JSON parsing significantly in DWX
> --------------------------------------------------------------
>
>                 Key: HIVE-26699
>                 URL: https://issues.apache.org/jira/browse/HIVE-26699
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive reads the JSON metadata (TableMetadataParser::read()) multiple times, e.g. during query compilation, AM split computation, stats computation, and commits.
>  
> With large JSON files (due to multiple inserts), reading takes much longer on the S3 FS when "fs.s3a.experimental.input.fadvise" is set to "random" (e.g. on the order of 10x). To be on the safe side, it would be good to set this to "normal" in the configs when reading Iceberg tables.
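
The override suggested above can be sketched roughly as follows. This is a minimal illustration, not the change made in the pull request: the class `FadviseOverride`, the method `withNormalFadvise`, and the `isIcebergTable` flag are hypothetical names, and a plain `Map` stands in for the job configuration so the sketch runs without Hadoop on the classpath. Only the property name and the "random"/"normal" values come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

public class FadviseOverride {
    // Property name as quoted in the issue description.
    public static final String FADVISE_KEY = "fs.s3a.experimental.input.fadvise";

    /**
     * Hypothetical helper: returns a copy of the job properties in which the
     * fadvise policy is forced to "normal" when the table is an Iceberg table.
     * Properties for other tables are copied unchanged.
     */
    public static Map<String, String> withNormalFadvise(Map<String, String> jobConf,
                                                        boolean isIcebergTable) {
        Map<String, String> copy = new HashMap<>(jobConf);
        if (isIcebergTable) {
            // Sequential-friendly policy for reading large JSON metadata files.
            copy.put(FADVISE_KEY, "normal");
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FADVISE_KEY, "random");

        Map<String, String> icebergConf = withNormalFadvise(conf, true);
        System.out.println(icebergConf.get(FADVISE_KEY)); // prints "normal"
    }
}
```

Returning a copy rather than mutating the input keeps the "random" policy in place for any other reader that shares the original properties, which matters if columnar data files still benefit from random access.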



--
This message was sent by Atlassian Jira
(v8.20.10#820010)