Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/06/05 15:27:00 UTC

[jira] [Work logged] (HIVE-22979) Support total file size in statistics annotation

     [ https://issues.apache.org/jira/browse/HIVE-22979?focusedWorklogId=441882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-441882 ]

ASF GitHub Bot logged work on HIVE-22979:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Jun/20 15:26
            Start Date: 05/Jun/20 15:26
    Worklog Time Spent: 10m 
      Work Description: belugabehr commented on pull request #941:
URL: https://github.com/apache/hive/pull/941#issuecomment-639564210


   Not sure if this is related, but just to make sure the broader context is documented here:
   
   Hive on Spark (HoS) had this concept of "rawSize" vs "size" of the table.  I think it was storing what you are talking about here.  Please check that out and see if this change is in line with HoS.  (I know HoS is deprecated, but related changes may already be in the codebase elsewhere.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 441882)
    Time Spent: 1h  (was: 50m)

> Support total file size in statistics annotation
> ------------------------------------------------
>
>                 Key: HIVE-22979
>                 URL: https://issues.apache.org/jira/browse/HIVE-22979
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-22979.1.patch, HIVE-22979.2.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hive statistics annotation provides estimated statistics for each operator. The data size reported for TableScanOperator is the raw data size (after decompression and decoding), but some optimizations, such as scan cost estimation, can instead be based on the total file size on disk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)