Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/04/02 13:55:00 UTC

[jira] [Work logged] (HIVE-24928) In case of non-native tables use basic statistics from HiveStorageHandler

     [ https://issues.apache.org/jira/browse/HIVE-24928?focusedWorklogId=576104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-576104 ]

ASF GitHub Bot logged work on HIVE-24928:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Apr/21 13:54
            Start Date: 02/Apr/21 13:54
    Worklog Time Spent: 10m 
      Work Description: lcspinter commented on pull request #2111:
URL: https://github.com/apache/hive/pull/2111#issuecomment-812540900


   @pvary @kgyrtkirk Could you please have a second look at this PR? Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 576104)
    Time Spent: 2h 10m  (was: 2h)

> In case of non-native tables use basic statistics from HiveStorageHandler
> -------------------------------------------------------------------------
>
>                 Key: HIVE-24928
>                 URL: https://issues.apache.org/jira/browse/HIVE-24928
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 4.0.0
>            Reporter: László Pintér
>            Assignee: László Pintér
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When `ANALYZE TABLE ... COMPUTE STATISTICS` or `ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS` is run, all basic statistics are collected by the BasicStatsTask class, which estimates them by scanning the table's directory.
> For non-native tables (Iceberg, HBase), the table directory might also contain metadata files, which BasicStatsTask would count when calculating basic stats.
> Instead of relying on this directory scan, the HiveStorageHandler implementation should provide the basic statistics.
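
The idea in the description can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: `BasicStatsSketch`, `TableFile`, `StatsSource`, `scanDirectory` and `collect` are hypothetical names invented for this example, and the `numFiles` / `totalSize` keys are used only as illustrative statistic names. The sketch shows a directory scan that counts every file, metadata included, and a collector that prefers statistics supplied by the storage layer when they are available.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class BasicStatsSketch {

    /** Hypothetical stand-in for one file found in the table directory. */
    public static final class TableFile {
        final String name;
        final long sizeBytes;

        public TableFile(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Hypothetical stand-in for the hook a storage handler could expose. */
    public interface StatsSource {
        /** Returns statistics if the handler can provide them, otherwise empty. */
        Optional<Map<String, Long>> basicStatistics();
    }

    /** Naive directory scan: every file is counted, including metadata files. */
    public static Map<String, Long> scanDirectory(List<TableFile> files) {
        long totalSize = 0;
        for (TableFile file : files) {
            totalSize += file.sizeBytes;
        }
        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("numFiles", (long) files.size());
        stats.put("totalSize", totalSize);
        return stats;
    }

    /** Prefer handler-provided statistics; fall back to the directory scan. */
    public static Map<String, Long> collect(StatsSource handler, List<TableFile> files) {
        return handler.basicStatistics().orElseGet(() -> scanDirectory(files));
    }
}
```

With a directory holding one data file and one metadata file, the scan reports two files, whereas a handler that knows the table layout can report only the data file. That difference is the overcounting the issue describes.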



--
This message was sent by Atlassian Jira
(v8.3.4#803005)