Posted to issues@hive.apache.org by "Ashutosh Chauhan (JIRA)" <ji...@apache.org> on 2015/11/10 23:59:11 UTC

[jira] [Updated] (HIVE-12309) TableScan should use column stats when available for better data size estimate

     [ https://issues.apache.org/jira/browse/HIVE-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-12309:
------------------------------------
    Affects Version/s: 0.14.0
                       1.0.0
                       1.2.0
                       1.1.0

> TableScan should use column stats when available for better data size estimate
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-12309
>                 URL: https://issues.apache.org/jira/browse/HIVE-12309
>             Project: Hive
>          Issue Type: Improvement
>          Components: Statistics
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12309.2.patch, HIVE-12309.patch
>
>
> Currently, all other operators use column stats to estimate data size, whereas TableScan relies on rawDataSize. This inconsistency can result in TableScan (TS) reporting a lower data size than subsequent operators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)