Posted to issues@hive.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/07/18 11:50:00 UTC

[jira] [Work logged] (HIVE-21831) Stats should be reset correctly during load of a partitioned ACID table

     [ https://issues.apache.org/jira/browse/HIVE-21831?focusedWorklogId=278921&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278921 ]

ASF GitHub Bot logged work on HIVE-21831:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jul/19 11:49
            Start Date: 18/Jul/19 11:49
    Worklog Time Spent: 10m 
      Work Description: dlavati commented on pull request #659: HIVE-21831: Stats should be reset correctly during load of a partitioned ACID table
URL: https://github.com/apache/hive/pull/659
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 278921)
    Time Spent: 0.5h  (was: 20m)

> Stats should be reset correctly during load of a partitioned ACID table
> -----------------------------------------------------------------------
>
>                 Key: HIVE-21831
>                 URL: https://issues.apache.org/jira/browse/HIVE-21831
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Import/Export
>    Affects Versions: 3.0.0, 3.1.0, 3.1.1
>            Reporter: David Lavati
>            Assignee: David Lavati
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21831.01.patch, HIVE-21831.02.patch, HIVE-21831.02.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While running something similar to the following example, I noticed that an import of a partitioned ACID table using the ORC format fails to provide table statistics:
> {code:java}
> set hive.stats.autogather=true;
> set hive.stats.column.autogather=true;
> set hive.fetch.task.conversion=none;
> set hive.support.concurrency=true;
> set hive.default.fileformat.managed=ORC;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create transactional table int_src (foo int, bar int);
> insert into int_src select 1,1;
> create transactional table int_exp(foo int) partitioned by (bar int);
> insert into int_exp select * from int_src;
> select count(*) from int_exp;
> create transactional table int_imp(foo int) partitioned by (bar int);
> EXPORT TABLE int_exp to '/tmp/expint';
> IMPORT TABLE int_imp FROM '/tmp/expint';
> select count(*) FROM int_imp;
> {code}
> The count returned 0 (as opposed to the expected 1; even with on the order of 100k records it was still 0), and correct statistics were only available after running compute statistics.
>  
> This occurred only with the combination of ACID + partitioning + ORC, but it is still not the expected behavior.
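>  
> A minimal sketch of the "compute statistics" workaround mentioned above, reusing the int_imp table and the bar=1 partition from the example. The last two statements rest on an assumption that is not confirmed in this report: that count(*) is being answered from metastore statistics (hive.compute.query.using.stats), so that disabling the setting forces a scan of the data.
> {code:sql}
> -- Recompute basic statistics for all partitions of the imported table
> ANALYZE TABLE int_imp PARTITION (bar) COMPUTE STATISTICS;
> -- Optionally recompute column statistics as well
> ANALYZE TABLE int_imp PARTITION (bar) COMPUTE STATISTICS FOR COLUMNS;
> -- Inspect the recorded numRows for the imported partition
> DESCRIBE FORMATTED int_imp PARTITION (bar=1);
> -- Assumption: with stats-based answers disabled, the count comes from the data itself
> set hive.compute.query.using.stats=false;
> select count(*) from int_imp;
> {code}
> If the count is correct with the setting disabled but 0 with it enabled, that would point to stale statistics rather than missing data.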



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)