You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Manpreet Singh (JIRA)" <ji...@apache.org> on 2018/06/26 04:34:00 UTC

[jira] [Created] (HIVE-19991) msck repair table command not able to retrieve achieved data.

Manpreet Singh created HIVE-19991:
-------------------------------------

             Summary: msck repair table command not able to retrieve achieved data.
                 Key: HIVE-19991
                 URL: https://issues.apache.org/jira/browse/HIVE-19991
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.1.0
            Reporter: Manpreet Singh


 Observed an issue when customer used msck repair tablename command on a archeived data (data copied via hadoop fs -cp from another location) by the command did not loaded the partitions in the table and showed zero results.

 

Please find below complete steps for both archived and unarchived.

 Copied table's partition data to another location & define a similar table structure for new location an ran "msck repair table" command on the destination and then ran a select * from <destination> table it produced zero results, same thing is working for non achieved data.

 

1. Created two table with similar structure and different loctation.

create table sau_test1 ( a int, b string) partitioned by (dt string) stored as parquet location '/user/hive/warehouse/sau_test1';

create table sau_arch ( a int, b string) partitioned by (dt string) stored as parquet location'/user/hive/warehouse/sau_arch';

2. Inserted data in source table

insert into sau_test1 partition(dt='dt1') select 1,'A1';

select * from sau_test1;

+---------------+-------------++-----------------+
|sau_test1.a|sau_test1.b|sau_test1.dt|

+---------------+-------------++-----------------+
|1|A1|dt1|

+---------------+-------------++-----------------+

3. Copied the content of directory source directory to destination directory.

hadoop fs -cp /user/hive/warehouse/sau_test1/* /user/hive/warehouse/sau_arch/

4. Running msck repair table <destination table> and checking results. — for unachieved data

msck repair table sau_arch ;
select * from sau_arch;

select * from sau_arch;
+--------------+------------++----------------+
|sau_arch.a|sau_arch.b|sau_arch.dt|

+--------------+------------++----------------+
|1|A1|dt1|

+--------------+------------++----------------+

5. Customer wants the same functionality for archived data and hence tried below steps.

a) Dropped table partition in destination table "alter table sau_arch drop partition(dt='dt1');"

b) set hive.archive.enabled=true;
alter table sau_test1 archive partition ( dt='dt1');

c) copied the hdfs files from source table to destination tables.

hdfs dfs -ls /user/hive/warehouse/sau_test1/dt=dt1/

drwxr-xr-x - hive supergroup 0 2018-06-08 13:26 /user/hive/warehouse/sau_test1/dt=dt1/data.har
-rw-r--r-- 3 hive supergroup 0 2018-06-08 13:26 /user/hive/warehouse/sau_test1/dt=dt1/data.har/_SUCCESS
-rw-r--r-- 3 hive supergroup 305 2018-06-08 13:26 /user/hive/warehouse/sau_test1/dt=dt1/data.har/_index
-rw-r--r-- 3 hive supergroup 23 2018-06-08 13:26 /user/hive/warehouse/sau_test1/dt=dt1/data.har/_masterindex
-rw-r--r-- 3 hive supergroup 286 2018-06-08 13:26 /user/hive/warehouse/sau_test1/dt=dt1/data.har/part-0
$ hdfs dfs -ls /user/hive/warehouse/sau_arch/dt=dt1/

drwxr-xr-x - ngdb supergroup 0 2018-06-08 13:27 /user/hive/warehouse/sau_arch/dt=dt1/data.har
-rw-r--r-- 3 ngdb supergroup 0 2018-06-08 13:27 /user/hive/warehouse/sau_arch/dt=dt1/data.har/_SUCCESS
-rw-r--r-- 3 ngdb supergroup 305 2018-06-08 13:27 /user/hive/warehouse/sau_arch/dt=dt1/data.har/_index
-rw-r--r-- 3 ngdb supergroup 23 2018-06-08 13:27 /user/hive/warehouse/sau_arch/dt=dt1/data.har/_masterindex
-rw-r--r-- 3 ngdb supergroup 286 2018-06-08 13:27 /user/hive/warehouse/sau_arch/dt=dt1/data.har/part-0

d) msck repair table sau_arch;
e)select * from sau_arch . - No results shown

+--------------+------------++----------------+
|sau_arch.a|sau_arch.b|sau_arch.dt|

+--------------+------------++----------------+
+--------------+------------++----------------+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)