Posted to dev@hive.apache.org by "Mohit Gupta (JIRA)" <ji...@apache.org> on 2012/05/18 12:16:07 UTC
[jira] [Created] (HIVE-3037) Hive 0.7.1 - Dynamic Partition - Incorrect results

Mohit Gupta created HIVE-3037:
---------------------------------

             Summary: Hive 0.7.1 - Dynamic Partition - Incorrect results
                 Key: HIVE-3037
                 URL: https://issues.apache.org/jira/browse/HIVE-3037
             Project: Hive
          Issue Type: Bug
            Reporter: Mohit Gupta


The settings and table structures are as follows:

set hive.exec.compress.output=true;
set hive.exec.dynamic.partition=true;
set hive.auto.convert.join=true;
set hive.exec.dynamic.partition.mode=nonstrict;

 
create external table report
(campaign_id bigint, channel_id bigint,leads bigint )  partitioned by (dt string)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ","  STORED AS TEXTFILE LOCATION "s3://****/test/report";
 
 
create table tmp( dt string, campaign_id bigint, channel_id bigint ,leads bigint);

insert overwrite table tmp
"some query";

Now,
hive>  select * from tmp;
OK
2012-05-15      8449    0       3099
2012-05-15      8449    2349    1
2012-05-15      8449    10181   5
Time taken: 0.318 seconds

* table tmp has 3 rows.

Now I am trying to insert these rows into the report table (partitioned by dt):
hive>  insert overwrite table report
    >  partition ( dt )
    >  select campaign_id,channel_id,leads,dt from tmp;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201205180728_0025, Tracking URL = http://***.ec2.internal:9100/jobdetails.jsp?jobid=job_201205180728_0025
Kill Command = /home/hadoop/.versions/0.20/bin/../bin/hadoop job  -Dmapred.job.tracker=**.ec2.internal:9001 -kill job_201205180728_0025
2012-05-18 10:07:33,607 Stage-1 map = 0%,  reduce = 0%
2012-05-18 10:07:37,636 Stage-1 map = 5%,  reduce = 0%
2012-05-18 10:07:39,650 Stage-1 map = 68%,  reduce = 0%
2012-05-18 10:07:40,658 Stage-1 map = 100%,  reduce = 0%
2012-05-18 10:07:43,678 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201205180728_0025
Ended Job = 1638517814, job is filtered out (removed at runtime).
Launching Job 2 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201205180728_0026, Tracking URL = http://**.ec2.internal:9100/jobdetails.jsp?jobid=job_201205180728_0026
Kill Command = /home/hadoop/.versions/0.20/bin/../bin/hadoop job  -Dmapred.job.tracker=**.ec2.internal:9001 -kill job_201205180728_0026
2012-05-18 10:07:48,471 Stage-3 map = 0%,  reduce = 0%
2012-05-18 10:07:51,492 Stage-3 map = 50%,  reduce = 0%
2012-05-18 10:07:54,514 Stage-3 map = 100%,  reduce = 0%
2012-05-18 10:07:57,534 Stage-3 map = 100%,  reduce = 100%
Ended Job = job_201205180728_0026
Loading data to table default.report partition (dt=null)
        Loading partition {dt=2012-05-15}
Partition default.report{dt=2012-05-15} stats: [num_files: 1, num_rows: 0, total_size: 32]
Table default.report stats: [num_partitions: 1, num_files: 1, num_rows: 0, total_size: 32]
3 Rows loaded to report
OK
Time taken: 29.446 seconds
hive> select * from report;
OK
8449    0       3099    2012-05-15
Time taken: 0.406 seconds

* Only one row was loaded into the report table, even though the preceding insert reported "3 Rows loaded to report".
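
The mismatch could be made explicit with a few follow-up queries. This is only a sketch, assuming the tmp and report tables defined above. These statements were not part of the original session, so no output is shown for them.

```sql
-- Row count in the source table (3 rows, per the select * above).
select count(*) from tmp;

-- Row count in the target partition after the dynamic-partition insert.
select count(*) from report where dt = '2012-05-15';

-- Partitions that the insert actually registered on the target table.
show partitions report;
```

If the second count differs from the first, that confirms the rows were lost between tmp and report rather than hidden by how select * displays them.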


 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira