Posted to dev@doris.apache.org by GitBox <gi...@apache.org> on 2019/08/19 08:11:58 UTC

[GitHub] [incubator-doris] kangpinghuang opened a new issue #1668: Data loaded into wrong partition

URL: https://github.com/apache/incubator-doris/issues/1668
 
 
   **Describe the bug**
   
   The table is range-partitioned by `stat_date_`.
   `SHOW CREATE TABLE` returns:
   
   CREATE TABLE `t` (
     `p_id_` bigint(20) NOT NULL,
     `v_id_mod__` bigint(20) NOT NULL,
     `v_id__` bigint(20) NOT NULL,
     `s_id_` bigint(20) NOT NULL,
     `stat_date_` bigint(20) NOT NULL
   ) ENGINE=OLAP
   DUPLICATE KEY(`p_id_`, `v_id_mod__`, `v_id__`, `s_id_`, `stat_date_`)
   PARTITION BY RANGE(`stat_date_`) (
   PARTITION p20181201 VALUES LESS THAN ("1543680000"),
   PARTITION p20181202 VALUES LESS THAN ("1543766400"),
   PARTITION p20181203 VALUES LESS THAN ("1543852800"),
   PARTITION p20181204 VALUES LESS THAN ("1543939200"),
   PARTITION p20181205 VALUES LESS THAN ("1544025600"),
   PARTITION p20190805 VALUES LESS THAN ("1565020800"),
   PARTITION p20190806 VALUES LESS THAN ("1565107200"),
   PARTITION p20190807 VALUES LESS THAN ("1565193600"),
   PARTITION p20190808 VALUES LESS THAN ("1565280000"),
   PARTITION p20190809 VALUES LESS THAN ("1565366400"),
   PARTITION p20190810 VALUES LESS THAN ("1565452800"),
   PARTITION p20190811 VALUES LESS THAN ("1565539200"))
   DISTRIBUTED BY HASH(`v_id__`) BUCKETS 32
   PROPERTIES (
   "storage_type" = "COLUMN"
   );
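
To make the expected routing concrete, here is a minimal Python sketch of a range-partition lookup over the bounds listed above. It assumes each `VALUES LESS THAN` bound is exclusive and that a partition's lower bound is the previous partition's upper bound; `PARTITIONS` and `expected_partition` are hypothetical names used only for illustration, not Doris code.

```python
from bisect import bisect_right

# (partition name, exclusive upper bound), copied from the CREATE TABLE above.
PARTITIONS = [
    ("p20181201", 1543680000),
    ("p20181202", 1543766400),
    ("p20181203", 1543852800),
    ("p20181204", 1543939200),
    ("p20181205", 1544025600),
    ("p20190805", 1565020800),
    ("p20190806", 1565107200),
    ("p20190807", 1565193600),
    ("p20190808", 1565280000),
    ("p20190809", 1565366400),
    ("p20190810", 1565452800),
    ("p20190811", 1565539200),
]


def expected_partition(stat_date):
    """Return the partition whose [lower, upper) range contains stat_date.

    Assumption: bounds are left-closed, right-open. Returns None when the
    value is at or above the last upper bound (no partition accepts it).
    """
    uppers = [upper for _, upper in PARTITIONS]
    # Index of the first upper bound strictly greater than stat_date.
    idx = bisect_right(uppers, stat_date)
    if idx == len(PARTITIONS):
        return None
    return PARTITIONS[idx][0]


print(expected_partition(1565366400))
```

Under these assumptions, every row with `stat_date_` = 1565366400 should land in `p20190810`, because the value is not less than the upper bound of `p20190809`.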
   
   These two queries return inconsistent results:
   select stat_date_,sum(1) from f group by stat_date_;
   +------------+-----------+
   | stat_date_ | sum(1)    |
   +------------+-----------+
   | 1565366400 | 232581496 |
   +------------+-----------+
   
   select stat_date_,sum(1) from f where stat_date_=1565366400 group by stat_date_;
   +------------+-----------+
   | stat_date_ | sum(1)    |
   +------------+-----------+
   | 1565366400 | 203781984 |
   +------------+-----------+
   
   ===================
   
   Running EXPLAIN on the two queries shows that:
   1. the first query reads data from two partitions
   2. the second query reads data from one partition
   
   In addition, the sum of select count(*) over those two partitions equals the result of the first query.
   So the probable cause of the inconsistency is that some data was loaded into the wrong partition.
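
The size of the discrepancy follows directly from the two results quoted above. A small Python sketch of the arithmetic (the variable names are illustrative only):

```python
# Row counts copied from the two query results in this report.
total_rows = 232581496   # query without a WHERE clause (reads two partitions)
pruned_rows = 203781984  # query with WHERE stat_date_ = 1565366400 (one partition)

# Rows carrying stat_date_ = 1565366400 that the pruned query did not see.
missing_rows = total_rows - pruned_rows
print(missing_rows)  # 28799512
```

If the wrong-partition hypothesis holds, these 28799512 rows are the ones stored outside the partition that the second query scans.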
   
   
