Posted to dev@kylin.apache.org by "wang (Jira)" <ji...@apache.org> on 2022/06/27 03:17:00 UTC

[jira] [Created] (KYLIN-5203) From Kylin or Hive, the same query Sql, but the results are inconsistent

wang created KYLIN-5203:
---------------------------

             Summary: From Kylin or Hive, the same query Sql, but the results are inconsistent
                 Key: KYLIN-5203
                 URL: https://issues.apache.org/jira/browse/KYLIN-5203
             Project: Kylin
          Issue Type: Bug
          Components: Query Engine
    Affects Versions: v3.1.2
            Reporter: wang


SQL(SUM, COUNT):

SELECT 
    SUM(t1.a1),
    COUNT(1)
FROM
    T1 JOIN T2 ON...
    JOIN T3 ON...
    JOIN T4 ON...
    ...
    JOIN T9 ON...
WHERE
    T1.c1 = '10000'
    AND T1.date between '2022-06-11' and '2022-06-21'
    AND {color:#FF0000}T9.b_type IN ('7', '11', '12');{color}

Result:
    ||Engine||SUM(t1.a1)||COUNT(1)||
    |Hive|2134980.9451|36330|
    |Kylin|1135892.3346|19765|
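
To make the size of the gap explicit, the difference between the two result rows can be computed directly. This is only a small arithmetic sketch in Python using the figures reported above; the variable names are illustrative and it says nothing about the cause.

```python
from decimal import Decimal

# Figures copied from the filtered-query result: (SUM(t1.a1), COUNT(1)).
hive = (Decimal("2134980.9451"), 36330)
kylin = (Decimal("1135892.3346"), 19765)

# How much of the Hive result is absent from the Kylin result.
missing_sum = hive[0] - kylin[0]
missing_rows = hive[1] - kylin[1]

# Share of Hive's rows that Kylin actually returned.
kylin_row_share = kylin[1] / hive[1]

print(f"missing SUM:  {missing_sum}")
print(f"missing rows: {missing_rows}")
print(f"Kylin returns {kylin_row_share:.1%} of Hive's rows")
```

In other words, with the T9 filter applied, Kylin returns 16565 fewer rows than Hive.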


If the T9 filter is removed:
SELECT 
    SUM(t1.a1),
    COUNT(1)
FROM
    T1 JOIN T2 ON...
    JOIN T3 ON...
    JOIN T4 ON...
    ...
    JOIN T9 ON...
WHERE
    T1.c1 = '10000'
    AND T1.date between '2022-06-11' and '2022-06-21';


Result:
    ||Engine||SUM(t1.a1)||COUNT(1)||
    |Hive|3184089.5551|65333|
    |Kylin|3184089.5551|65333|

In theory, Hive and Kylin should return the same results. Without the filter on table T9 the results match, but once the filter is added, part of the result is missing in Kylin.
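
One possible way to narrow this down (a suggestion, not something done in this report) is to run the filtered query on both engines with T9.b_type added to the SELECT list and a GROUP BY on it, then compare the per-value counts. The Python sketch below shows only the comparison step; `diff_counts` is a hypothetical helper, and the numbers are placeholders for illustration, not data from this issue.

```python
def diff_counts(hive_counts, kylin_counts):
    """Return {b_type: (hive, kylin)} for every value whose counts differ."""
    keys = set(hive_counts) | set(kylin_counts)
    return {
        k: (hive_counts.get(k, 0), kylin_counts.get(k, 0))
        for k in sorted(keys)
        if hive_counts.get(k, 0) != kylin_counts.get(k, 0)
    }


# Placeholder per-b_type row counts; NOT taken from the report.
hive_counts = {"7": 3, "11": 2, "12": 1}
kylin_counts = {"7": 3, "11": 2}

mismatches = diff_counts(hive_counts, kylin_counts)
print(mismatches)
```

Any b_type value that shows up in the output is one for which the two engines disagree, which points at the rows worth inspecting.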


env:
    Hive:
    There are nine tables in total. The main table (the fact table) is a partitioned table. Of the other eight tables, two are large tables with tens of millions of rows, and the rest are dimension tables. The table type is bucketed.


    Kylin:
    Create Intermediate Flat Hive Table
    Redistribute Flat Hive Table
    Extract Fact Table Distinct Columns(Map Input)
    Segment: 
        Source Count: ???

    According to the logs, the row counts are the same.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)