Posted to issues@kylin.apache.org by "yubo (JIRA)" <ji...@apache.org> on 2016/10/21 02:12:58 UTC

[jira] [Created] (KYLIN-2116) when hive field delimitor exists in table field values, fields order is wrong

yubo created KYLIN-2116:
---------------------------

             Summary: when hive field delimitor exists in table field values, fields order is wrong
                 Key: KYLIN-2116
                 URL: https://issues.apache.org/jira/browse/KYLIN-2116
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v1.5.2
            Reporter: yubo
            Assignee: Dong Li


In step #1, when the temporary Hive table is created, the following field delimiter is used:
 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177' 
 
When this delimiter also occurs inside a field value, the row is split in the wrong place and the field order is wrong.
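
To illustrate the suspected mechanism, here is a minimal Python sketch. It is not Kylin or Hive code: the three-column layout (channel, visit_hour, visit_date), the helper names write_row and read_row, and the sample values are all hypothetical, and the sketch simply assumes that text rows are joined without escaping and that surplus pieces are dropped on read.

```python
# Octal \177 is the ASCII DEL character (0x7F), the delimiter from the DDL above.
DELIM = "\x7f"


def write_row(fields):
    """Join fields into one text line without any escaping (assumption)."""
    return DELIM.join(fields)


def read_row(line, num_columns):
    """Split a line back into columns; surplus pieces are dropped (assumption)."""
    return line.split(DELIM)[:num_columns]


# Hypothetical rows with columns (channel, visit_hour, visit_date).
clean = ["search", "10", "2016-10-19"]
# The first field itself contains the delimiter character.
dirty = ["search" + DELIM + "神马搜索", "10", "2016-10-19"]

clean_row = read_row(write_row(clean), 3)
dirty_row = read_row(write_row(dirty), 3)

print(clean_row)  # columns line up with the original fields
print(dirty_row)  # every column after the embedded delimiter is shifted by one
```

In the second row the visit_hour position ends up holding text that belonged to the first field, which is the kind of shift described above.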


Test details:

Running the same SQL returns different results, as shown below: 
25 rows in Kylin and 24 in Hive. 

We suspect the problem is in step #2 (Extract Fact Table Distinct Columns) of the cube build. 





1. Search in Kylin 
 select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms where  visit_date >= '2016-10-19' and visit_date <= '2016-10-19' 
 Results (25) 
19 
17 
18 
15 
16 
13 
14 
11 
12 
21 
神马搜索 
20 
08 
09 
04 
22 
05 
23 
06 
07 
00 
01 
02 
03 
10 

2. Output of step #2 (Extract Fact Table Distinct Columns) 

hadoop fs -cat /kylin/kylin_metadata/kylin-e8bb517d-6c29-4f89-a83e-871e142e3d48/channel_first_stage_flow_cube/fact_distinct_columns/VISIT_HOUR 
00 
01 
02 
03 
04 
05 
06 
07 
08 
09 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
神马搜索 


3. Same query against the Hive table 

hive -e " select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms where  visit_date >= '2016-10-19' and visit_date <= '2016-10-19' " 
WARNING: Use "yarn jar" to launch YARN applications. 

Logging initialized using configuration in file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties 
Query ID = hdfs_20161020164441_dcea3e55-1a8b-4f3a-9378-7dcda008001b 
Total jobs = 1 
Launching Job 1 out of 1 


Status: Running (Executing on YARN cluster with App id application_1476342479107_13013) 

-------------------------------------------------------------------------------- 
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED 
-------------------------------------------------------------------------------- 
Map 1 ..........   SUCCEEDED    564        564        0        0       0       0 
Reducer 2 ......   SUCCEEDED     15         15        0        0       0       0 
-------------------------------------------------------------------------------- 
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 168.70 s   
-------------------------------------------------------------------------------- 
OK 
03 
12 
13 
22 
05 
14 
08 
17 
00 
02 
18 
06 
23 
01 
19 
07 
10 
15 
20 
16 
11 
04 
09 
21 
Time taken: 172.907 seconds, Fetched: 24 row(s) 
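
The two result sets above can be compared mechanically. The following Python sketch rebuilds them from the listings in this report (the variable names are only illustrative) and prints the values that appear in Kylin but not in Hive.

```python
# The 24 two-digit hour values "00".."23" returned by the Hive query.
hive_values = {f"{hour:02d}" for hour in range(24)}

# Kylin returned the same 24 hours plus one extra, non-hour value.
kylin_values = hive_values | {"神马搜索"}

# Values present in Kylin only, and values present in Hive only.
only_in_kylin = sorted(kylin_values - hive_values)
only_in_hive = sorted(hive_values - kylin_values)

print(len(kylin_values), len(hive_values))
print("only in Kylin:", only_in_kylin)
print("only in Hive:", only_in_hive)
```

The only difference is the single non-hour value, which also appears in the step #2 output for VISIT_HOUR.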




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)