Posted to issues@kylin.apache.org by "Dong Li (JIRA)" <ji...@apache.org> on 2016/10/31 05:42:59 UTC

[jira] [Resolved] (KYLIN-2116) when hive field delimiter exists in table field values, fields order is wrong

     [ https://issues.apache.org/jira/browse/KYLIN-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Li resolved KYLIN-2116.
----------------------------
       Resolution: Fixed
    Fix Version/s: v1.6.0

> when hive field delimiter exists in table field values, fields order is wrong
> -----------------------------------------------------------------------------
>
>                 Key: KYLIN-2116
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2116
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v1.5.2
>            Reporter: yubo
>            Assignee: Dong Li
>            Priority: Minor
>             Fix For: v1.6.0
>
>
> In step #1, when the temporary Hive table is created, the following delimiter is used:
>  
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177' 
>  
> When this delimiter appears inside a field value, the field order is wrong.
> Test details:
> Running the same SQL returns different results, as shown below:
> 25 rows in Kylin and 24 rows in Hive.
> We guess there may be a problem in step #2 (Extract Fact Table Distinct Columns) when building the cube.
> 1. Query in Kylin
>  select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms where  visit_date >= '2016-10-19' and visit_date <= '2016-10-19' 
>  Results (25) 
> 19 
> 17 
> 18 
> 15 
> 16 
> 13 
> 14 
> 11 
> 12 
> 21 
> 神马搜索 
> 20 
> 08 
> 09 
> 04 
> 22 
> 05 
> 23 
> 06 
> 07 
> 00 
> 01 
> 02 
> 03 
> 10 
> 2. Output of step #2 (Extract Fact Table Distinct Columns)
> hadoop fs -cat /kylin/kylin_metadata/kylin-e8bb517d-6c29-4f89-a83e-871e142e3d48/channel_first_stage_flow_cube/fact_distinct_columns/VISIT_HOUR 
> 00 
> 01 
> 02 
> 03 
> 04 
> 05 
> 06 
> 07 
> 08 
> 09 
> 10 
> 11 
> 12 
> 13 
> 14 
> 15 
> 16 
> 17 
> 18 
> 19 
> 20 
> 21 
> 22 
> 23 
> 神马搜索 
> 3. Query in Hive
> hive -e " select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms where  visit_date >= '2016-10-19' and visit_date <= '2016-10-19' " 
> WARNING: Use "yarn jar" to launch YARN applications. 
> Logging initialized using configuration in file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties 
> Query ID = hdfs_20161020164441_dcea3e55-1a8b-4f3a-9378-7dcda008001b 
> Total jobs = 1 
> Launching Job 1 out of 1 
> Status: Running (Executing on YARN cluster with App id application_1476342479107_13013) 
> -------------------------------------------------------------------------------- 
>         VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED 
> -------------------------------------------------------------------------------- 
> Map 1 ..........   SUCCEEDED    564        564        0        0       0       0 
> Reducer 2 ......   SUCCEEDED     15         15        0        0       0       0 
> -------------------------------------------------------------------------------- 
> VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 168.70 s   
> -------------------------------------------------------------------------------- 
> OK 
> 03 
> 12 
> 13 
> 22 
> 05 
> 14 
> 08 
> 17 
> 00 
> 02 
> 18 
> 06 
> 23 
> 01 
> 19 
> 07 
> 10 
> 15 
> 20 
> 16 
> 11 
> 04 
> 09 
> 21 
> Time taken: 172.907 seconds, Fetched: 24 row(s) 
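
The column shift described in the report can be illustrated with a minimal sketch. It is not Kylin's code: it assumes the flat table rows are written by joining field values with '\177' without escaping and read back by splitting on the same character. The row layout, the column position VISIT_HOUR, and the sample values are hypothetical.

```python
# Hypothetical illustration only: the row layout and sample values are made up,
# and the unescaped join/split is an assumption about how rows are serialized.
DELIM = "\x7f"  # same character as octal '\177' in the ROW FORMAT clause


def write_row(fields):
    """Join field values with the delimiter, without any escaping."""
    return DELIM.join(fields)


def read_row(line):
    """Split a serialized row back into fields on the delimiter."""
    return line.split(DELIM)


VISIT_HOUR = 2  # hypothetical position of visit_hour in the row

# A row whose values do not contain the delimiter.
clean = ["2016-10-19", "channel-a", "08"]

# A row whose second value contains the delimiter character itself.
dirty = ["2016-10-19", "prefix" + DELIM + "神马搜索", "08"]

clean_fields = read_row(write_row(clean))
dirty_fields = read_row(write_row(dirty))

# The clean row round-trips; the dirty row gains a field and every
# later column shifts right by one position.
print(len(clean_fields), repr(clean_fields[VISIT_HOUR]))
print(len(dirty_fields), repr(dirty_fields[VISIT_HOUR]))
```

In this sketch the second row splits into four fields instead of three, so the value read at the visit_hour position is a fragment of the neighbouring field. That would be consistent with a non-hour value showing up among the 25 distinct values in Kylin while Hive returns only the 24 hours.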


