Posted to issues@kylin.apache.org by "Wang, Gang (JIRA)" <ji...@apache.org> on 2018/01/03 06:16:03 UTC

[jira] [Commented] (KYLIN-1403) Kylin Hive Column Cardinality Job unable to read bucketed table

    [ https://issues.apache.org/jira/browse/KYLIN-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309196#comment-16309196 ] 

Wang, Gang commented on KYLIN-1403:
-----------------------------------

Tested with Hive 1.2 and Kylin 2.1: HCatalog works correctly with the TXT, Parquet and ORC formats.
This may no longer be an issue.

set hive.enforce.bucketing = true;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
create table testBucket_parquet (x int,y int) partitioned by(z int) clustered by(x) into 10 buckets STORED AS PARQUET;
insert into table testBucket_parquet partition(z) values (1, 1, 1);
insert into table testBucket_parquet partition(z) values (2, 1, 1);
insert into table testBucket_parquet partition(z) values (2, 1, 2);
insert into table testBucket_parquet partition(z) values (1, 1, 2);
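
As a further sanity check (only a sketch, not part of the test run above), the counts Hive reports for this table could be compared with what the Kylin cardinality job produces. With the four rows inserted above, each partition holds 2 rows, x has 2 distinct values and y has 1:

-- hypothetical follow-up queries, not from the original test
select z, count(*) from testBucket_parquet group by z;
select count(distinct x), count(distinct y) from testBucket_parquet;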


> Kylin Hive Column Cardinality Job unable to read bucketed table
> ---------------------------------------------------------------
>
>                 Key: KYLIN-1403
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1403
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v1.2, v1.3.0
>         Environment: - Tested against apache-kylin-1.2-HBase1.1-incubating-SNAPSHOT-bin and apache-kylin-1.3-HBase-1.1-SNAPSHOT-bin
> - Environment is HDP 2.3.4 
> - Hive version: hive-1.2.1.2.3.4.0
> - HBase version: HBase 1.1.2.2.3.4.0-3485
>            Reporter: Sebastian Zimmermann
>            Assignee: Wang, Gang
>              Labels: newbie
>
> This issue is connected with https://issues.apache.org/jira/browse/KYLIN-1402 and records the findings from investigating the StringIndexOutOfBoundsException.
> While trying to find out why the output file created by the cardinality job is empty, we discovered that the only difference between this failing job and all our other jobs (which work without problems) is that the underlying table is bucketed.
> The data folder is dbfolder/db/table/partition/bucketfolder/file
> Kylin checks for data in dbfolder/db/table/partition and so is unable to find the data.


