Posted to issues@kylin.apache.org by "Wang, Gang (JIRA)" <ji...@apache.org> on 2018/01/03 06:16:03 UTC
[jira] [Commented] (KYLIN-1403) Kylin Hive Column Cardinality Job unable to read bucketed table
[ https://issues.apache.org/jira/browse/KYLIN-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309196#comment-16309196 ]
Wang, Gang commented on KYLIN-1403:
-----------------------------------
Tested with Hive 1.2 and Kylin 2.1: HCatalog reads bucketed tables correctly in the TXT, Parquet, and ORC formats.
This may no longer be an issue.
set hive.enforce.bucketing = true;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
create table testBucket_parquet (x int,y int) partitioned by(z int) clustered by(x) into 10 buckets STORED AS PARQUET;
insert into table testBucket_parquet partition(z) values (1, 1, 1);
insert into table testBucket_parquet partition(z) values (2, 1, 1);
insert into table testBucket_parquet partition(z) values (2, 1, 2);
insert into table testBucket_parquet partition(z) values (1, 1, 2);
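For readers who want to see which bucket each test row should land in, here is a minimal sketch. It assumes the pre-Hive-3 bucketing rule for an int column (the hash is the value itself, masked to a non-negative number and taken modulo the bucket count). The function name and constant are hypothetical and this is not code from Hive or Kylin.

```python
# Assumption: Hive 1.x style bucketing for an int column, where the hash of
# the column is its own value and bucket = (hash & Integer.MAX_VALUE) % buckets.
JAVA_INT_MAX = 2**31 - 1


def bucket_for_int(value: int, num_buckets: int) -> int:
    """Return the bucket index an int clustering value would map to."""
    return (value & JAVA_INT_MAX) % num_buckets


if __name__ == "__main__":
    # The four rows inserted in the test above, as (x, y, z).
    rows = [(1, 1, 1), (2, 1, 1), (2, 1, 2), (1, 1, 2)]
    for x, y, z in rows:
        print(f"partition z={z}, x={x} -> bucket {bucket_for_int(x, 10)}")
```

Under that assumption, each of the two partitions holds one row in bucket 1 and one row in bucket 2, so the test covers more than one bucket per partition.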
> Kylin Hive Column Cardinality Job unable to read bucketed table
> ---------------------------------------------------------------
>
> Key: KYLIN-1403
> URL: https://issues.apache.org/jira/browse/KYLIN-1403
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v1.2, v1.3.0
> Environment: - Tested against apache-kylin-1.2-HBase1.1-incubating-SNAPSHOT-bin and apache-kylin-1.3-HBase-1.1-SNAPSHOT-bin
> - Environment is HDP 2.3.4
> - Hive version: hive-1.2.1.2.3.4.0
> - HBase version: HBase 1.1.2.2.3.4.0-3485
> Reporter: Sebastian Zimmermann
> Assignee: Wang, Gang
> Labels: newbie
>
> This issue is related to https://issues.apache.org/jira/browse/KYLIN-1402 and records what we found while investigating the StringIndexOutOfBoundsException.
> While trying to find out why the output file created by the cardinality job is empty, we discovered that the only difference between this failing job and all our other jobs (which work without problems) is that the underlying table is bucketed.
> The data is stored under dbfolder/db/table/partition/bucketfolder/file.
> Kylin looks for data in dbfolder/db/table/partition and therefore does not find it.
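
The layout problem described in the quoted report can be illustrated with a small sketch on the local filesystem. This is not Kylin's or HCatalog's code: the function names are hypothetical, and the directory names are the placeholders from the report. It only shows why a listing that stops at the partition directory sees no data files when they sit one level deeper.

```python
import tempfile
from pathlib import Path


def list_data_files_flat(partition_dir: Path) -> list[Path]:
    """Return only the regular files directly inside the partition directory."""
    return sorted(p for p in partition_dir.iterdir() if p.is_file())


def list_data_files_recursive(partition_dir: Path) -> list[Path]:
    """Return the regular files at any depth below the partition directory."""
    return sorted(p for p in partition_dir.rglob("*") if p.is_file())


def build_example_layout(root: Path) -> Path:
    """Create dbfolder/db/table/partition/bucketfolder/file under root."""
    partition_dir = root / "dbfolder" / "db" / "table" / "partition"
    bucket_dir = partition_dir / "bucketfolder"
    bucket_dir.mkdir(parents=True)
    (bucket_dir / "file").write_text("1,1\n")
    return partition_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        partition = build_example_layout(Path(tmp))
        # The flat listing misses the file; the recursive listing finds it.
        print("flat:", list_data_files_flat(partition))
        print("recursive:", list_data_files_recursive(partition))
```

A reader that lists only the partition directory gets an empty result, which would be consistent with the empty output file mentioned in the report.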
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)