Posted to user@kylin.apache.org by ShaoFeng Shi <sh...@apache.org> on 2018/06/12 06:11:36 UTC

[KYLIN-3388] Hive data may become inconsistent after redistribution

Hello Kylin users,

Yanghong Zhong from the eBay team recently reported that the source data may
become inconsistent after the "Redistribute flat hive table" step. This is
caused by a Hive bug in the "distribute by rand()" statement, which Kylin
relies on to distribute the data more evenly. For more information, please
check:

https://issues.apache.org/jira/browse/KYLIN-3388

Until a hot-fix is released, we recommend that you disable the redistribution
feature to ensure data accuracy, by setting:

kylin.source.hive.redistribute-flat-table=false


in conf/kylin.properties. Kylin must be restarted for the change to take effect.
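
If you manage several Kylin instances, the edit can be scripted. The following
is only a minimal sketch in Python, not part of Kylin: the function name
disable_redistribution is hypothetical, and it assumes the file uses plain
"key=value" lines. It rewrites an existing entry for the property, or appends
one if the property is missing or only present in a commented-out line.

```python
import re

# Property name taken from the announcement above.
PROP = "kylin.source.hive.redistribute-flat-table"

# Matches an active (not commented-out) line that sets PROP.
# [ \t]* is used instead of \s* so the match never spans a newline.
_PATTERN = re.compile(
    r"^[ \t]*" + re.escape(PROP) + r"[ \t]*[=:].*$",
    re.MULTILINE,
)


def disable_redistribution(text: str) -> str:
    """Return the properties text with PROP set to false.

    Hypothetical helper for illustration: replaces every active line that
    sets PROP, or appends a new line if no such line exists.
    """
    line = PROP + "=false"
    if _PATTERN.search(text):
        return _PATTERN.sub(line, text)
    # Make sure the appended entry starts on its own line.
    sep = "" if text == "" or text.endswith("\n") else "\n"
    return text + sep + line + "\n"
```

To use it, read conf/kylin.properties into a string, pass it through
disable_redistribution, and write the result back (keeping a backup copy is
advisable). Kylin still has to be restarted afterwards.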

Thanks for your attention.

-- 
Best regards,

Shaofeng Shi 史少锋