Posted to user@kylin.apache.org by Sonny Heer <so...@gmail.com> on 2017/12/06 22:03:45 UTC

incorrect cardinality

We have a table in Hive with a gender column (char(1)). A GROUP BY on that
column returns the following:

M 8946041
8 9
F 14215364
  215400

Kylin shows:

10 GENDER char(1) 274693

Looking at the HiveColumnCardinalityJob code, I don't see anything obviously
wrong. Any idea why that value is wrong in the UI?

Thanks

Re: incorrect cardinality

Posted by Ge Silas <go...@live.cn>.
I am sorry Sonny.

Please ignore the above response…

Thanks,
Silas

On 8 Dec 2017, at 9:48 AM, Ge Silas <go...@live.cn> wrote:

Hi Sonny,

What was the sampling percentage you used?

Best regards,
Silas

On 7 Dec 2017, at 6:03 AM, Sonny Heer <so...@gmail.com> wrote:

We have a table in hive which has a gender column (char(1)).  The group by shows the following:


M 8946041
8 9
F 14215364
  215400

Kylin shows:


10      GENDER  char(1)         274693

Looking at the HiveColumnCardinalityJob code I don't see anything obviously wrong.  Any idea why that value is wrong in the UI?

Thanks



Re: incorrect cardinality

Posted by ShaoFeng Shi <sh...@apache.org>.
Kylin uses HyperLogLog to estimate the cardinality of each column, so the
value is approximate, but the error rate should be acceptable.
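
To put the expected error in perspective, here is a minimal, generic
HyperLogLog sketch in Python. It is not Kylin's implementation: the class
name, the register count (2^14), and the SHA-1 hash are assumptions chosen
for illustration. The typical relative error of such an estimator is about
1.04/sqrt(m) for m registers, so for a column with only four distinct values
the estimate should stay very close to 4. Estimation error alone is therefore
unlikely to explain a value in the hundreds of thousands.

```python
import hashlib
import math


class HyperLogLog:
    """Generic HyperLogLog sketch (illustrative, not Kylin's code)."""

    def __init__(self, p=14):
        self.p = p                    # number of index bits (assumed value)
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        # Take a 64-bit hash of the value.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width              # top p bits select a register
        rest = h & ((1 << width) - 1)
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * m and zeros > 0:
            return m * math.log(m / zeros)
        return raw


def estimate_distinct(values):
    hll = HyperLogLog()
    for v in values:
        hll.add(v)
    return hll.estimate()


# The four gender values seen in Hive ("M", "8", "F", blank), repeated many
# times; duplicates do not change a HyperLogLog estimate.
sample = ["M", "8", "F", " "] * 1000
estimate = estimate_distinct(sample)
print(round(estimate))
```

Repeating the values stands in for the millions of rows in the real table,
since only the set of distinct values affects the registers.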

Please check whether the column order differs between Kylin and Hive; for
example, the column "GENDER" might be 4th in Kylin but 5th in Hive. In that
case you need to re-sync the table.
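
The effect of such a mismatch can be illustrated with a small Python sketch.
It assumes, for illustration only, that statistics are computed in Hive
column order and then attached to Kylin's column list by position; whether
Kylin does exactly this is not confirmed in this thread. All column names
other than GENDER are hypothetical, as are all numbers other than 274693 and
the four distinct values seen in Hive.

```python
# Hypothetical column lists; only GENDER comes from the thread.
kylin_columns = ["ID", "NAME", "CITY", "GENDER", "ZIP"]  # GENDER is 4th
hive_columns = ["ID", "NAME", "CITY", "ZIP", "GENDER"]   # GENDER is 5th

# Hypothetical cardinalities, listed in Hive column order.
hive_cardinalities = [1000000, 500000, 3000, 274693, 4]


def assign_by_position(columns, cardinalities):
    """Pair each column with the statistic at the same index."""
    return dict(zip(columns, cardinalities))


def assign_by_name(columns, source_columns, cardinalities):
    """Pair each column with the statistic computed for the same name."""
    by_name = dict(zip(source_columns, cardinalities))
    return {name: by_name[name] for name in columns}


stale = assign_by_position(kylin_columns, hive_cardinalities)
synced = assign_by_name(kylin_columns, hive_columns, hive_cardinalities)

print(stale["GENDER"])   # 274693: the value belonging to another column
print(synced["GENDER"])  # 4: the value actually computed for GENDER
```

Re-syncing the table corresponds to bringing the two column lists back into
agreement, so that positional and by-name assignment give the same result.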

2017-12-08 10:06 GMT+08:00 Ge Silas <go...@live.cn>:

> Can you please try “Calculate Cardinality” in System tab?
>
> Thanks,
> Silas
>
> On 8 Dec 2017, at 10:03 AM, Shuangyin Ge <go...@live.cn> wrote:
>
> I am sorry Sonny.
>
> Please ignore the above response…
>
> Thanks,
> Silas
>
> On 8 Dec 2017, at 9:48 AM, Ge Silas <go...@live.cn> wrote:
>
> Hi Sonny,
>
> What was the sampling percentage you used?
>
> Best regards,
> Silas
>
> On 7 Dec 2017, at 6:03 AM, Sonny Heer <so...@gmail.com> wrote:
>
> We have a table in hive which has a gender column (char(1)).  The group by
> shows the following:
>
>
> M 8946041
> 8 9
> F 14215364
>   215400
>
> Kylin shows:
>
> 10 GENDER char(1) 274693
> Looking at the HiveColumnCardinalityJob code I don't see anything
> obviously wrong.  Any idea why that value is wrong in the UI?
>
> Thanks


-- 
Best regards,

Shaofeng Shi 史少锋

Re: incorrect cardinality

Posted by Ge Silas <go...@live.cn>.
Can you please try “Calculate Cardinality” in System tab?

Thanks,
Silas

Re: incorrect cardinality

Posted by Ge Silas <go...@live.cn>.
Hi Sonny,

What was the sampling percentage you used?

Best regards,
Silas
