Posted to issues@kylin.apache.org by "Wang, Gang (JIRA)" <ji...@apache.org> on 2017/10/23 05:38:00 UTC

[jira] [Created] (KYLIN-2956) building trie dictionary blocked on value of length over 4095

Wang, Gang created KYLIN-2956:
---------------------------------

             Summary: building trie dictionary blocked on value of length over 4095 
                 Key: KYLIN-2956
                 URL: https://issues.apache.org/jira/browse/KYLIN-2956
             Project: Kylin
          Issue Type: Bug
          Components: General
            Reporter: Wang, Gang


In the new release, Kylin checks the value length when building a trie dictionary. The check happens in method _buildTrieBytes_ of class _TrieDictionaryBuilder_, through these methods:
private void positiveShortPreCheck(int i, String fieldName) {
    if (!BytesUtil.isPositiveShort(i)) {
        throw new IllegalStateException(fieldName + " is not positive short, usually caused by too long dict value.");
    }
}

public static boolean isPositiveShort(int i) {
    return (i & 0xFFFF7000) == 0;
}

0xFFFF7000 in binary is 1111 1111 1111 1111 0111 0000 0000 0000, so the value length can be at most 0000 0000 0000 0000 0000 1111 1111 1111, which is 4095 in decimal.
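
The arithmetic above can be checked with a small standalone sketch. The class name MaskCheck and the sample values are hypothetical; only the mask expression is copied from isPositiveShort as quoted. One thing it seems to show: bit 15 (0x8000) is not set in 0xFFFF7000, so besides 0..4095 the check also lets 32768..36863 through, which is probably not intended.

```java
public class MaskCheck {
    // Same expression as the quoted BytesUtil.isPositiveShort.
    static boolean isPositiveShort(int i) {
        return (i & 0xFFFF7000) == 0;
    }

    public static void main(String[] args) {
        // Hypothetical sample lengths around the interesting boundaries.
        int[] samples = {0, 4095, 4096, 32767, 32768, 36863, 36864, -1};
        for (int len : samples) {
            System.out.println(len + " -> " + isPositiveShort(len));
        }
    }
}
```

Running it prints one line per sample, so the boundary at 4095/4096 and the gap at bit 15 can be read off directly.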

I wonder why the mask is 0xFFFF7000. Is this what you want instead?
         0xFFFF8000: 1111 1111 1111 1111 1000 0000 0000 0000
 support max length: 0000 0000 0000 0000 0111 1111 1111 1111  (32767)
32767 may be too large, though, so I would prefer 0xFFFFE000:
         0xFFFFE000: 1111 1111 1111 1111 1110 0000 0000 0000
 support max length: 0000 0000 0000 0000 0001 1111 1111 1111  (8191)
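
As a rough sketch of how the two candidate masks compare (MaskLimits, fits and maxLength are hypothetical names, not part of Kylin): when every set bit of the mask is contiguous at the top of the int, the largest accepted length is simply the bitwise complement of the mask.

```java
public class MaskLimits {
    // A length passes when it shares no set bit with the mask.
    static boolean fits(int mask, int length) {
        return (length & mask) == 0;
    }

    // Valid only for masks whose set bits are contiguous from bit 31 downward,
    // such as 0xFFFF8000 or 0xFFFFE000 (not 0xFFFF7000, which skips bit 15).
    static int maxLength(int mask) {
        return ~mask;
    }

    public static void main(String[] args) {
        int[] masks = {0xFFFF8000, 0xFFFFE000};
        for (int mask : masks) {
            System.out.println("0x" + Integer.toHexString(mask).toUpperCase()
                    + " supports max length " + maxLength(mask));
        }
    }
}
```

With these definitions, maxLength(0xFFFF8000) is 32767 and maxLength(0xFFFFE000) is 8191, matching the figures above.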
     



