
Posted to dev@kylin.apache.org by flycshi <fl...@gmail.com> on 2017/07/07 01:04:57 UTC

Re: Lookup Table Enumerator high memory

In DictionaryGeneratorCLI.class, the processSegment method collects the
lookup tables to snapshot by iterating over the cube's dimensions:
        // snapshot
        Set<String> toSnapshot = Sets.newHashSet();
        Set<TableRef> toCheckLookup = Sets.newHashSet();
        for (DimensionDesc dim : cubeSeg.getCubeDesc().getDimensions()) {
            TableRef table = dim.getTableRef();
            if (cubeSeg.getModel().isLookupTable(table)) {
                toSnapshot.add(table.getTableIdentity());
                toCheckLookup.add(table);
            }
        }

When a lookup table is large, this step easily fails while loading the
lookup table.

Could an additional condition be added to this check? For example, if a
column is not a derived dimension, then even though it belongs to a lookup
table, that lookup table would not be loaded into memory.

Otherwise, every lookup table is loaded into memory, which makes big lookup
tables completely unusable in Kylin.
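
The proposed check can be sketched roughly as follows. This is only an
illustration under stated assumptions: Dim, its fields, and tablesToSnapshot
are hypothetical stand-ins, not Kylin's real classes or methods, and whether
Kylin can safely skip the snapshot for non-derived dimensions is not
confirmed here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SnapshotFilterSketch {

    // Hypothetical stand-in for a cube dimension; NOT Kylin's DimensionDesc.
    static final class Dim {
        final String table;          // table identity, e.g. "DB.BIG_LOOKUP"
        final boolean onLookupTable; // what isLookupTable(table) would report
        final boolean derived;       // assumed flag: dimension is derived from the lookup table

        Dim(String table, boolean onLookupTable, boolean derived) {
            this.table = table;
            this.onLookupTable = onLookupTable;
            this.derived = derived;
        }
    }

    // Collects the tables to snapshot. Unlike the original loop, a lookup
    // table is included only if at least one derived dimension uses it.
    static Set<String> tablesToSnapshot(List<Dim> dims) {
        Set<String> toSnapshot = new LinkedHashSet<>();
        for (Dim dim : dims) {
            if (dim.onLookupTable && dim.derived) {
                toSnapshot.add(dim.table);
            }
        }
        return toSnapshot;
    }

    public static void main(String[] args) {
        List<Dim> dims = new ArrayList<>();
        dims.add(new Dim("DB.FACT", false, false));
        dims.add(new Dim("DB.SMALL_LOOKUP", true, true));
        dims.add(new Dim("DB.BIG_LOOKUP", true, false)); // no derived dimension
        System.out.println(tablesToSnapshot(dims));      // prints [DB.SMALL_LOOKUP]
    }
}
```

With this condition, DB.BIG_LOOKUP is never added to the snapshot set, so it
would not have to be loaded into memory at this step.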

Looking forward to your reply, thanks.


--
View this message in context: http://apache-kylin.74782.x6.nabble.com/Lookup-Table-Enumerator-high-memory-tp1397p8384.html
Sent from the Apache Kylin mailing list archive at Nabble.com.

Re: Lookup Table Enumerator high memory

Posted by ShaoFeng Shi <sh...@apache.org>.

There are a couple of options for this:

1) Use a Hive view to hide the wide lookup table, selecting only the columns
you are interested in, and then use this view as the lookup table in the
cube. Kylin has supported views as lookup tables since 1.5.x;

2) Kylin 2.0 supports using a big table as a lookup table. When you create
the model, there is a "not take snapshot" option; with it, Kylin will not
load the table into memory.




-- 
Best regards,

Shaofeng Shi 史少锋