Posted to issues@kylin.apache.org by "hongbin ma (JIRA)" <ji...@apache.org> on 2016/10/25 12:55:58 UTC

[jira] [Commented] (KYLIN-1985) SnapshotTable should only keep the columns described in tableDesc

    [ https://issues.apache.org/jira/browse/KYLIN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605247#comment-15605247 ] 

hongbin ma commented on KYLIN-1985:
-----------------------------------

I agree the patch is a good enhancement. However, if I'm understanding correctly, it only handles the "append columns to the end of the lookup table" case, and won't work for the "insert columns somewhere in the middle of the lookup table" case, right?

If so, the JIRA's name seems misleading.
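The append-vs-insert distinction above can be illustrated with a toy example (hypothetical data, not Kylin code): a snapshot that records column values by position survives columns appended at the end of the Hive table, but a column inserted in the middle shifts every index after it, so position-based lookups silently read the wrong values.

```java
import java.util.Arrays;

public class ColumnShiftDemo {
    public static void main(String[] args) {
        String[] original = {"1", "apple"};          // ID, NAME
        String[] appended = {"1", "apple", "red"};   // COLOR appended after NAME
        String[] inserted = {"1", "red", "apple"};   // COLOR inserted before NAME

        int nameIdx = 1; // position of NAME recorded at snapshot time
        System.out.println(appended[nameIdx]); // prints "apple" -- append is safe
        System.out.println(inserted[nameIdx]); // prints "red" -- insert breaks positions
    }
}
```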

> SnapshotTable should only keep the columns described in tableDesc
> -----------------------------------------------------------------
>
>                 Key: KYLIN-1985
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1985
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: v1.5.3
>            Reporter: zhengdong
>            Assignee: zhengdong
>             Fix For: v1.5.4
>
>         Attachments: KYLIN-1985.patch
>
>
> We ran into a strange problem: a java.lang.ArrayIndexOutOfBoundsException when building or refreshing a cube, with an exception stack like this:
>   java.lang.IllegalStateException: Failed to load lookup table DIM_TABLE_NAME from snapshot /table_snapshot/dim_table_name/5a78a522-6f85-4650-b47d-6a5f5806b7f7.snapshot
> 	at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:621)
> 	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
> 	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
> 	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> 	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
> 	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
> 	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
> 	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
> 	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 19
> 	at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:85)
> 	at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:34)
> 	at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:76)
> 	at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
> 	at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
> 	at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
> 	at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
> 	at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
> 	... 13 more
> and a similar exception when querying by a lookup table dimension:
>   
> ERROR [http-bio-7070-exec-7] controller.QueryController:209 : Exception when execute sql
>         at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
>         at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
>         at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:143)
>         at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:186)
>         at org.apache.kylin.rest.service.QueryService.execute(QueryService.java:366)
>         at org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:278)
>         at org.apache.kylin.rest.service.QueryService.query(QueryService.java:121)
>         at org.apache.kylin.rest.service.QueryService$$FastClassByCGLIB$$4957273f.invoke(<generated>)
>         at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
>         at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:618)
>         at org.apache.kylin.rest.service.QueryService$$EnhancerByCGLIB$$315e2079.query(<generated>)
>         at org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:192)
>         at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 19
>         at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:85)
>         at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:34)
>         at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:76)
>         at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
>         at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
>         at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
>         at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
>         at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
>     Through the exception message, we found that one lookup table had been changed in Hive (columns added) without being synchronized with Kylin. However, the cause of this problem is subtle and not easily found.  
>     As for SnapshotTable, only checking 'row.length <= maxIndex' in the takeSnapshot method to detect a 'Bad hive table row' is not enough. It would be better to encode and store only the columns described in tableDesc, since columns not in tableDesc are not used in any cube definition.
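A minimal sketch of the proposed behavior (hypothetical helper names, not Kylin's actual API): project each raw Hive row onto only the columns listed in tableDesc, matching by column name rather than by position, so extra columns added in Hive are simply dropped from the snapshot.

```java
import java.util.*;

public class SnapshotRowFilter {
    /**
     * Project a raw Hive row onto the described columns.
     * hiveColumnNames: column order of the raw Hive row.
     * describedColumns: columns listed in tableDesc, in snapshot order.
     */
    public static String[] projectRow(String[] rawRow,
                                      List<String> hiveColumnNames,
                                      List<String> describedColumns) {
        Map<String, Integer> indexByName = new HashMap<>();
        for (int i = 0; i < hiveColumnNames.size(); i++)
            indexByName.put(hiveColumnNames.get(i), i);

        String[] projected = new String[describedColumns.size()];
        for (int i = 0; i < describedColumns.size(); i++) {
            Integer idx = indexByName.get(describedColumns.get(i));
            // A described column missing from the Hive row is a real schema
            // mismatch and should fail loudly instead of corrupting the snapshot.
            if (idx == null || idx >= rawRow.length)
                throw new IllegalStateException(
                        "Column not found in Hive row: " + describedColumns.get(i));
            projected[i] = rawRow[idx];
        }
        return projected;
    }

    public static void main(String[] args) {
        // Hive table gained an extra column NEW_COL in the middle;
        // tableDesc still describes only ID and NAME.
        String[] raw = {"1", "extra", "apple"};
        String[] row = projectRow(raw,
                Arrays.asList("ID", "NEW_COL", "NAME"),
                Arrays.asList("ID", "NAME"));
        System.out.println(Arrays.toString(row)); // prints [1, apple]
    }
}
```

Because the projection is keyed by name, both appended and inserted columns are handled, which would also address the limitation raised in the comment above.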



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)