Posted to issues@kylin.apache.org by "Shaofeng SHI (JIRA)" <ji...@apache.org> on 2017/08/08 11:22:00 UTC

[jira] [Commented] (KYLIN-2749) Merge may remove old segments without saving merged segment

    [ https://issues.apache.org/jira/browse/KYLIN-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118205#comment-16118205 ] 

Shaofeng SHI commented on KYLIN-2749:
-------------------------------------

I went through the log; there is no error when Kylin saves the metastore change (replacing the old segments with the new merged segment). After the save, the metadata reload shows the segment count decreasing from 26 to 19, which indicates that the merge succeeded.

2017-07-22 09:12:13,124 INFO  [http-bio-7070-exec-11] cube.CubeManager:785 : Reloaded cube cohort_main being CUBE[name=cohort_main] having 26 segments

<after merge, reload metadata>

2017-07-22 09:20:24,608 INFO  [http-bio-7070-exec-23] cube.CubeManager:785 : Reloaded cube cohort_main being CUBE[name=cohort_main] having 19 segments

<about 35 minutes later, the segment count went back up:>
2017-07-22 09:55:23,561 INFO  [http-bio-7070-exec-5] cube.CubeManager:785 : Reloaded cube cohort_main being CUBE[name=cohort_main] having 25 segments

It seems the metastore was restored to an older version, causing Kylin to look for the old HTables.
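
To look for similar reversions in other log files, the check above can be sketched as a small script. This is only a rough sketch, assuming the "Reloaded cube ... having N segments" line format shown above; the function name segment_count_changes is hypothetical and not part of Kylin.

```python
import re

# Matches the CubeManager reload lines quoted above (format assumed from those lines).
RELOAD_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Reloaded cube (?P<cube>\S+) being .* having (?P<n>\d+) segments"
)


def segment_count_changes(lines):
    """Return (timestamp, cube, previous_count, new_count) for every change.

    Hypothetical helper for reading kylin.log; it only compares consecutive
    reload lines per cube and knows nothing about builds or merges.
    """
    last_seen = {}
    changes = []
    for line in lines:
        match = RELOAD_RE.search(line)
        if match is None:
            continue
        cube = match.group("cube")
        count = int(match.group("n"))
        previous = last_seen.get(cube)
        if previous is not None and previous != count:
            changes.append((match.group("ts"), cube, previous, count))
        last_seen[cube] = count
    return changes


# The three log lines quoted in this comment.
sample = [
    "2017-07-22 09:12:13,124 INFO  [http-bio-7070-exec-11] cube.CubeManager:785 : "
    "Reloaded cube cohort_main being CUBE[name=cohort_main] having 26 segments",
    "2017-07-22 09:20:24,608 INFO  [http-bio-7070-exec-23] cube.CubeManager:785 : "
    "Reloaded cube cohort_main being CUBE[name=cohort_main] having 19 segments",
    "2017-07-22 09:55:23,561 INFO  [http-bio-7070-exec-5] cube.CubeManager:785 : "
    "Reloaded cube cohort_main being CUBE[name=cohort_main] having 25 segments",
]

for ts, cube, previous, count in segment_count_changes(sample):
    print(f"{ts} {cube}: {previous} -> {count} segments")
```

An increase in the count that does not line up with a new build around the same time is worth a closer look, since it may mean an older copy of the metadata was loaded.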

Since I don't know your HBase configuration, you may need to investigate this further on your side. Besides, using spot instances is not recommended.

> Merge may remove old segments without saving merged segment
> -----------------------------------------------------------
>
>                 Key: KYLIN-2749
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2749
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v2.0.0
>            Reporter: Alexander Sterligov
>         Attachments: kylin.log.2017-07-22.tar
>
>
> Merge started to work on last 7 segments.
> During the process HBase had a failure because spot instances were shut down in Amazon. Data was not lost, because it is on S3.
> I stopped Kylin and ran hbase hbck --repair. In the repair report I didn't see any information about lost data, just redistribution of regions.
> Then, after Kylin was started, I could not query data from the last 7 segments:
> {quote}
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hbase.TableNotFoundException: Table 'KYLIN_7MMHCHKVVB' was not found, got: KYLIN_7H3WSPX1UJ.
>         at com.google.common.base.Throwables.propagate(Throwables.java:160)
>         at org.apache.kylin.storage.hbase.cube.v2.ExpectedSizeIterator.next(ExpectedSizeIterator.java:67)
>         at org.apache.kylin.storage.hbase.cube.v2.ExpectedSizeIterator.next(ExpectedSizeIterator.java:31)
>         at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
>         at com.google.common.collect.Iterators$6.hasNext(Iterators.java:583)
>         at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator$2.hasNext(SegmentCubeTupleIterator.java:116)
>         at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator.hasNext(SegmentCubeTupleIterator.java:149)
>         at com.google.common.collect.Iterators$6.hasNext(Iterators.java:582)
>         at org.apache.kylin.storage.gtrecord.SequentialCubeTupleIterator.hasNext(SequentialCubeTupleIterator.java:129)
>         at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:67)
>         at Baz$1$1.moveNext(Unknown Source)
>         at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:826)
>         at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>         at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>         at Baz.bind(Unknown Source)
>         at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:331)
>         at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
>         at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:553)
>         at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:544)
>         at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
>         at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>         at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>         at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:607)
>         at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:600)
>         at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
>         at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
>         ... 77 more
> Caused by: org.apache.hadoop.hbase.TableNotFoundException: Table 'KYLIN_7MMHCHKVVB' was not found, got: KYLIN_7H3WSPX1UJ.
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1310)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1189)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1173)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1130)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getRegionLocation(ConnectionManager.java:965)
>         at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:83)
>         at org.apache.hadoop.hbase.client.HTable.getRegionLocation(HTable.java:505)
>         at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:721)
>         at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:691)
>         at org.apache.hadoop.hbase.client.HTable.getStartKeysInRange(HTable.java:1796)
>         at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1751)
>         at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1.run(CubeHBaseEndpointRPC.java:182)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         ... 1 more
> {quote}
> HBase really doesn't contain tables for the last 7 segments, and I didn't run any cleanup jobs.
> It looks like the merge removed the old segment tables before the merged segment was saved.
> I'm going to continue to investigate this problem and will post more details next week.


