Posted to issues@kylin.apache.org by "Alexander Sterligov (JIRA)" <ji...@apache.org> on 2017/07/23 07:30:01 UTC

[jira] [Comment Edited] (KYLIN-2749) Merge may remove old segments without saving merged segment

    [ https://issues.apache.org/jira/browse/KYLIN-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097540#comment-16097540 ] 

Alexander Sterligov edited comment on KYLIN-2749 at 7/23/17 7:29 AM:
---------------------------------------------------------------------

The merge job had finished successfully and the new merged segment had been built, but cube.json was old and still contained references to the old, removed segments. I don't see any errors in the log at that moment, but HBase had a failure at that time.

What if an exception was ignored in org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl during Table.close, and the internal HBase buffers were not flushed because of the failure?

I mean IOUtils.closeQuietly(table);
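
Just to illustrate what I mean (an untested sketch, not a patch): if checkAndPutResourceImpl closed the table with try-with-resources instead of IOUtils.closeQuietly, an IOException thrown by Table.close() would propagate to the caller instead of being silently swallowed, so the metadata update could not appear to succeed while the write was actually lost:
{quote}
    @Override
    protected long checkAndPutResourceImpl(String resPath, byte[] content, long oldTS, long newTS) throws IOException, IllegalStateException {
        // try-with-resources: a failure in Table.close() now surfaces as an IOException
        // instead of being dropped by IOUtils.closeQuietly(table)
        try (Table table = getConnection().getTable(TableName.valueOf(getAllInOneTableName()))) {
            byte[] row = Bytes.toBytes(resPath);
            byte[] bOldTS = oldTS == 0 ? null : Bytes.toBytes(oldTS);
            Put put = buildPut(resPath, newTS, row, content, table);

            boolean ok = table.checkAndPut(row, B_FAMILY, B_COLUMN_TS, bOldTS, put);
            logger.trace("Update row " + resPath + " from oldTs: " + oldTS + ", to newTs: " + newTS + ", operation result: " + ok);
            if (!ok) {
                long real = getResourceTimestampImpl(resPath);
                throw new IllegalStateException("Overwriting conflict " + resPath + ", expect old TS " + oldTS + ", but it is " + real);
            }

            return newTS;
        }
    }
{quote}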


was (Author: sterligovak):
The merge job had finished successfully and the new merged segment had been built, but cube.json was old and still contained references to the old, removed segments. I don't see any errors in the log at that moment, but HBase had a failure at that time.

What if an exception was ignored in org.apache.kylin.storage.hbase.HBaseResourceStore during Table.close, and the internal HBase buffers were not flushed because of the failure?
{quote}
    @Override
    protected long checkAndPutResourceImpl(String resPath, byte[] content, long oldTS, long newTS) throws IOException, IllegalStateException {
        Table table = getConnection().getTable(TableName.valueOf(getAllInOneTableName()));
        try {
            byte[] row = Bytes.toBytes(resPath);
            byte[] bOldTS = oldTS == 0 ? null : Bytes.toBytes(oldTS);
            Put put = buildPut(resPath, newTS, row, content, table);

            boolean ok = table.checkAndPut(row, B_FAMILY, B_COLUMN_TS, bOldTS, put);
            logger.trace("Update row " + resPath + " from oldTs: " + oldTS + ", to newTs: " + newTS + ", operation result: " + ok);
            if (!ok) {
                long real = getResourceTimestampImpl(resPath);
                throw new IllegalStateException("Overwriting conflict " + resPath + ", expect old TS " + oldTS + ", but it is " + real);
            }

            return newTS;
        } finally {
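            // NOTE: closeQuietly swallows any IOException thrown by Table.close()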
            IOUtils.closeQuietly(table);
        }
    }
{quote}

> Merge may remove old segments without saving merged segment
> -----------------------------------------------------------
>
>                 Key: KYLIN-2749
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2749
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v2.0.0
>            Reporter: Alexander Sterligov
>
> Merge started to work on the last 7 segments.
> During the process HBase had a failure because of a spot-instance shutdown in Amazon. Data was not lost, because it is stored in S3.
> I stopped Kylin and ran hbase hbck --repair. The repair report didn't show any information about lost data, just redistribution of regions.
> Then, after Kylin was started, I could not query data from the last 7 segments:
> {quote}
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hbase.TableNotFoundException: Table 'KYLIN_7MMHCHKVVB' was not found, got: KYLIN_7H3WSPX1UJ.
>         at com.google.common.base.Throwables.propagate(Throwables.java:160)
>         at org.apache.kylin.storage.hbase.cube.v2.ExpectedSizeIterator.next(ExpectedSizeIterator.java:67)
>         at org.apache.kylin.storage.hbase.cube.v2.ExpectedSizeIterator.next(ExpectedSizeIterator.java:31)
>         at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
>         at com.google.common.collect.Iterators$6.hasNext(Iterators.java:583)
>         at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator$2.hasNext(SegmentCubeTupleIterator.java:116)
>         at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator.hasNext(SegmentCubeTupleIterator.java:149)
>         at com.google.common.collect.Iterators$6.hasNext(Iterators.java:582)
>         at org.apache.kylin.storage.gtrecord.SequentialCubeTupleIterator.hasNext(SequentialCubeTupleIterator.java:129)
>         at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:67)
>         at Baz$1$1.moveNext(Unknown Source)
>         at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:826)
>         at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>         at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>         at Baz.bind(Unknown Source)
>         at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:331)
>         at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
>         at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:553)
>         at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:544)
>         at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
>         at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>         at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>         at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:607)
>         at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:600)
>         at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
>         at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
>         ... 77 more
> Caused by: org.apache.hadoop.hbase.TableNotFoundException: Table 'KYLIN_7MMHCHKVVB' was not found, got: KYLIN_7H3WSPX1UJ.
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1310)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1189)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1173)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1130)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getRegionLocation(ConnectionManager.java:965)
>         at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:83)
>         at org.apache.hadoop.hbase.client.HTable.getRegionLocation(HTable.java:505)
>         at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:721)
>         at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:691)
>         at org.apache.hadoop.hbase.client.HTable.getStartKeysInRange(HTable.java:1796)
>         at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1751)
>         at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1.run(CubeHBaseEndpointRPC.java:182)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         ... 1 more
> {quote}
> HBase really doesn't contain the tables for the last 7 segments, and I didn't run any cleanup jobs.
> It looks like the merge removed the old segment tables before the merged segment was saved.
> I'm going to continue to investigate this problem and will post more details next week.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)