Posted to issues@kylin.apache.org by "Alexander Sterligov (JIRA)" <ji...@apache.org> on 2017/07/23 07:30:01 UTC
[jira] [Comment Edited] (KYLIN-2749) Merge may remove old segments without saving merged segment
[ https://issues.apache.org/jira/browse/KYLIN-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097540#comment-16097540 ]
Alexander Sterligov edited comment on KYLIN-2749 at 7/23/17 7:29 AM:
---------------------------------------------------------------------
The merge job had finished successfully and the new merged segment was created, but cube.json was stale and still contained references to the old, removed segments. I don't see any errors in the log at that moment, but HBase had a failure at that time.
What if an exception was ignored in org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl during Table.close, and the internal HBase buffers were not flushed because of the failure?
I mean the IOUtils.closeQuietly(table); call.
was (Author: sterligovak):
The merge job had finished successfully and the new merged segment was created, but cube.json was stale and still contained references to the old, removed segments. I don't see any errors in the log at that moment, but HBase had a failure at that time.
What if an exception was ignored in org.apache.kylin.storage.hbase.HBaseResourceStore during Table.close, and the internal HBase buffers were not flushed because of the failure?
{quote}
@Override
protected long checkAndPutResourceImpl(String resPath, byte[] content, long oldTS, long newTS) throws IOException, IllegalStateException {
    Table table = getConnection().getTable(TableName.valueOf(getAllInOneTableName()));
    try {
        byte[] row = Bytes.toBytes(resPath);
        byte[] bOldTS = oldTS == 0 ? null : Bytes.toBytes(oldTS);
        Put put = buildPut(resPath, newTS, row, content, table);
        boolean ok = table.checkAndPut(row, B_FAMILY, B_COLUMN_TS, bOldTS, put);
        logger.trace("Update row " + resPath + " from oldTs: " + oldTS + ", to newTs: " + newTS + ", operation result: " + ok);
        if (!ok) {
            long real = getResourceTimestampImpl(resPath);
            throw new IllegalStateException("Overwriting conflict " + resPath + ", expect old TS " + oldTS + ", but it is " + real);
        }
        return newTS;
    } finally {
        IOUtils.closeQuietly(table);
    }
}
{quote}
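The concern above about IOUtils.closeQuietly can be illustrated with a small self-contained sketch. This is not Kylin or HBase code; BufferedTable and the closeQuietly helper are hypothetical stand-ins that mimic the suspected failure mode: if close() is where buffered work is flushed, a closeQuietly-style call silently discards the flush failure, while try-with-resources surfaces it to the caller.

```java
import java.io.Closeable;
import java.io.IOException;

// Sketch (hypothetical, not Kylin/HBase code): shows how a closeQuietly-style
// helper swallows a failure in close(), while try-with-resources propagates it.
public class CloseQuietlyDemo {

    // Stand-in for a table whose buffered writes are flushed on close().
    static class BufferedTable implements Closeable {
        boolean flushed = false;
        final boolean failOnClose;

        BufferedTable(boolean failOnClose) {
            this.failOnClose = failOnClose;
        }

        @Override
        public void close() throws IOException {
            if (failOnClose) {
                // The buffered write is lost; the caller should hear about it.
                throw new IOException("flush failed during close");
            }
            flushed = true;
        }
    }

    // Mimics IOUtils.closeQuietly: any exception from close() is discarded.
    static void closeQuietly(Closeable c) {
        try {
            if (c != null) c.close();
        } catch (IOException ignored) {
            // silently dropped -- the write may never have reached the server
        }
    }

    public static void main(String[] args) {
        // 1) closeQuietly: the failure is invisible, data silently unflushed.
        BufferedTable quiet = new BufferedTable(true);
        closeQuietly(quiet);
        System.out.println("after closeQuietly, flushed=" + quiet.flushed);

        // 2) try-with-resources: the same failure surfaces to the caller.
        boolean surfaced = false;
        try (BufferedTable loud = new BufferedTable(true)) {
            // work with the table...
        } catch (IOException e) {
            surfaced = true;
        }
        System.out.println("try-with-resources surfaced failure=" + surfaced);
    }
}
```

Whether Table.checkAndPut actually buffers anything client-side depends on the HBase version; the sketch only demonstrates why swallowing close() exceptions makes such a failure undetectable.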
> Merge may remove old segments without saving merged segment
> -----------------------------------------------------------
>
> Key: KYLIN-2749
> URL: https://issues.apache.org/jira/browse/KYLIN-2749
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v2.0.0
> Reporter: Alexander Sterligov
>
> A merge started on the last 7 segments.
> During the process HBase had a failure because of a spot-instance shutdown in Amazon. No data was lost, because it is stored on S3.
> I stopped Kylin and ran hbase hbck --repair. The repair report showed no lost data, only a redistribution of regions.
> Then, after Kylin was restarted, I could not query data from the last 7 segments:
> {quote}
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hbase.TableNotFoundException: Table 'KYLIN_7MMHCHKVVB' was not found, got: KYLIN_7H3WSPX1UJ.
> at com.google.common.base.Throwables.propagate(Throwables.java:160)
> at org.apache.kylin.storage.hbase.cube.v2.ExpectedSizeIterator.next(ExpectedSizeIterator.java:67)
> at org.apache.kylin.storage.hbase.cube.v2.ExpectedSizeIterator.next(ExpectedSizeIterator.java:31)
> at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
> at com.google.common.collect.Iterators$6.hasNext(Iterators.java:583)
> at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator$2.hasNext(SegmentCubeTupleIterator.java:116)
> at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator.hasNext(SegmentCubeTupleIterator.java:149)
> at com.google.common.collect.Iterators$6.hasNext(Iterators.java:582)
> at org.apache.kylin.storage.gtrecord.SequentialCubeTupleIterator.hasNext(SequentialCubeTupleIterator.java:129)
> at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:67)
> at Baz$1$1.moveNext(Unknown Source)
> at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:826)
> at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
> at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
> at Baz.bind(Unknown Source)
> at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:331)
> at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
> at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:553)
> at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:544)
> at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
> at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
> at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
> at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:607)
> at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:600)
> at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
> at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
> ... 77 more
> Caused by: org.apache.hadoop.hbase.TableNotFoundException: Table 'KYLIN_7MMHCHKVVB' was not found, got: KYLIN_7H3WSPX1UJ.
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1310)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1189)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1173)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1130)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getRegionLocation(ConnectionManager.java:965)
> at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:83)
> at org.apache.hadoop.hbase.client.HTable.getRegionLocation(HTable.java:505)
> at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:721)
> at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:691)
> at org.apache.hadoop.hbase.client.HTable.getStartKeysInRange(HTable.java:1796)
> at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1751)
> at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1.run(CubeHBaseEndpointRPC.java:182)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> ... 1 more
> {quote}
> HBase really does not contain the tables for the last 7 segments, and I did not run any cleanup jobs.
> It looks like the merge removed the old segment tables before the merged segment was saved.
> I'm going to continue investigating this problem and will post more details next week.
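The suspected ordering bug (old tables dropped before the merged segment's metadata was durably saved) can be sketched generically. This is a hypothetical illustration, not actual Kylin merge code; MetadataStore, commitMerge, and dropTable are made-up names. The point is that the destructive step must come only after the metadata save is known to have succeeded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a safe merge-commit ordering: persist the new
// segment list first, and drop the old segments' storage only on success.
// If the metadata save fails (e.g. an unflushed HBase write), the old
// tables are left intact and the cube remains queryable.
public class MergeCommitSketch {

    static class MetadataStore {
        boolean failSave;
        List<String> segments = new ArrayList<>();

        void save(List<String> newSegments) {
            if (failSave) throw new RuntimeException("metadata save failed");
            segments = new ArrayList<>(newSegments);
        }
    }

    static List<String> dropped = new ArrayList<>();

    static void dropTable(String seg) { dropped.add(seg); }

    static void commitMerge(MetadataStore store, List<String> oldSegs, String merged) {
        List<String> newSegs = new ArrayList<>();
        newSegs.add(merged);
        store.save(newSegs);          // may throw -- nothing dropped yet
        for (String seg : oldSegs) {  // reached only after a successful save
            dropTable(seg);
        }
    }

    public static void main(String[] args) {
        MetadataStore store = new MetadataStore();
        store.failSave = true;        // simulate the HBase failure
        List<String> old = List.of("SEG_1", "SEG_2");
        try {
            commitMerge(store, old, "SEG_MERGED");
        } catch (RuntimeException e) {
            // save failed; old segment tables must survive
        }
        System.out.println("dropped=" + dropped);
    }
}
```

With the suspected buggy ordering (drop first, save second), the same save failure would leave the cube pointing at tables that no longer exist, matching the TableNotFoundException seen above.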
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)