Posted to issues@ozone.apache.org by "Sadanand Shenoy (Jira)" <ji...@apache.org> on 2023/02/07 05:27:00 UTC

[jira] [Updated] (HDDS-7913) 'Graph traversal level exceeded allowed maximum' when the difference b/w snapshots is more than 3.

     [ https://issues.apache.org/jira/browse/HDDS-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sadanand Shenoy updated HDDS-7913:
----------------------------------
    Description: 
The error below occurs when calling snapdiff between snapshots that are more than 2-3 snapshots apart.
{code:java}
2023-02-07 10:50:07,464 [IPC Server handler 13 on default port 60192] ERROR rocksdiff.RocksDBCheckpointDiffer (RocksDBCheckpointDiffer.java:internalGetSSTDiffList(899)) - Graph traversal level exceeded allowed maximum (1000000). This could be due to invalid input generating a loop in the traversal path. Same SSTs found so far: [000053], different SSTs: []
2023-02-07 10:50:07,465 [IPC Server handler 13 on default port 60192] WARN  ipc.Server (Server.java:logException(3035)) - IPC Server handler 13 on default port 60192, call Call#102 Retry#159 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:60273
java.lang.RuntimeException: Graph traversal level exceeded allowed maximum (1000000). This could be due to invalid input generating a loop in the traversal path. Same SSTs found so far: [000053], different SSTs: []
    at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.internalGetSSTDiffList(RocksDBCheckpointDiffer.java:904)
    at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffList(RocksDBCheckpointDiffer.java:788)
    at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffListWithFullPath(RocksDBCheckpointDiffer.java:756)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFiles(SnapshotDiffManager.java:231)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getSnapshotDiffReport(SnapshotDiffManager.java:134)
    at org.apache.hadoop.ozone.om.OmSnapshotManager.getSnapshotDiffReport(OmSnapshotManager.java:250)
    at org.apache.hadoop.ozone.om.OzoneManager.snapshotDiff(OzoneManager.java:4325)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.snapshotDiff(OzoneManagerRequestHandler.java:1216)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleReadRequest(OzoneManagerRequestHandler.java:298)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:223)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:177)
    at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:147)
    at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.processCall(ProtobufRpcEngine.java:465)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:578)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1043)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:971)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2976)
{code}
Unit test to reproduce the issue, in *TestOmSnapshot.java*:
{code:java}
@Test
public void testSnapDiffWithMultipleSSTs()
    throws IOException, InterruptedException, TimeoutException {
  String volumeName1 = "vol-" + RandomStringUtils.randomNumeric(5);
  String bucketName1 = "buck1";
  String bucketName2 = "bucketz";
  store.createVolume(volumeName1);
  OzoneVolume volume1 = store.getVolume(volumeName1);
  volume1.createBucket(bucketName1);
  volume1.createBucket(bucketName2);
  OzoneBucket bucket1 = volume1.getBucket(bucketName1);
  OzoneBucket bucket2 = volume1.getBucket(bucketName2);
  String keyPrefix = "key-";
  // All four snapshots are taken on bucket1; the keys written before
  // them go to bucket1, bucket2, bucket2 and bucket1 in turn.
  createFileKey(bucket1, keyPrefix);
  String snap1 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap1);
  createFileKey(bucket2, keyPrefix);
  String snap2 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap2);
  createFileKey(bucket2, keyPrefix);
  String snap3 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap3);
  createFileKey(bucket1, keyPrefix);
  String snap4 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap4);
  // Diffing the first snapshot against the fourth hits the error above.
  SnapshotDiffReport diff1 =
      store.snapshotDiff(volumeName1, bucketName1, snap1, snap4);
}
{code}

  was:
Getting the below error when calling snapdiff b/w snapshots which are more than 2-3 snapshots apart.
{code:java}
2023-02-07 10:50:07,464 [IPC Server handler 13 on default port 60192] ERROR rocksdiff.RocksDBCheckpointDiffer (RocksDBCheckpointDiffer.java:internalGetSSTDiffList(899)) - Graph traversal level exceeded allowed maximum (1000000). This could be due to invalid input generating a loop in the traversal path. Same SSTs found so far: [000053], different SSTs: []
2023-02-07 10:50:07,465 [IPC Server handler 13 on default port 60192] WARN  ipc.Server (Server.java:logException(3035)) - IPC Server handler 13 on default port 60192, call Call#102 Retry#159 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:60273
java.lang.RuntimeException: Graph traversal level exceeded allowed maximum (1000000). This could be due to invalid input generating a loop in the traversal path. Same SSTs found so far: [000053], different SSTs: []
    at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.internalGetSSTDiffList(RocksDBCheckpointDiffer.java:904)
    at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffList(RocksDBCheckpointDiffer.java:788)
    at org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.getSSTDiffListWithFullPath(RocksDBCheckpointDiffer.java:756)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFiles(SnapshotDiffManager.java:231)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getSnapshotDiffReport(SnapshotDiffManager.java:134)
    at org.apache.hadoop.ozone.om.OmSnapshotManager.getSnapshotDiffReport(OmSnapshotManager.java:250)
    at org.apache.hadoop.ozone.om.OzoneManager.snapshotDiff(OzoneManager.java:4325)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.snapshotDiff(OzoneManagerRequestHandler.java:1216)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleReadRequest(OzoneManagerRequestHandler.java:298)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:223)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:177)
    at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:147)
    at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.processCall(ProtobufRpcEngine.java:465)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:578)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1043)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:971)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2976)
{code}
Unit test to repro 
{code:java}
@Test
public void testSnapDiffWithMultipleSSTs()
    throws IOException, InterruptedException, TimeoutException {
  String volumeName1 = "vol-" + RandomStringUtils.randomNumeric(5);
  String bucketName1 = "buck1";
  String bucketName2 = "bucketz";
  store.createVolume(volumeName1);
  OzoneVolume volume1 = store.getVolume(volumeName1);
  volume1.createBucket(bucketName1);
  volume1.createBucket(bucketName2);
  OzoneBucket bucket1 = volume1.getBucket(bucketName1);
  OzoneBucket bucket2 = volume1.getBucket(bucketName2);
  String keyPrefix = "key-";
  createFileKey(bucket1, keyPrefix);
  String snap1 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap1); // 1.sst
  createFileKey(bucket2, keyPrefix);
  String snap2 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap2); // 1.sst
  createFileKey(bucket2, keyPrefix);
  String snap3 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap3); // 1.sst
  createFileKey(bucket1,keyPrefix);
  String snap4 = "snap" + RandomStringUtils.randomNumeric(5);
  createSnapshot(volumeName1, bucketName1, snap4); // 1.sst 2.sst 3.sst 4.sst
  SnapshotDiffReport diff1 =
      store.snapshotDiff(volumeName1, bucketName1, snap1, snap4);
  Assert.assertEquals(2, diff1.getDiffList().size());
}
 {code}
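
For readers unfamiliar with this kind of guard: the error message says a level cap on a graph walk was hit and suggests a loop in the traversal path as one possible cause. The sketch below is only an illustration of that general idea, not the RocksDBCheckpointDiffer code. The class name, method name, string node labels and the skipVisited flag are all hypothetical and chosen for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a level-capped graph walk; not Ozone code. */
final class BoundedTraversalSketch {

  private BoundedTraversalSketch() { }

  /**
   * Walks the graph level by level from start and returns every node seen.
   * Throws if there are still nodes to expand after maxLevels levels.
   * When skipVisited is false, already-seen nodes are expanded again, so a
   * cycle in the input keeps the walk going until the cap is hit.
   */
  static Set<String> reachable(Map<String, List<String>> successors,
      String start, int maxLevels, boolean skipVisited) {
    Set<String> seen = new HashSet<>();
    seen.add(start);
    List<String> currentLevel = new ArrayList<>();
    currentLevel.add(start);
    int level = 0;
    while (!currentLevel.isEmpty()) {
      if (level >= maxLevels) {
        throw new IllegalStateException(
            "Traversal level exceeded allowed maximum (" + maxLevels + ")");
      }
      List<String> nextLevel = new ArrayList<>();
      for (String node : currentLevel) {
        List<String> children =
            successors.getOrDefault(node, Collections.emptyList());
        for (String next : children) {
          boolean firstVisit = seen.add(next);
          if (firstVisit || !skipVisited) {
            nextLevel.add(next);
          }
        }
      }
      currentLevel = nextLevel;
      level++;
    }
    return seen;
  }
}
```

In this sketch, a two-node cycle ends normally when visited nodes are skipped and runs into the cap when they are not. Whether the failure reported here comes from a real cycle or from some other input condition is not established by this example.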


> 'Graph traversal level exceeded allowed maximum' when the difference b/w snapshots is more than 3.
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-7913
>                 URL: https://issues.apache.org/jira/browse/HDDS-7913
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sadanand Shenoy
>            Priority: Major


