Posted to issues@ozone.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2022/02/03 17:00:00 UTC

[jira] [Created] (HDDS-6258) EC: Read with stopped but not dead nodes gives IllegalStateException rather than InsufficientNodesException

Stephen O'Donnell created HDDS-6258:
---------------------------------------

             Summary: EC: Read with stopped but not dead nodes gives IllegalStateException rather than InsufficientNodesException
                 Key: HDDS-6258
                 URL: https://issues.apache.org/jira/browse/HDDS-6258
             Project: Apache Ozone
          Issue Type: Sub-task
            Reporter: Stephen O'Donnell


When attempting to read a key smaller than one chunk with 3 of the 5 nodes stopped (whether or not they have been marked stale yet), the read hangs for some time and then fails with:

{code}
$ ozone sh key get /vol1/bucket/ec1 /tmp/3_down
java.lang.IllegalStateException
    at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:33)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.selectParityIndexes(ECBlockReconstructedStripeInputStream.java:432)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.init(ECBlockReconstructedStripeInputStream.java:179)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readStripe(ECBlockReconstructedStripeInputStream.java:285)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.readStripe(ECBlockReconstructedInputStream.java:192)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.selectNextBuffer(ECBlockReconstructedInputStream.java:109)
    at org.apache.hadoop.ozone.client.io.ECBlockReconstructedInputStream.read(ECBlockReconstructedInputStream.java:83)
    at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:156)
    at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:171)
    at org.apache.hadoop.ozone.client.io.ECBlockInputStreamProxy.read(ECBlockInputStreamProxy.java:141)
    at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:57)
    at org.apache.hadoop.ozone.client.io.KeyInputStream.readWithStrategy(KeyInputStream.java:268)
    at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:235)
    at org.apache.hadoop.ozone.client.io.OzoneInputStream.read(OzoneInputStream.java:56)
    at java.base/java.io.InputStream.read(InputStream.java:205)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:94)
    at org.apache.hadoop.ozone.shell.keys.GetKeyHandler.execute(GetKeyHandler.java:88)
    at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:98)
    at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:44)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
    at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
    at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
    at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:96)
    at org.apache.hadoop.ozone.shell.OzoneShell.lambda$execute$0(OzoneShell.java:55)
    at org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:159)
    at org.apache.hadoop.ozone.shell.OzoneShell.execute(OzoneShell.java:53)
    at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:87)
    at org.apache.hadoop.ozone.shell.OzoneShell.main(OzoneShell.java:47)
{code}

After the nodes are marked dead and their replicas are no longer present in SCM, we get the expected error immediately:

{code}
$ ozone sh key get /vol1/bucket/ec1 /tmp/3_down_dead
There are insufficient datanodes to read the EC block
{code}

A read against stopped but not yet dead nodes should fail with the same descriptive error rather than a bare IllegalStateException.
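
As a rough illustration of the kind of change this asks for, here is a minimal, self-contained Java sketch. It is not the actual Ozone code: the class ParitySelectionSketch, the zero-based replica indexing, and the InsufficientNodesException type (named after the issue title) are all assumptions made for the example. It only shows the idea of replacing a bare precondition assertion with an explicit check that throws a descriptive IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the actual Ozone implementation.
class ParitySelectionSketch {

  // Hypothetical stand-in for the exception type named in the issue title.
  static class InsufficientNodesException extends IOException {
    InsufficientNodesException(String message) {
      super(message);
    }
  }

  /**
   * Picks one available parity index for each unavailable data index.
   * Replica indexes are zero-based here (an assumption of this sketch):
   * data indexes come first, followed by parity indexes.
   */
  static List<Integer> selectParityIndexes(int dataCount, int parityCount,
      Set<Integer> unavailable) throws InsufficientNodesException {
    // Count the data replicas that cannot be read directly.
    int missingData = 0;
    for (int i = 0; i < dataCount; i++) {
      if (unavailable.contains(i)) {
        missingData++;
      }
    }
    // Collect the parity replicas that are still reachable.
    List<Integer> availableParity = new ArrayList<>();
    for (int i = dataCount; i < dataCount + parityCount; i++) {
      if (!unavailable.contains(i)) {
        availableParity.add(i);
      }
    }
    // Explicit check with a descriptive checked exception, in place of a
    // bare assertion that would surface as an IllegalStateException.
    if (availableParity.size() < missingData) {
      throw new InsufficientNodesException(
          "There are insufficient datanodes to read the EC block");
    }
    return new ArrayList<>(availableParity.subList(0, missingData));
  }

  public static void main(String[] args) {
    try {
      // 3 data + 2 parity, with replica indexes 0, 1 and 3 unreachable.
      selectParityIndexes(3, 2, Set.of(0, 1, 3));
    } catch (InsufficientNodesException e) {
      System.err.println(e.getMessage());
    }
  }
}
```

With a checked exception like this, the shell could report the same one-line message in both the stopped and the dead cases instead of printing a stack trace. The sketch does not address the delay before the failure is reported.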


