Posted to issues@ozone.apache.org by "Sammi Chen (Jira)" <ji...@apache.org> on 2021/08/09 07:51:00 UTC

[jira] [Updated] (HDDS-5548) Keep downloaded container .gz.tar file for debug purpose

     [ https://issues.apache.org/jira/browse/HDDS-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammi Chen updated HDDS-5548:
-----------------------------
    Summary: Keep downloaded container .gz.tar file for debug purpose  (was: Keep downloaded .gz.tar container file for debug purpose)

> Keep downloaded container .gz.tar file for debug purpose
> --------------------------------------------------------
>
>                 Key: HDDS-5548
>                 URL: https://issues.apache.org/jira/browse/HDDS-5548
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>
> Production logs show many container import failures, for example:
> 2021-08-03 21:48:12,311 [ContainerReplicationThread-9] INFO org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator: Starting replication of container 66315 from [4e613295-6d55-4bf9-bdc9-1668fd24741c{ip: 11.61.44.244, host: 11.61.44.244, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9858, RATIS_SERVER=9858, STANDALONE=9859], networkLocation: /rack582702, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, 7694e208-c887-4d8e-b249-28a176b4d7b7{ip: 11.61.45.38, host: 11.61.45.38, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9858, RATIS_SERVER=9858, STANDALONE=9859], networkLocation: /rack582788, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}]
> 2021-08-03 21:48:17,462 [grpc-default-executor-12557] INFO org.apache.hadoop.ozone.container.replication.GrpcReplicationClient: Container 66315 is downloaded to /data/ozoneadmin/ozoneenv/ozone-temp/container-66315.tar.gz
> 2021-08-03 21:48:17,462 [ContainerReplicationThread-9] INFO org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator: Container 66315 is downloaded with size 6154503, starting to import.
> 2021-08-03 21:48:17,582 [ContainerReplicationThread-9] ERROR org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator: Container 66315 replication was unsuccessful.
> java.io.IOException: Container descriptor is missing from the container archive.
>         at org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.unpackContainerDescriptor(TarContainerPacker.java:190)
>         at org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.importContainer(DownloadAndImportReplicator.java:76)
>         at org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:125)
>         at org.apache.hadoop.ozone.container.replication.MeasuredReplicator.replicate(MeasuredReplicator.java:69)
>         at org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:139)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2021-08-03 21:48:17,582 [ContainerReplicationThread-9] ERROR org.apache.hadoop.ozone.container.replication.ReplicationSupervisor: Container 66315 can't be downloaded from any of the datanodes.
> In the above case, container 66315 on the source datanode actually has the container descriptor on disk, so the root cause of this error is unclear.
> This task is to keep the downloaded tar file for investigation, at the cost of storage space.
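
The idea described above could be sketched roughly as follows. This is only an illustration, not the actual Ozone change: the class FailedArchiveKeeper, the Importer interface, and the keepDir parameter are hypothetical names. It also assumes the archive is kept only when the import fails, which the issue text does not specify. On success the temporary archive is deleted; on failure it is moved into keepDir so it can be inspected later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of Ozone: keeps a downloaded archive when its import fails. */
public final class FailedArchiveKeeper {

  /** Hypothetical stand-in for the import step, which may throw an IOException. */
  @FunctionalInterface
  public interface Importer {
    void importArchive(Path archive) throws IOException;
  }

  private FailedArchiveKeeper() {
  }

  /**
   * Runs the import. On success the downloaded archive is deleted.
   * On failure the archive is moved into keepDir and the original
   * exception is rethrown to the caller.
   */
  public static void importAndCleanUp(Path archive, Path keepDir, Importer importer)
      throws IOException {
    try {
      importer.importArchive(archive);
    } catch (IOException importError) {
      try {
        Files.createDirectories(keepDir);
        // Same file name in keepDir; a later failure for the same
        // container overwrites the earlier copy.
        Files.move(archive, keepDir.resolve(archive.getFileName()),
            StandardCopyOption.REPLACE_EXISTING);
      } catch (IOException moveError) {
        // Do not hide the import failure if the archive cannot be kept.
        importError.addSuppressed(moveError);
      }
      throw importError;
    }
    // Import succeeded, so the temporary archive is no longer needed.
    Files.deleteIfExists(archive);
  }
}
```

Because every kept archive consumes disk space, a real deployment would likely need a configuration switch or a cleanup policy for the keep directory; that part is left out of this sketch.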


