Posted to issues@ozone.apache.org by "Nandakumar (Jira)" <ji...@apache.org> on 2023/01/23 16:19:00 UTC

[jira] [Assigned] (HDDS-7195) EC: Additional checksum validations to make sure data integrity to avoid reconstruction time corruptions.

     [ https://issues.apache.org/jira/browse/HDDS-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandakumar reassigned HDDS-7195:
--------------------------------

    Assignee: Nandakumar

> EC: Additional checksum validations to make sure data integrity to avoid reconstruction time corruptions.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-7195
>                 URL: https://issues.apache.org/jira/browse/HDDS-7195
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Uma Maheswara Rao G
>            Assignee: Nandakumar
>            Priority: Major
>
> For EC blocks, each chunk has its own standalone checksum, just as in regular writes. The user data is written as a stripe across the container group.
> After calculating parity and writing the parity bytes, we know the checksums for all chunks in the stripe, including parity. Together these form the stripe checksum: a simple concatenation of the checksum bytes of each chunk in the stripe, in replicaIndex order. The stripe checksum is written redundantly to several replicas in the container group, namely replica index 1 and all parity replicas. In the case of EC-6-3, this means 4 of the 9 replicas hold the stripe checksum. For a container to be recoverable, at least 6 replicas must be available, so at most 3 can be missing. Since 4 replicas hold the stripe checksum, at least one copy of it is always available when the container is recoverable.
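
The layout and placement described above can be sketched roughly as follows. This is an illustration only, not Ozone code: the function names are hypothetical, and CRC32 via Python's zlib stands in for whatever checksum type the chunks actually use.

```python
import zlib
from typing import Dict, List


def chunk_checksum(chunk: bytes) -> bytes:
    # Assumption: CRC32 as 4 big-endian bytes stands in for the real chunk checksum.
    return zlib.crc32(chunk).to_bytes(4, "big")


def stripe_checksum(chunks_by_replica_index: Dict[int, bytes]) -> bytes:
    # Concatenate the per-chunk checksums in ascending replicaIndex order.
    return b"".join(
        chunk_checksum(chunks_by_replica_index[i])
        for i in sorted(chunks_by_replica_index)
    )


def stripe_checksum_holders(data: int, parity: int) -> List[int]:
    # Replica index 1 plus every parity replica (indexes data+1 .. data+parity).
    return [1] + list(range(data + 1, data + parity + 1))


def holder_always_survives(data: int, parity: int) -> bool:
    # A recoverable container has lost at most `parity` replicas, so having
    # more than `parity` holders guarantees at least one holder remains.
    return len(stripe_checksum_holders(data, parity)) > parity
```

For EC-6-3, `stripe_checksum_holders(6, 3)` yields replica indexes 1, 7, 8 and 9, which is the "4 out of 9" from the description, and `holder_always_survives(6, 3)` is true because 4 holders exceed the 3 replicas that may be lost.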
> To ensure the integrity of each recovered chunk in a stripe, after recovering the data and calculating its checksum, we should compare the newly calculated checksum for the recovered chunk against the corresponding entry in the stored stripe checksum. If the new checksum does not match, the reconstruction should fail. This avoids the case where a chunk is recovered with an error, due to a bug or wire corruption, and the error goes unnoticed.
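
A minimal sketch of the proposed check, under stated assumptions: every chunk checksum has the same fixed length (4 bytes here), CRC32 via zlib stands in for the real checksum type, and all names (`verify_recovered_chunk`, `ReconstructionChecksumError`) are hypothetical rather than taken from the Ozone code base.

```python
import zlib

CHECKSUM_LEN = 4  # Assumption: fixed-size per-chunk checksum (CRC32 here).


class ReconstructionChecksumError(Exception):
    """Raised when a recovered chunk does not match the stored stripe checksum."""


def chunk_checksum(chunk: bytes) -> bytes:
    # Assumption: CRC32 as 4 big-endian bytes stands in for the real chunk checksum.
    return zlib.crc32(chunk).to_bytes(CHECKSUM_LEN, "big")


def expected_checksum(stripe_checksum: bytes, replica_index: int) -> bytes:
    # replicaIndex is 1-based; the stripe checksum is a concatenation in that order.
    start = (replica_index - 1) * CHECKSUM_LEN
    expected = stripe_checksum[start:start + CHECKSUM_LEN]
    if replica_index < 1 or len(expected) != CHECKSUM_LEN:
        raise ValueError(f"no checksum entry for replica index {replica_index}")
    return expected


def verify_recovered_chunk(stripe_checksum: bytes, replica_index: int,
                           recovered: bytes) -> None:
    # Fail the reconstruction if the recomputed checksum differs from the stored one.
    expected = expected_checksum(stripe_checksum, replica_index)
    actual = chunk_checksum(recovered)
    if actual != expected:
        raise ReconstructionChecksumError(
            f"replica index {replica_index}: expected {expected.hex()}, "
            f"got {actual.hex()}"
        )
```

The reconstruction path would call `verify_recovered_chunk` once per recovered chunk before committing it, so a mismatch aborts the reconstruction instead of silently writing a corrupt replica.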



