You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/06/29 22:55:44 UTC

[GitHub] [ozone] jojochuang opened a new pull request, #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

jojochuang opened a new pull request, #3570:
URL: https://github.com/apache/ozone/pull/3570

   ## What changes were proposed in this pull request?
   
   Support both CRC32 an CRC32C for Ozone to make it possible to replicate files between HDFS and Ozone.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-6127
   
   ## How was this patch tested?
   Unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3570:
URL: https://github.com/apache/ozone/pull/3570#discussion_r921154655


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/checksum/ReplicatedFileChecksumHelper.java:
##########
@@ -71,7 +71,6 @@ protected void checksumBlocks() throws IOException {
          blockIdx++) {
       OmKeyLocationInfo keyLocationInfo =
           getKeyLocationInfoList().get(blockIdx);
-      currentLength += keyLocationInfo.getLength();
       if (currentLength > getLength()) {

Review Comment:
   can you explain why we should do this judge here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 merged pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 merged PR #3570:
URL: https://github.com/apache/ozone/pull/3570


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1171115506

   @JacksonYao287 - thanks for the info.
   
   > md5md5crc32 will be influenced by chunk size and block size.
   but for composite-crc32c, since crc32c support incremental computation, block size and chunk size will take no effect on the final file checksum.
   
   So far in Ozone, we only have md5md5crc32, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1173751494

   thanks @jojochuang for this patch. can you add a test for ChecksumCombineMode.COMPOSITE_CRC?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1171121434

   @sodonnel 
   seems we already have composite-crc32 in Ozone,  but we do not have composite-crc32c.
   please  take a look at `BaseFilechecksumHelper#makeCompositeCrcResult()`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3570:
URL: https://github.com/apache/ozone/pull/3570#discussion_r913353910


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/checksum/BaseFileChecksumHelper.java:
##########
@@ -242,8 +253,29 @@ private FileChecksum makeMd5CrcResult() {
     //compute file MD5
     final MD5Hash fileMD5 = MD5Hash.digest(getBlockChecksumBuf().getData());
     // assume CRC32 for now

Review Comment:
   NIT: since we have CRC32C now in this patch, remove this comment



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3570:
URL: https://github.com/apache/ozone/pull/3570#discussion_r913353910


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/checksum/BaseFileChecksumHelper.java:
##########
@@ -242,8 +253,29 @@ private FileChecksum makeMd5CrcResult() {
     //compute file MD5
     final MD5Hash fileMD5 = MD5Hash.digest(getBlockChecksumBuf().getData());
     // assume CRC32 for now

Review Comment:
   NIT: since we have CRC32C, remove this comment



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3570:
URL: https://github.com/apache/ozone/pull/3570#discussion_r921154655


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/checksum/ReplicatedFileChecksumHelper.java:
##########
@@ -71,7 +71,6 @@ protected void checksumBlocks() throws IOException {
          blockIdx++) {
       OmKeyLocationInfo keyLocationInfo =
           getKeyLocationInfoList().get(blockIdx);
-      currentLength += keyLocationInfo.getLength();
       if (currentLength > getLength()) {

Review Comment:
   can you explain why we should do this judge here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1212897150

   thanks @jojochuang for the work. i have committed this patch to master branch!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1171104901

   > Just a question - does this actually help if the block sizes are different? Eg HDFS is mostly 128MB, Ozone is 256MB by default. Combining checksums with different blocks sizes will give a different combined checksum, right?
   
   md5md5crc32 will be influenced by chunk size and block size. 
   but for composite-crc32c,  since crc32c support incremental computation, block size and chunk size will take no effect on the final file checksum.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1170985259

   Just a question - does this actually help if the block sizes are different? Eg HDFS is mostly 128MB, Ozone is 256MB by default. Combining checksums with different blocks sizes will give a different combined checksum, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] Xushaohong commented on pull request #3570: HDDS-6127. file checksum to support both CRC32 and CRC32C.

Posted by GitBox <gi...@apache.org>.
Xushaohong commented on PR #3570:
URL: https://github.com/apache/ozone/pull/3570#issuecomment-1173285152

   The coding frame is fine, but the result of computed Composite-crc32c is different from hdfs's. Could you take a deep look? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org