Posted to issues@ozone.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2022/09/15 00:19:00 UTC

[jira] [Commented] (HDDS-7062) distcp cause inconsistent block length between keyInfo and block file length

    [ https://issues.apache.org/jira/browse/HDDS-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605014#comment-17605014 ] 

Wei-Chiu Chuang commented on HDDS-7062:
---------------------------------------

I am using CDP 7.1.8, which is a snapshot of the Ozone master branch from around June, but I cannot reproduce the issue there.

> distcp cause inconsistent block length between keyInfo and block file length
> ----------------------------------------------------------------------------
>
>                 Key: HDDS-7062
>                 URL: https://issues.apache.org/jira/browse/HDDS-7062
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Jie Yao
>            Priority: Major
>         Attachments: 截屏2022-07-19 20.36.50.png, 截屏2022-07-28 15.41.46.png
>
>
> Recently we used distcp to copy files from HDFS to Ozone and found that if we copy a file directly, distcp copies the file as a whole and generates full blocks. But if we put the file into a directory and use distcp to copy that directory, the file is broken into pieces of different sizes.
> For example, for a 1G file copied directly, the key info looks like:
> {
>   "volumeName" : "vol1",
>   "bucketName" : "bkt1",
>   "name" : "1G_39.dat",
>   "dataSize" : 1048576000,
>   "creationTime" : "2022-07-18T09:54:02.944Z",
>   "modificationTime" : "2022-07-18T09:54:23.132Z",
>   "replicationConfig" : { "replicationFactor" : "THREE", "requiredNodes" : 3, "replicationType" : "RATIS" },
>   "ozoneKeyLocations" : [
>     { "containerID" : 305, "localID" : 109611004723206132, "length" : 268435456, "offset" : 0, "keyOffset" : 0 },
>     { "containerID" : 304, "localID" : 109611004723206133, "length" : 268435456, "offset" : 0, "keyOffset" : 268435456 },
>     { "containerID" : 303, "localID" : 109611004723206134, "length" : 268435456, "offset" : 0, "keyOffset" : 536870912 },
>     { "containerID" : 304, "localID" : 109611004723206135, "length" : 243269632, "offset" : 0, "keyOffset" : 805306368 }
>   ],
>   "metadata" : { }
> }
> If we put it in a directory and use `distcp` to copy the directory, the key info looks like:
> {
>   "volumeName" : "vol1",
>   "bucketName" : "bkt1",
>   "name" : "mockData/1G_39.dat",
>   "dataSize" : 1048576000,
>   "creationTime" : "2022-07-18T08:59:47.472Z",
>   "modificationTime" : "2022-07-18T09:40:22.980Z",
>   "replicationConfig" : { "replicationFactor" : "THREE", "requiredNodes" : 3, "replicationType" : "RATIS" },
>   "ozoneKeyLocations" : [
>     { "containerID" : 234, "localID" : 109611004723204874, "length" : 67108864, "offset" : 0, "keyOffset" : 0 },
>     { "containerID" : 240, "localID" : 109611004723204989, "length" : 67108864, "offset" : 0, "keyOffset" : 67108864 },
>     { "containerID" : 244, "localID" : 109611004723205092, "length" : 83886080, "offset" : 0, "keyOffset" : 134217728 },
>     { "containerID" : 250, "localID" : 109611004723205204, "length" : 50331648, "offset" : 0, "keyOffset" : 218103808 },
>     { "containerID" : 255, "localID" : 109611004723205295, "length" : 16777216, "offset" : 0, "keyOffset" : 268435456 },
>     { "containerID" : 261, "localID" : 109611004723205413, "length" : 83886080, "offset" : 0, "keyOffset" : 285212672 },
>     { "containerID" : 266, "localID" : 109611004723205503, "length" : 50331648, "offset" : 0, "keyOffset" : 369098752 },
>     { "containerID" : 272, "localID" : 109611004723205631, "length" : 100663296, "offset" : 0, "keyOffset" : 419430400 },
>     { "containerID" : 276, "localID" : 109611004723205749, "length" : 67108864, "offset" : 0, "keyOffset" : 520093696 },
>     { "containerID" : 279, "localID" : 109611004723205805, "length" : 33554432, "offset" : 0, "keyOffset" : 587202560 },
>     { "containerID" : 283, "localID" : 109611004723205854, "length" : 50331648, "offset" : 0, "keyOffset" : 620756992 },
>     { "containerID" : 288, "localID" : 109611004723205958, "length" : 16777216, "offset" : 0, "keyOffset" : 671088640 },
>     { "containerID" : 290, "localID" : 109611004723206013, "length" : 67108864, "offset" : 0, "keyOffset" : 687865856 },
>     { "containerID" : 294, "localID" : 109611004723206048, "length" : 100663296, "offset" : 0, "keyOffset" : 754974720 },
>     { "containerID" : 299, "localID" : 109611004723206091, "length" : 192937984, "offset" : 0, "keyOffset" : 855638016 }
>   ],
>   "metadata" : { }
> }
> What is more, we also found that for some blocks the length recorded in the keyInfo metadata differs from the length of the block file in the container. For example:
> the block with local ID 109611004723205204 above has a recorded length of 50331648, but when we check the block file in the container, it has a length of 61708864 (see attachment 截屏2022-07-19 20.36.50.png).
> Another problem is that when we compute the block composite-crc checksum, the full block length (67108864), rather than the expected length of 50331648, is read back and used in the computation, which leads to an erroneous result.
> I think we need to figure out why the two problems above happen:
> 1. Why the block file length in the container differs from the length recorded in the key info.
> 2. Why the erroneous length is read back.
> PS: the 1G file was generated with `dd if=/dev/zero `, so every bit in the file is 0.
>  
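The key-location listing quoted above can be sanity-checked with a short script. The following is only a sketch: the function name `check_locations` and the tuple layout are illustrative assumptions, not part of Ozone, and the numbers are copied from the directory-copy listing in the report. It checks that each `keyOffset` equals the sum of the preceding `length` values and that the lengths add up to `dataSize`.

```python
# Hypothetical helper, not an Ozone API: checks that a key's block list is
# internally consistent (contiguous keyOffsets, lengths summing to dataSize).

# (localID, length, keyOffset) copied from the directory-copy listing above.
LOCATIONS = [
    (109611004723204874, 67108864, 0),
    (109611004723204989, 67108864, 67108864),
    (109611004723205092, 83886080, 134217728),
    (109611004723205204, 50331648, 218103808),
    (109611004723205295, 16777216, 268435456),
    (109611004723205413, 83886080, 285212672),
    (109611004723205503, 50331648, 369098752),
    (109611004723205631, 100663296, 419430400),
    (109611004723205749, 67108864, 520093696),
    (109611004723205805, 33554432, 587202560),
    (109611004723205854, 50331648, 620756992),
    (109611004723205958, 16777216, 671088640),
    (109611004723206013, 67108864, 687865856),
    (109611004723206048, 100663296, 754974720),
    (109611004723206091, 192937984, 855638016),
]
DATA_SIZE = 1048576000  # "dataSize" from the same listing


def check_locations(locations, data_size):
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    expected_offset = 0
    for local_id, length, key_offset in locations:
        if key_offset != expected_offset:
            problems.append(
                f"localID {local_id}: keyOffset {key_offset}, "
                f"expected {expected_offset}"
            )
        # The next block should start where this one ends.
        expected_offset = key_offset + length
    if expected_offset != data_size:
        problems.append(
            f"lengths end at {expected_offset}, dataSize is {data_size}"
        )
    return problems


if __name__ == "__main__":
    issues = check_locations(LOCATIONS, DATA_SIZE)
    print("\n".join(issues) if issues else "key locations are consistent")
```

Run against the numbers in the report, this check finds no problems, which suggests the keyInfo metadata is internally consistent and the discrepancy lies between that metadata and the block file on the datanode.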



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org