Posted to issues@ozone.apache.org by "Ashish Kumar (Jira)" <ji...@apache.org> on 2023/09/06 09:30:00 UTC

[jira] [Assigned] (HDDS-8371) A keyName field in the keyTable might contain a full path for the key instead of the file name

     [ https://issues.apache.org/jira/browse/HDDS-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Kumar reassigned HDDS-8371:
----------------------------------

    Assignee:     (was: Ashish Kumar)

> A keyName field in the keyTable might contain a full path for the key instead of the file name
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDDS-8371
>                 URL: https://issues.apache.org/jira/browse/HDDS-8371
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM
>    Affects Versions: 1.3.0
>            Reporter: Kohei Sugihara
>            Priority: Major
>              Labels: pull-request-available
>
> The listStatus API returns a repeated path in its result when the path for the key is deep. We noticed that listStatus returns a corrupt result for some specific keys in a bucket: the requested key prefix is repeated in the final listStatus result, as in the following:
> {code:java}
> # expected case
> % aws s3 ls s3://bucket/a/b/c/d/e/f/g/file.zip
> <timestamp> <bytes> a/b/c/d/e/f/g/file.zip
> ...
> # actual: "a/b/c/d/e/f/g" is duplicated
> % aws s3 ls s3://bucket/a/b/c/d/e/f/g/file.zip
> <timestamp> <bytes> a/b/c/d/e/f/g/a/b/c/d/e/f/g/file.zip
> ... {code}
> Environment:
>  * Ozone 1.3 [compatible|https://github.com/apache/ozone/commit/9c61a8aa497ab96c014ad3bb7b1ee4f731ebfaf8] version (same environment as HDDS-7701, HDDS-7925)
>  * Several large files, all uploaded via multipart upload using AWS-CLI, divided into 8 MB chunks
>  * An FSO-enabled bucket
>  * OM HA
> Problem Details:
> I dug into the OM DB and found that some entries in the keyTable store the full path for the key in keyName, so the prefix ends up appearing twice in the result of the listStatus API.
> {code:java}
> // dump keyTable entries; nodeId (logged as "parent") is defined outside this snippet
> while (keyIter.hasNext()) {
>       Table.KeyValue<String, OmKeyInfo> kv = keyIter.next();
>       OmKeyInfo v = kv.getValue();
>       LOG.info("v/b={}/{} parent={} Key={} size={} time={} checksum={} id={} keyName={}",
>               v.getVolumeName(), v.getBucketName(),
>               nodeId, v.getFileName(),
>               v.getDataSize(), v.getCreationTime(), v.getFileChecksum(),
>               v.getObjectID(), v.getKeyName());
> }
> # keyName has a full path for the key
> - v/b=volume/bucket parent=-9223371931475161596 Key=0g0pustv.tar.gz size=2276021708 time=1672052153340 checksum=null id=-9223371931457566207 keyName=a/b/c/d/e/0g0pustv.tar.gz
> - v/b=volume/bucket parent=-9223371931475161596 Key=0g0pustv.zip size=2333733892 time=1672052222395 checksum=null id=-9223371931408929023 keyName=a/b/c/d/e/0g0pustv.zip
> - v/b=volume/bucket parent=-9223371931475161596 Key=0nh5ww00.tar.gz size=249321741 time=1672052233487 checksum=null id=-9223371931393057791 keyName=a/b/c/d/e/0nh5ww00.tar.gz
> - v/b=volume/bucket parent=-9223371931475161596 Key=0nh5ww00.zip size=255764877 time=1672052242830 checksum=null id=-9223371931388326655 keyName=a/b/c/d/e/0nh5ww00.zip
> - v/b=volume/bucket parent=-9223371931475161596 Key=5b2uha1h.tar.gz size=2276346612 time=1672052331175 checksum=null id=-9223371931348859135 keyName=a/b/c/d/e/5b2uha1h.tar.gz
> ...
> # other keys which have the same parent do not have their prefix in the key
> - v/b=volume/bucket parent=-9223371931475161596 Key=kh7vbwlh.zip size=573797127 time=1672052273970 checksum=null id=-9223371931375503871 keyName=kh7vbwlh.zip
> - v/b=volume/bucket parent=-9223371931475161596 Key=ngaxsd8c.tar.gz size=380094900 time=1672052284433 checksum=null id=-9223371931368669695 keyName=ngaxsd8c.tar.gz
> - v/b=volume/bucket parent=-9223371931475161596 Key=ngaxsd8c.zip size=393085953 time=1672052099618 checksum=null id=-9223371931473057023 keyName=ngaxsd8c.zip
> - v/b=volume/bucket parent=-9223371931475161596 Key=nrou31c3.tar.gz size=568466718 time=1672052124043 checksum=null id=-9223371931461502975 keyName=nrou31c3.tar.gz
> - v/b=volume/bucket parent=-9223371931475161596 Key=nrou31c3.zip size=574807485 time=1672052149947 checksum=null id=-9223371931446918911 keyName=nrou31c3.zip
> - v/b=volume/bucket parent=-9223371931475161596 Key=ol8dhbqo.tar.gz size=555722830 time=1672052168904 checksum=null id=-9223371931435349759 keyName=ol8dhbqo.tar.gz {code}
>  
>  
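
The symptom described above can be reproduced in isolation with a small, self-contained sketch. It does not use any Ozone classes: `Entry`, `listedPath`, and `hasFullPathKeyName` are hypothetical names introduced only for illustration, and the rule "listed path = parent path + "/" + stored keyName" is an assumption about how the listing is assembled, not confirmed Ozone behaviour. Under that assumption, an entry whose keyName already holds the full path yields the duplicated prefix, and comparing keyName with the file name is enough to flag such entries.

```java
import java.util.ArrayList;
import java.util.List;

class KeyNamePrefixCheck {
    // Hypothetical stand-in for the two OmKeyInfo fields relevant to this issue.
    static final class Entry {
        final String fileName;
        final String keyName;

        Entry(String fileName, String keyName) {
            this.fileName = fileName;
            this.keyName = keyName;
        }
    }

    // ASSUMPTION: the listing prepends the parent path to the stored keyName.
    static String listedPath(String parentPath, Entry e) {
        return parentPath.isEmpty() ? e.keyName : parentPath + "/" + e.keyName;
    }

    // An entry is suspicious when keyName is not the bare file name.
    static boolean hasFullPathKeyName(Entry e) {
        return !e.keyName.equals(e.fileName);
    }

    // Collect every entry whose keyName differs from its file name.
    static List<Entry> suspicious(List<Entry> entries) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : entries) {
            if (hasFullPathKeyName(e)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String parent = "a/b/c/d/e";
        List<Entry> entries = new ArrayList<>();
        // keyName carries the full path, as in the first group of dumped entries
        entries.add(new Entry("0g0pustv.zip", "a/b/c/d/e/0g0pustv.zip"));
        // keyName is the bare file name, as in the second group
        entries.add(new Entry("kh7vbwlh.zip", "kh7vbwlh.zip"));

        for (Entry e : entries) {
            System.out.println(listedPath(parent, e)
                    + (hasFullPathKeyName(e) ? "  <- prefix duplicated" : ""));
        }
    }
}
```

A check like `hasFullPathKeyName` could be applied while iterating the keyTable, as in the dump loop above, to count how many entries are affected.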



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org