Posted to issues@ozone.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/09/12 08:41:00 UTC

[jira] [Updated] (HDDS-8920) Ozone is supporting unicode volume and bucket names, potentially unintentionally

     [ https://issues.apache.org/jira/browse/HDDS-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDDS-8920:
---------------------------------
    Labels: newbie pull-request-available  (was: newbie)

> Ozone is supporting unicode volume and bucket names, potentially unintentionally
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-8920
>                 URL: https://issues.apache.org/jira/browse/HDDS-8920
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Siyao Meng
>            Assignee: Tejaskriya Madhan
>            Priority: Major
>              Labels: newbie, pull-request-available
>
> Gabor found that because `HddsClientUtils#isSupportedCharacter` calls `Character.isLowerCase` and `Character.isDigit`, which are Unicode-aware, neither the Ozone client nor Ozone Manager actually filters out non-ASCII Unicode characters: such characters pass the check successfully. For example, with three [U+FF5A|https://www.compart.com/en/unicode/U+FF5A] characters:
> {code}
> [root@gimre-sp4-1 ~]# ozone sh volume create ｚｚｚ
> 23/06/23 16:16:44 INFO rpc.RpcClient: Creating Volume: ｚｚｚ, with root as owner and space quota set to -1 bytes, counts quota set to -1
> {code}
> while according to S3 [bucket naming rules|https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html] this wouldn't be allowed:
> {code}
> Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
> {code}
> And it is indeed blocked by awscli:
> {code}
> $ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem create-bucket --bucket ｚｚｚ
> Parameter validation failed:
> Invalid bucket name "ｚｚｚ": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$"
> $ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem delete-bucket --bucket ｚｚｚ
> Parameter validation failed:
> Invalid bucket name "ｚｚｚ": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$"
> $ aws --version
> aws-cli/1.15.57 Python/2.7.18 Darwin/22.5.0 botocore/1.10.56
> {code}
> TODO:
> 1. Confirm whether such Unicode characters should indeed be blocked.
> 2. Enhance resource name checking (for volumes and buckets) on both the client and the server side, e.g. use a regex, or some form of normalization like [Punycode|https://www.punycoder.com/].
> 3. Mitigate the impact on existing users who already have such volumes or buckets in their systems, e.g. by making the new check optional and not enforced on older clusters when upgraded, or by disallowing such Unicode characters only during new volume and bucket creation (but not for operations on existing volume and bucket names that have such characters).
> cc [~swamirishi] [~hemantk] [~ppogde]
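
The behaviour described in the report can be reproduced in isolation. The following is a minimal sketch, not the actual Ozone code: the class `UnicodeAwareCheckDemo` and its methods `looksSupported` and `asciiOnly` are hypothetical names. `looksSupported` only mimics a check built on `Character.isLowerCase` and `Character.isDigit`; the real `HddsClientUtils#isSupportedCharacter` may differ in detail.

```java
final class UnicodeAwareCheckDemo {

  // Hypothetical stand-in for a check built on the Unicode-aware
  // Character methods named in the report; not the real Ozone code.
  static boolean looksSupported(char c) {
    return c == '.' || c == '-'
        || Character.isLowerCase(c) || Character.isDigit(c);
  }

  // Assumed stricter alternative: explicit ASCII ranges only.
  static boolean asciiOnly(char c) {
    return c == '.' || c == '-'
        || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
  }

  public static void main(String[] args) {
    char fullwidthZ = '\uFF5A';      // FULLWIDTH LATIN SMALL LETTER Z
    char arabicIndicOne = '\u0661';  // ARABIC-INDIC DIGIT ONE

    // Both are accepted by the Unicode-aware check (prints true twice).
    System.out.println(looksSupported(fullwidthZ));
    System.out.println(looksSupported(arabicIndicOne));

    // Both are rejected by the ASCII-only check (prints false twice).
    System.out.println(asciiOnly(fullwidthZ));
    System.out.println(asciiOnly(arabicIndicOne));
  }
}
```

U+FF5A is in Unicode category Ll (lowercase letter) and U+0661 is in category Nd (decimal digit), which is why the `Character` methods accept them even though neither is an ASCII character.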

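TODO items 2 and 3 could be combined roughly as sketched here. Everything in this block is an assumption made for illustration: `ResourceNameCheck`, `STRICT_NAME`, `isStrictlyValid` and `isAllowed` are hypothetical names, and the pattern follows the S3 bucket naming rules (3 to 63 characters, starting and ending with a letter or digit) rather than any limits confirmed for Ozone in this issue.

```java
import java.util.regex.Pattern;

final class ResourceNameCheck {

  // ASCII-only pattern modelled on the S3 rule quoted in the issue:
  // lowercase letters, digits, dots and hyphens, 3 to 63 characters,
  // starting and ending with a letter or digit. Without the
  // UNICODE_CHARACTER_CLASS or CASE_INSENSITIVE flags, [a-z] and [0-9]
  // match ASCII characters only.
  private static final Pattern STRICT_NAME =
      Pattern.compile("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]");

  static boolean isStrictlyValid(String name) {
    return name != null && STRICT_NAME.matcher(name).matches();
  }

  // Hypothetical mitigation for TODO item 3: apply the strict check
  // only when a new volume or bucket is created, so that existing
  // names containing such characters remain usable.
  static boolean isAllowed(String name, boolean isCreate) {
    if (!isCreate) {
      return name != null;
    }
    return isStrictlyValid(name);
  }

  public static void main(String[] args) {
    String fullwidth = "\uFF5A\uFF5A\uFF5A";  // three U+FF5A characters
    System.out.println(isAllowed("zzz", true));        // ASCII name, creation
    System.out.println(isAllowed(fullwidth, true));    // rejected on creation
    System.out.println(isAllowed(fullwidth, false));   // tolerated if it exists
  }
}
```

Gating on creation is only one of the two mitigations mentioned in the issue; a cluster-level option that leaves the check unenforced after an upgrade would replace the `isCreate` flag with a configuration lookup.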


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
