Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/09/21 14:44:49 UTC

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1419: HDDS-3755. [DESIGN] Storage-class for Ozone

elek commented on a change in pull request #1419:
URL: https://github.com/apache/hadoop-ozone/pull/1419#discussion_r492104772



##########
File path: hadoop-hdds/docs/content/design/storage-class.md
##########
@@ -19,10 +19,331 @@ author: Marton Ele
   See the License for the specific language governing permissions and
   limitations under the License. See accompanying LICENSE file.
 -->
+
+
 # Abstract
 
-Proposal suggest to introduce a new storage-class abstraction which can be used to define different replication strategies (factor, type, ...) for different bucket/keys.
+One of the fundamental abstractions of Ozone is the _Container_, which is used as the unit of replication.
+
+Containers have two flavors: _Open_ and _Closed_. Open containers are replicated by Ratis and are writable; Closed containers are replicated by data copy and are read-only.
+
+In this document a new level of abstraction is proposed: the *storage class* which defines which type of containers should be used and what type of transitions are supported.
+
+# Goals / Use cases
+
+## [USER] Simplify user interface and improve usability
+
+Users can choose from an admin provided set of storage classes (for example `STANDARD`, `REDUCED`) instead of using implementation specific terms (`RATIS/THREE`, `RATIS/ONE`)
+
+Today users must use implementation-specific terms when a key is created:
+
+```
+ozone sh key put --replication=THREE --type=RATIS /vol1/bucket1/key1 source-file.txt
+```
+
+There are two problems here:
+
+ 1. Users must work with low-level, technical terms. A user might not know what `RATIS` is and may not have enough information to choose the right replication scheme.
+
+ 2. The current keys are only for the *open* containers. There is no easy way to add configuration which can be used later during the lifecycle of containers/keys. (For example to support `Ratis/THREE` --> `Ratis/TWO`)
+
+With the storage-class abstraction, the complexity of configuration can be moved to the admin side (with more flexibility). Users only need to choose from the available storage classes (or use the default one).
+
+Instead of the earlier CLI, this document proposes an abstract storage-class parameter:
+
+```
+ozone sh key put --storage-class=STANDARD /vol1/bucket1/key1 source-file.txt
+```
+
+## [USER] Set a custom replication for a newly created bucket
+
+A user may want to set a custom replication for a bucket at the time of creation. All keys in the bucket will respect the specified storage class (subject to storage and quota availability). E.g.
+
+```
+ozone sh bucket create --storage-class=INFREQUENT_ACCESS
+```
+
+
+The bucket-level default storage class can be overridden for any key, but is used as the default otherwise.
+
+
+## [USER] Fine grained replication control when using S3 API
+
+A user may want to set custom replication policies for any key **which is uploaded via the S3 API**. Storage classes are already used by the AWS S3 API. With first-class support for the same concept in Ozone, users can choose from the predefined storage classes (= replication rules) using the AWS API:
+
+
+```
+aws s3 cp --storage-class=REDUCED file1 s3://bucket/file1
+```
+
+
+## [USER] Set the replication for a specific prefix
+
+A user may want to set a custom replication for a specific key prefix. All keys matching that prefix will respect the specified storage class. This operation will not affect keys already in the prefix (question: consider supporting this with data movement?)
+

Review comment:
       Good question. A storage class is assigned to containers and keys, so existing keys already have a storage class, though it might be the default. To start with, I think we should allow setting the default storage class only when the prefix/bucket is created.
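
       To illustrate the lookup order implied here, a minimal sketch follows. It is not Ozone code: `Bucket`, `resolve_storage_class` and `CLUSTER_DEFAULT` are hypothetical names, and the frozen dataclass only models the suggestion that the default is fixed when the bucket is created.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cluster-wide fallback; the name and value are assumptions.
CLUSTER_DEFAULT = "STANDARD"


@dataclass(frozen=True)  # frozen: the default cannot be changed after creation
class Bucket:
    name: str
    default_storage_class: Optional[str] = None


def resolve_storage_class(bucket: Bucket, requested: Optional[str] = None) -> str:
    """Pick the storage class for a new key.

    Order: explicit per-key request, then the bucket default,
    then the cluster-wide default.
    """
    if requested is not None:
        return requested
    if bucket.default_storage_class is not None:
        return bucket.default_storage_class
    return CLUSTER_DEFAULT


if __name__ == "__main__":
    bucket = Bucket("bucket1", default_storage_class="INFREQUENT_ACCESS")
    print(resolve_storage_class(bucket))             # bucket default applies
    print(resolve_storage_class(bucket, "REDUCED"))  # per-key override wins
```

       Because every key gets a concrete storage class when it is written, existing keys are unaffected by whatever default a later bucket or prefix is created with.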



