Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/10/22 12:49:19 UTC

[GitHub] [hadoop-ozone] captainzmc opened a new pull request #1515: HDDS-4373. [Design] Ozone support append operation

captainzmc opened a new pull request #1515:
URL: https://github.com/apache/hadoop-ozone/pull/1515


   ## What changes were proposed in this pull request?
   
   This is a design doc, moved here from Google Docs to make it easier to track progress.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4373
   
   ## How was this patch tested?
   
   This is a documentation-only change. The full content is available at: https://github.com/captainzmc/hadoop-ozone/blob/add-append-doc/hadoop-hdds/docs/content/design/append.md
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1515: HDDS-4373. [Design] Ozone support append operation

Posted by GitBox <gi...@apache.org>.
arp7 commented on a change in pull request #1515:
URL: https://github.com/apache/hadoop-ozone/pull/1515#discussion_r513025813



##########
File path: hadoop-hdds/docs/content/design/append.md
##########
@@ -0,0 +1,87 @@
+---
+title: Append
+summary: Append to the existing key.
+date: 2020-10-22
+jira: HDDS-4333
+status: implementing
+author: captainzmc
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+## Introduction
+This is a proposal to introduce an append operation for Ozone, which will allow data to be written at the tail of an existing file.
+ 
+## Goals
+ OzoneClient and the OzoneFS client support the append operation. 
+ While a key is being appended to, it must remain readable by other clients.  
+ Once the OutputStream of the append operation is closed, other clients can read the newly appended content. This ensures consistency of read operations.
+## Non-goals
+hflush is outside the scope of this design. HDDS-4353 was created to discuss it.
+## Related jira
+https://issues.apache.org/jira/browse/HDDS-4333
+## Implementation
+### Background conditions:
+A closed container currently cannot be reopened. If every append generates a new block, a key may end up with many blocks smaller than the default block size of 256MB. Too many blocks make the DB larger and also hurt read performance.
+
+### Solution:
+When an append occurs, determine whether the container of the key's last block is closed. If it is closed, create a new block. If it is open, append the data to the last block. This avoids creating new blocks wherever possible.
+
+### Request process:
+![avatar](doc-image/append.png)
+
+ 1. The client sends an append-key request to OM.
+
+ 2. OM checks whether the key is in appendTable. If it is, another client is already appending to the key, so this append cannot proceed at this point. If it is not, OM adds the key to appendTable.

Review comment:
       Why not have the last append win?
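
For reference, the check described in step 2 (the first appender holds the key, later appenders are rejected) can be sketched as below. This is only an illustration, not code from the pull request: the class `AppendTableSketch`, its method names, and the cleanup-on-close behaviour are all assumptions, and the real appendTable would presumably live in OM's metadata store rather than in an in-memory map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the step 2 check; not Ozone Manager code.
final class AppendTableSketch {
    // Maps a key name to the id of the client currently appending to it.
    private final ConcurrentMap<String, String> appendTable = new ConcurrentHashMap<>();

    /** Returns true if clientId now holds the append on keyName, false if another client already does. */
    boolean tryBeginAppend(String keyName, String clientId) {
        // putIfAbsent is atomic: only the first caller for a given key gets null back.
        return appendTable.putIfAbsent(keyName, clientId) == null;
    }

    /** Assumed cleanup step: drop the entry when the owning client closes its stream. */
    boolean endAppend(String keyName, String clientId) {
        // remove(key, value) only removes the entry if it still belongs to this client.
        return appendTable.remove(keyName, clientId);
    }

    public static void main(String[] args) {
        AppendTableSketch om = new AppendTableSketch();
        System.out.println(om.tryBeginAppend("vol1/bucket1/key1", "clientA")); // true
        System.out.println(om.tryBeginAppend("vol1/bucket1/key1", "clientB")); // false
        om.endAppend("vol1/bucket1/key1", "clientA");
        System.out.println(om.tryBeginAppend("vol1/bucket1/key1", "clientB")); // true
    }
}
```

A "last append wins" policy, as asked about above, would instead overwrite the existing entry and would need some way to invalidate the earlier writer; that variant is not shown here.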






[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1515: HDDS-4373. [Design] Ozone support append operation

Posted by GitBox <gi...@apache.org>.
arp7 commented on a change in pull request #1515:
URL: https://github.com/apache/hadoop-ozone/pull/1515#discussion_r513026161



##########
File path: hadoop-hdds/docs/content/design/append.md
##########
@@ -0,0 +1,87 @@
+ 3. Check whether the last block of the key belongs to a closed container. If so, ask SCM to allocate a new block; if not, use the current block directly.

Review comment:
       Blocks must be immutable, we should never modify the contents of a block.
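
For reference, the decision proposed in step 3 can be sketched as below. This is only an illustration of the proposal as quoted, not code from the pull request: `AppendBlockChoiceSketch`, the two-state `ContainerState` enum, and the `Action` enum are hypothetical simplifications (real containers have more states than open and closed).

```java
// Hypothetical sketch of the step 3 decision; not Ozone code.
final class AppendBlockChoiceSketch {
    // Simplified container state, assumed for illustration only.
    enum ContainerState { OPEN, CLOSED }

    enum Action { REUSE_LAST_BLOCK, ALLOCATE_NEW_BLOCK }

    /** Chooses where appended data goes, based on the container holding the key's last block. */
    static Action chooseBlock(ContainerState lastBlockContainer) {
        // A closed container cannot be reopened, so a new block has to be requested from SCM.
        if (lastBlockContainer == ContainerState.CLOSED) {
            return Action.ALLOCATE_NEW_BLOCK;
        }
        // Otherwise the proposal writes into the existing last block.
        return Action.REUSE_LAST_BLOCK;
    }

    public static void main(String[] args) {
        System.out.println(chooseBlock(ContainerState.CLOSED)); // ALLOCATE_NEW_BLOCK
        System.out.println(chooseBlock(ContainerState.OPEN));   // REUSE_LAST_BLOCK
    }
}
```

Under the immutability constraint raised above, the `REUSE_LAST_BLOCK` branch is the part in question: if block contents may never be modified, every append would have to take the `ALLOCATE_NEW_BLOCK` path.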



