Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/03/23 17:13:27 UTC

[GitHub] [hudi] umehrot2 opened a new pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

umehrot2 opened a new pull request #5113:
URL: https://github.com/apache/hudi/pull/5113


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   RFC for supporting a new storage layout that is optimized for cloud object stores like Amazon S3.
   
   ## Verify this pull request
   
   This pull request is just an RFC with no code changes that need verification.
   
   ## Committer checklist
   
    - [x] Has a corresponding JIRA in PR title & commit
    
    - [x] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#issuecomment-1076594329


   ## CI report:
   
   * 59abcd3f5b8d94161ead1a8a02e27da62fc5c80f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7251) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on a change in pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#discussion_r840111001



##########
File path: rfc/README.md
##########
@@ -71,3 +71,4 @@ The list of all RFCs can be found here.
 | 45 | [Asynchronous Metadata Indexing](./rfc-45/rfc-45.md) | `UNDER REVIEW` |
 | 46 | [Optimizing Record Payload Handling](./rfc-46/rfc-46.md) | `UNDER REVIEW` |
 | 47 | [Add Call Produce Command for Spark SQL](./rfc-47/rfc-47.md) | `UNDER REVIEW` |
+| 48 | [Optimized storage layout for Cloud object stores](./rfc-48/rfc-48.md) | `UNDER REVIEW` |

Review comment:
       @umehrot2 Maybe call this "Federated storage layer for Hudi"? Files can be spread across cloud storage, or even across on-prem and cloud.







[GitHub] [hudi] zhedoubushishi commented on a change in pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
zhedoubushishi commented on a change in pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#discussion_r833918891



##########
File path: rfc/rfc-48/rfc-48.md
##########
@@ -0,0 +1,171 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# RFC-[48]: Optimized storage layout for Cloud Object Stores
+
+## Proposers
+- @umehrot2
+
+## Approvers
+- @vinoth
+- @shivnarayan
+
+## Status
+
+JIRA: [https://issues.apache.org/jira/browse/HUDI-3625](https://issues.apache.org/jira/browse/HUDI-3625)
+
+## Abstract
+
+As Apache Hudi workloads scale on cloud object stores like Amazon S3, they can hit request throttling limits,
+which in turn hurts performance. This RFC proposes an alternate storage
+layout that is optimized for Amazon S3 and other cloud object stores, which helps achieve maximum throughput and
+significantly reduces throttling.
+
+## Background
+
+Apache Hudi follows the traditional Hive storage layout while writing files on storage:
+- Partitioned Tables: The files are distributed across multiple physical partition folders, under the table's base path.
+- Non Partitioned Tables: The files are stored directly under the table's base path.
+
+While this storage layout scales well for HDFS, it increases the probability of hitting request throttle limits when
+working with cloud object stores like Amazon S3 and others. This is because Amazon S3 and other cloud stores [throttle
+requests based on object prefix](https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/).
+Amazon S3 does scale based on request patterns for different prefixes and adds internal partitions (with their own request limits),
+but there can be a 30 - 60 minute wait time before new partitions are created. Thus, storing all files/objects under the
+same table path prefix can cause these request limits to be hit for the table prefix, especially as workloads
+scale and several thousand files are written/updated concurrently. This hurts performance, because retrying
+failed requests reduces throughput, and it results in occasional failures when the retries continue to be throttled
+and never succeed.
+
+The high-level proposal here is to introduce a new storage layout, where all files are distributed evenly across multiple
+randomly generated prefixes under the Amazon S3 bucket, instead of being stored under a common table path/prefix. This
+would help distribute the requests evenly across different prefixes, so that Amazon S3 creates partitions for
+the prefixes, each with its own request limit. This significantly reduces the possibility of hitting the request limit
+for a specific prefix/partition.
+
+## Design
+
+### Generating file paths
+
+We want to distribute files evenly across multiple random prefixes, instead of following the traditional Hive storage
+layout of keeping them under a common table path/prefix. In addition to the `Table Path`, for this new layout the user will
+configure another `Table Storage Path` under which the actual data files will be distributed. The original `Table Path` will
+be used to maintain the Hudi metadata for the table and its partitions.
+
+For the purpose of this documentation let's assume:
+```
+Table Path => s3://<table_bucket>/<hudi_table_name>/
+
+Table Storage Path => s3://<table_storage_bucket>/
+```
+Note: `Table Storage Path` can be a path in the same Amazon S3 bucket or a different bucket. For best results,
+`Table Storage Path` should be a bucket instead of a prefix under the bucket as it allows for S3 to partition sooner.
+
+We will use a hashing function on the `File Name` to map each file to a prefix generated under `Table Storage Path`:

Review comment:
       Can I know why we don't consider hashing on partition names?
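
For illustration, the mapping described in the quoted section (hash the `File Name`, then place the file under a hash-derived prefix below `Table Storage Path`) might be sketched as follows. This is only a sketch under stated assumptions: the quoted excerpt does not show the hash function, the prefix width, or the exact path layout, so the SHA-256 digest, the 8-character prefix, the ordering of path components, and the name `hashed_storage_path` are all hypothetical.

```python
import hashlib


def hashed_storage_path(table_storage_path: str,
                        table_name: str,
                        partition_path: str,
                        file_name: str,
                        prefix_len: int = 8) -> str:
    """Map a data file to a location under a hash-derived prefix.

    Assumptions (not taken from the RFC excerpt): SHA-256 as the hash,
    the first `prefix_len` hex characters as the prefix, and the layout
    <storage path>/<prefix>/<table>/<partition>/<file>.
    """
    # Hash only the file name, so the prefix does not depend on the partition.
    digest = hashlib.sha256(file_name.encode("utf-8")).hexdigest()
    prefix = digest[:prefix_len]

    parts = [table_storage_path.rstrip("/"), prefix, table_name]
    if partition_path:
        # Non-partitioned tables simply skip this component.
        parts.append(partition_path.strip("/"))
    parts.append(file_name)
    return "/".join(parts)


if __name__ == "__main__":
    print(hashed_storage_path("s3://table_storage_bucket/", "hudi_table",
                              "2022/03/23", "file1.parquet"))
```

Because the prefix is derived from the file name rather than the partition name, files from a single hot partition would still spread across many prefixes, which is one possible reason to hash on file names.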

##########
File path: rfc/rfc-48/rfc-48.md
##########
@@ -0,0 +1,171 @@
+While this storage layout scales well for HDFS, it increases the probability of hitting request throttle limits when
+working with cloud object stores like Amazon S3 and others. This is because Amazon S3 and other cloud stores [throttle
+requests based on object prefix](https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/).
+Amazon S3 does scale based on request patterns for different prefixes and adds internal partitions (with their own request limits),
+but there can be a 30 - 60 minute wait time before new partitions are created. Thus, all files/objects stored under the

Review comment:
       Do you mean `30 - 60 minute wait time after new partitions are created`?
   
   To confirm my understanding: even after partitions are created in S3, would S3 still use the table path, rather than the partition path, as the prefix when computing throttling?







[GitHub] [hudi] umehrot2 commented on a change in pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on a change in pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#discussion_r834721194



##########
File path: rfc/rfc-48/rfc-48.md
##########
@@ -0,0 +1,171 @@
+While this storage layout scales well for HDFS, it increases the probability of hitting request throttle limits when
+working with cloud object stores like Amazon S3 and others. This is because Amazon S3 and other cloud stores [throttle
+requests based on object prefix](https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/).
+Amazon S3 does scale based on request patterns for different prefixes and adds internal partitions (with their own request limits),
+but there can be a 30 - 60 minute wait time before new partitions are created. Thus, all files/objects stored under the

Review comment:
       The `30 - 60 minute` wait that I am talking about is how long S3 takes to create internal partitions. These are not the same as the table partitions or folders that you are referring to. Check out https://youtu.be/rHeTn9pHNKo?t=3290, which explains this a little.
   
   And yes, you are right that initially S3 will treat the common table prefix as having its own fixed request limits. As it sees more traffic across different prefixes under the common table prefix (because of requests to different partitions), it may do internal partitioning (which takes 30 - 60 minutes) to scale the request limits for each of these prefixes. But this scaling is not instantaneous. The video explains this.
   
   







[GitHub] [hudi] hudi-bot commented on pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#issuecomment-1076590523


   ## CI report:
   
   * 59abcd3f5b8d94161ead1a8a02e27da62fc5c80f UNKNOWN
   



[GitHub] [hudi] hudi-bot commented on pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#issuecomment-1076670859


   ## CI report:
   
   * 59abcd3f5b8d94161ead1a8a02e27da62fc5c80f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7251) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#issuecomment-1076590523





[GitHub] [hudi] hudi-bot removed a comment on pull request #5113: [HUDI-3625] [RFC-48] Optimized storage layout for Cloud Object Stores

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #5113:
URL: https://github.com/apache/hudi/pull/5113#issuecomment-1076594329

