Posted to commits@hudi.apache.org by "nsivabalan (via GitHub)" <gi...@apache.org> on 2023/02/14 07:30:27 UTC

[GitHub] [hudi] nsivabalan opened a new pull request, #7939: [MINOR] Updating Index page to include bucket and consistent hashing index

nsivabalan opened a new pull request, #7939:
URL: https://github.com/apache/hudi/pull/7939

   ### Change Logs
   
   Updating Index page to include bucket and consistent hashing index
   
   ### Impact
   
   Updating Index page to include bucket and consistent hashing index
   
   ### Risk level (write none, low medium or high below)
   
   none
   
   ### Documentation Update
   
   this is a doc update
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [MINOR] Updating Index page to include bucket and consistent hashing index [hudi]

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on PR #7939:
URL: https://github.com/apache/hudi/pull/7939#issuecomment-1975393613

   Bucket and consistent hashing index are already added to the "Indexing" page.  Closing this PR.
   <img width="992" alt="Screenshot 2024-03-03 at 15 15 59" src="https://github.com/apache/hudi/assets/2497195/779d663b-3418-4f42-b155-d19fcdbe4050">
   




Re: [PR] [MINOR] Updating Index page to include bucket and consistent hashing index [hudi]

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua closed pull request #7939: [MINOR] Updating Index page to include bucket and consistent hashing index
URL: https://github.com/apache/hudi/pull/7939




Re: [PR] [MINOR] Updating Index page to include bucket and consistent hashing index [hudi]

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on PR #7939:
URL: https://github.com/apache/hudi/pull/7939#issuecomment-1848777706

   @nsivabalan: Should we close this old doc PR?




[GitHub] [hudi] nsivabalan commented on a diff in pull request #7939: [MINOR] Updating Index page to include bucket and consistent hashing index

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on code in PR #7939:
URL: https://github.com/apache/hudi/pull/7939#discussion_r1105401313


##########
website/docs/indexing.md:
##########
@@ -27,6 +27,13 @@ Currently, Hudi supports the following indexing options.
 - **HBase Index:** Manages the index mapping in an external Apache HBase table.
 - **Bring your own implementation:** You can extend this [public API](https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndex.java) 
 to implement custom indexing.
+- **Bucket Index:** An efficient and lightweight index that locates file groups by hashing record keys. Index lookup is O(1), since the target 
+bucket is derived directly from the key hash and no separate index lookup is needed. However, the number of buckets per partition is statically allocated, so users need to choose it upfront. This index 
+type is best suited for small to medium scale datasets where data is evenly distributed across all partitions and the total data per partition is known upfront
+to a ballpark estimate. This index type is also available with Flink writes. 
+- **Consistent Hashing Index:** This is an advanced version of the Bucket Index, in which the number of buckets can scale up or shrink down based on the load per 
+partition. Users have to declare configurations such as min buckets, max buckets, and how the dynamic scale-up and shrink-down of buckets will happen. This index 
+is available only with MOR tables and has some limitations. Please check 0.13.0 release highlights[ADD link] for more details. 

Review Comment:
   Once the 0.13.0 release highlights have landed, we need to add a link here.
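
For readers of the hunk above, the O(1) lookup described for the Bucket Index can be illustrated with a minimal sketch. This is not Hudi's implementation: the function names `bucket_for_key` and `assign` are hypothetical, and `zlib.crc32` is a stand-in hash chosen only because it is deterministic; Hudi's actual hash function and file group naming may differ.

```python
import zlib


def bucket_for_key(record_key: str, num_buckets: int) -> int:
    """Map a record key to a bucket id by hashing (illustrative only).

    The cost does not depend on how many records are already stored,
    which is the sense in which the lookup is O(1).
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # crc32 is an assumed stand-in, not the hash Hudi uses.
    return zlib.crc32(record_key.encode("utf-8")) % num_buckets


def assign(keys, num_buckets):
    """Group keys by bucket id, the way a writer might route records."""
    buckets = {b: [] for b in range(num_buckets)}
    for key in keys:
        buckets[bucket_for_key(key, num_buckets)].append(key)
    return buckets


if __name__ == "__main__":
    # num_buckets is fixed upfront, mirroring the static allocation above.
    layout = assign([f"uuid-{i}" for i in range(20)], num_buckets=4)
    for bucket_id, members in layout.items():
        print(bucket_id, len(members))
```

Because the bucket count is part of the modulus, changing it remaps most keys, which is one way to see why a statically allocated count has to be estimated upfront and why a consistent hashing variant is attractive.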


