Posted to commits@hudi.apache.org by "codope (via GitHub)" <gi...@apache.org> on 2023/02/15 15:05:05 UTC

[GitHub] [hudi] codope opened a new pull request, #7967: [DOCS] Update metadata indexing doc

codope opened a new pull request, #7967:
URL: https://github.com/apache/hudi/pull/7967

   ### Change Logs
   
   Update metadata indexing docs.
   
   ### Impact
   
   Docs update.
   
   ### Risk level (write none, low, medium, or high below)
   
   low
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] codope commented on pull request #7967: [DOCS] Update metadata indexing doc

Posted by "codope (via GitHub)" <gi...@apache.org>.
codope commented on PR #7967:
URL: https://github.com/apache/hudi/pull/7967#issuecomment-1434577905

   Thanks @nfarah86 for the review. I have resolved your comments.




[GitHub] [hudi] nfarah86 commented on a diff in pull request #7967: [DOCS] Update metadata indexing doc

Posted by "nfarah86 (via GitHub)" <gi...@apache.org>.
nfarah86 commented on code in PR #7967:
URL: https://github.com/apache/hudi/pull/7967#discussion_r1108055435


##########
website/docs/metadata_indexing.md:
##########
@@ -64,8 +74,23 @@ spark-submit \
 From version 0.11.0 onwards, Hudi metadata table is enabled by default and the files index will be automatically created. While the deltastreamer is running in continuous mode, let
 us schedule the indexing for COLUMN_STATS index. First we need to define a properties file for the indexer.
 
+### Configurations
+
+As mentioned before, metadata indexes are pluggable. One can add any index at any point in time depending on changing
+business requirements. Some configurations to enable particular indexes are listed below. Full set of metadata

Review Comment:
   The full set of metadata



##########
website/docs/metadata_indexing.md:
##########
@@ -64,8 +74,23 @@ spark-submit \
 From version 0.11.0 onwards, Hudi metadata table is enabled by default and the files index will be automatically created. While the deltastreamer is running in continuous mode, let
 us schedule the indexing for COLUMN_STATS index. First we need to define a properties file for the indexer.
 
+### Configurations
+
+As mentioned before, metadata indexes are pluggable. One can add any index at any point in time depending on changing
+business requirements. Some configurations to enable particular indexes are listed below. Full set of metadata
+configurations can be explored [here](/docs/configurations/#Metadata-Configs).
+
+
+|Config| Default | Description | Scope | Since Version |
+|---|---|---|---|---|
+| hoodie.metadata.enable | true | Metadata table | Set to false to disable metadata table | 0.7.0 |
+| hoodie.metadata.index.async | false | Metadata table | Enable async indexing of metadata table. | 0.11.0 |
+| hoodie.metadata.index.column.stats.enable | false | Metadata table | Enable indexing column ranges of user data files under metadata table key lookups | 0.11.0 |
+| hoodie.metadata.index.bloom.filter.enable | false | Metadata table | Enable indexing bloom filters of user data files under metadata table | 0.11.0 |
+
 :::note
-Enabling metadata table and configuring a lock provider are the prerequisites for using async indexer.
+Enabling metadata table and configuring a lock provider are the prerequisites for using async indexer. Checkout a sample

Review Comment:
   Enabling the metadata





[GitHub] [hudi] codope merged pull request #7967: [DOCS] Update metadata indexing doc

Posted by "codope (via GitHub)" <gi...@apache.org>.
codope merged PR #7967:
URL: https://github.com/apache/hudi/pull/7967

