Posted to commits@hudi.apache.org by si...@apache.org on 2022/06/23 23:00:12 UTC

[hudi] branch asf-site updated: [DOCS] Add externalized config file doc in latest versions (#5778)

This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 4b8fd7ce3a [DOCS] Add externalized config file doc in latest versions (#5778)
4b8fd7ce3a is described below

commit 4b8fd7ce3a38b74ac20aa992e871e29cda137da2
Author: Sagar Sumit <sa...@gmail.com>
AuthorDate: Fri Jun 24 04:30:05 2022 +0530

    [DOCS] Add externalized config file doc in latest versions (#5778)
---
 website/docs/configurations.md                          | 6 ++++++
 website/docs/quick-start-guide.md                       | 9 ++++++++-
 website/versioned_docs/version-0.10.1/configurations.md | 6 ++++++
 website/versioned_docs/version-0.11.0/configurations.md | 6 ++++++
 4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/website/docs/configurations.md b/website/docs/configurations.md
index 2be4adef60..b92e40f06c 100644
--- a/website/docs/configurations.md
+++ b/website/docs/configurations.md
@@ -17,6 +17,12 @@ This page covers the different ways of configuring your job to write/read Hudi t
 - [**Kafka Connect Configs**](#KAFKA_CONNECT): This set of configs is used by the Kafka Connect Sink Connector for writing Hudi Tables
 - [**Amazon Web Services Configs**](#AWS): Configurations specific to Amazon Web Services
 
+## Externalized Config File
+Instead of directly passing configuration settings to every Hudi job, you can also set them centrally in a configuration
+file `hudi-default.conf`. By default, Hudi loads this configuration file from the `/etc/hudi/conf` directory. You can
+point to a different configuration directory by setting the `HUDI_CONF_DIR` environment variable. This can be
+useful for uniformly enforcing repeated configs (like Hive sync or write/index tuning) across your entire data lake.
+
 ## Spark Datasource Configs {#SPARK_DATASOURCE}
 These configs control the Hudi Spark Datasource, providing the ability to define keys/partitioning, pick the write operation, specify how records are merged, or choose the query type to read.
 
diff --git a/website/docs/quick-start-guide.md b/website/docs/quick-start-guide.md
index 49f133c6f0..acd51ff8cc 100644
--- a/website/docs/quick-start-guide.md
+++ b/website/docs/quick-start-guide.md
@@ -495,7 +495,14 @@ select id, name, price, ts from hudi_mor_tbl;
 
 
 Check out https://hudi.apache.org/blog/2021/02/13/hudi-key-generators for various key generator options, like Timestamp based,
-complex, custom, NonPartitioned Key gen, etc. 
+complex, custom, NonPartitioned Key gen, etc.
+
+
+:::tip
+With an [externalized config file](/docs/next/configurations#externalized-config-file),
+instead of directly passing configuration settings to every Hudi job,
+you can also set them centrally in a configuration file `hudi-default.conf`.
+:::
 
 ## Query data 
 
diff --git a/website/versioned_docs/version-0.10.1/configurations.md b/website/versioned_docs/version-0.10.1/configurations.md
index b7a6aa7291..ba2ae8aed0 100644
--- a/website/versioned_docs/version-0.10.1/configurations.md
+++ b/website/versioned_docs/version-0.10.1/configurations.md
@@ -17,6 +17,12 @@ This page covers the different ways of configuring your job to write/read Hudi t
 - [**Kafka Connect Configs**](#KAFKA_CONNECT): This set of configs is used by the Kafka Connect Sink Connector for writing Hudi Tables
 - [**Amazon Web Services Configs**](#AWS): Configurations specific to Amazon Web Services
 
+## Externalized Config File
+Instead of directly passing configuration settings to every Hudi job, you can also set them centrally in a configuration
+file `hudi-default.conf`. By default, Hudi loads this configuration file from the `/etc/hudi/conf` directory. You can
+point to a different configuration directory by setting the `HUDI_CONF_DIR` environment variable. This can be
+useful for uniformly enforcing repeated configs (like Hive sync or write/index tuning) across your entire data lake.
+
 ## Spark Datasource Configs {#SPARK_DATASOURCE}
 These configs control the Hudi Spark Datasource, providing the ability to define keys/partitioning, pick the write operation, specify how records are merged, or choose the query type to read.
 
diff --git a/website/versioned_docs/version-0.11.0/configurations.md b/website/versioned_docs/version-0.11.0/configurations.md
index 64d906636d..2e03d9f08d 100644
--- a/website/versioned_docs/version-0.11.0/configurations.md
+++ b/website/versioned_docs/version-0.11.0/configurations.md
@@ -17,6 +17,12 @@ This page covers the different ways of configuring your job to write/read Hudi t
 - [**Kafka Connect Configs**](#KAFKA_CONNECT): This set of configs is used by the Kafka Connect Sink Connector for writing Hudi Tables
 - [**Amazon Web Services Configs**](#AWS): Configurations specific to Amazon Web Services
 
+## Externalized Config File
+Instead of directly passing configuration settings to every Hudi job, you can also set them centrally in a configuration
+file `hudi-default.conf`. By default, Hudi loads this configuration file from the `/etc/hudi/conf` directory. You can
+point to a different configuration directory by setting the `HUDI_CONF_DIR` environment variable. This can be
+useful for uniformly enforcing repeated configs (like Hive sync or write/index tuning) across your entire data lake.
+
 ## Spark Datasource Configs {#SPARK_DATASOURCE}
 These configs control the Hudi Spark Datasource, providing the ability to define keys/partitioning, pick the write operation, specify how records are merged, or choose the query type to read.
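
For readers skimming this change, the sketch below illustrates how the externalized config file described in the added section might look. It is illustrative only and not part of this commit: the `key=value` layout and the specific property names (Hive sync and index settings) are assumptions chosen for demonstration, and the file and directory names simply follow the doc text above.

    # hudi-default.conf -- placed under /etc/hudi/conf (the default lookup directory per the doc above)
    # Illustrative properties only; replace with the configs you want applied to every Hudi job.
    hoodie.datasource.hive_sync.enable=true
    hoodie.datasource.hive_sync.mode=hms
    hoodie.index.type=BLOOM

    # Shell: point Hudi at a non-default config directory before launching the job
    export HUDI_CONF_DIR=/opt/hudi/conf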