Posted to commits@hudi.apache.org by bh...@apache.org on 2019/11/14 14:49:08 UTC

[incubator-hudi] branch asf-site updated (a042fb9 -> 98591dd)

This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


    from a042fb9  [SITE] change footer text-align (#1016)
     new b57bbd0  [MINOR] Cosmetic improvements to site
     new 98591dd  [DOCS] Updating site with latest doc changes

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/404.html                                   |   4 +-
 content/admin_guide.html                           |  80 ++--
 content/cn/404.html                                |   2 +-
 content/cn/admin_guide.html                        |  78 ++--
 content/cn/community.html                          |   2 +-
 content/cn/comparison.html                         |   2 +-
 content/cn/concepts.html                           | 242 +++++-----
 content/cn/configurations.html                     | 486 +++++++++++----------
 content/cn/contributing.html                       |   2 +-
 content/cn/docker_demo.html                        |  98 ++---
 content/cn/events/2016-12-30-strata-talk-2017.html |  10 +-
 content/cn/events/2019-01-18-asf-incubation.html   |  10 +-
 content/cn/gcs_hoodie.html                         |   2 +-
 content/cn/index.html                              |  10 +-
 content/cn/migration_guide.html                    |  16 +-
 content/cn/news.html                               |  29 +-
 content/cn/news_archive.html                       |   2 +-
 content/cn/performance.html                        |  51 +--
 content/cn/powered_by.html                         |   2 +-
 content/cn/privacy.html                            |   2 +-
 content/cn/querying_data.html                      | 186 ++++----
 content/cn/quickstart.html                         |  54 +--
 content/cn/s3_hoodie.html                          |  64 +--
 content/cn/use_cases.html                          |   2 +-
 content/cn/writing_data.html                       |  26 +-
 content/community.html                             |   4 +-
 content/comparison.html                            |   4 +-
 content/concepts.html                              |   4 +-
 content/configurations.html                        |  12 +-
 content/contributing.html                          |   4 +-
 content/css/lavish-bootstrap.css                   |   7 +-
 content/docker_demo.html                           | 130 +++---
 content/events/2016-12-30-strata-talk-2017.html    |  12 +-
 content/events/2019-01-18-asf-incubation.html      |  12 +-
 content/feed.xml                                   |   8 +-
 content/gcs_hoodie.html                            |   4 +-
 content/index.html                                 |  12 +-
 content/js/mydoc_scroll.html                       |  10 +-
 content/migration_guide.html                       |  22 +-
 content/news.html                                  |  29 +-
 content/news_archive.html                          |   5 +-
 content/performance.html                           |   4 +-
 content/powered_by.html                            |   4 +-
 content/privacy.html                               |   4 +-
 content/querying_data.html                         |  20 +-
 content/quickstart.html                            |  60 +--
 content/releases.html                              |   2 +-
 content/s3_hoodie.html                             |  66 +--
 content/search.json                                |  58 ++-
 content/sitemap.xml                                |  92 ++--
 content/use_cases.html                             |   4 +-
 content/writing_data.html                          |  28 +-
 docs/_data/sidebars/mydoc_sidebar.yml              |   4 +-
 docs/admin_guide.cn.md                             |  38 +-
 docs/admin_guide.md                                |  38 +-
 docs/configurations.cn.md                          |   4 +-
 docs/configurations.md                             |   4 +-
 docs/css/lavish-bootstrap.css                      |   7 +-
 docs/docker_demo.cn.md                             |  44 +-
 docs/docker_demo.md                                |  66 ++-
 docs/migration_guide.cn.md                         |   9 +-
 docs/migration_guide.md                            |  15 +-
 docs/querying_data.cn.md                           |   8 +-
 docs/querying_data.md                              |   8 +-
 docs/quickstart.cn.md                              |  24 +-
 docs/quickstart.md                                 |  25 +-
 docs/s3_filesystem.cn.md                           |   4 +-
 docs/s3_filesystem.md                              |   4 +-
 docs/writing_data.cn.md                            |  12 +-
 docs/writing_data.md                               |  12 +-
 70 files changed, 1154 insertions(+), 1255 deletions(-)


[incubator-hudi] 01/02: [MINOR] Cosmetic improvements to site

Posted by bh...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit b57bbd0228a17b3571708f17c88ceed192d18b21
Author: vinothchandar <vi...@apache.org>
AuthorDate: Thu Nov 14 06:01:34 2019 -0800

    [MINOR] Cosmetic improvements to site
    
     - Clearer code highlighting, using black font, light blue background
     - Fix version number
     - Fix nav header scroll,scrambled characters issue
---
 docs/_data/sidebars/mydoc_sidebar.yml |  4 +--
 docs/admin_guide.cn.md                | 38 ++++++++++----------
 docs/admin_guide.md                   | 38 ++++++++++----------
 docs/configurations.cn.md             |  4 +--
 docs/configurations.md                |  4 +--
 docs/css/lavish-bootstrap.css         |  7 ++--
 docs/docker_demo.cn.md                | 44 +++++++++++------------
 docs/docker_demo.md                   | 66 +++++++++++++++--------------------
 docs/migration_guide.cn.md            |  9 +++--
 docs/migration_guide.md               | 15 ++++----
 docs/querying_data.cn.md              |  8 ++---
 docs/querying_data.md                 |  8 ++---
 docs/quickstart.cn.md                 | 24 ++++++-------
 docs/quickstart.md                    | 25 +++++++------
 docs/s3_filesystem.cn.md              |  4 +--
 docs/s3_filesystem.md                 |  4 +--
 docs/writing_data.cn.md               | 12 +++----
 docs/writing_data.md                  | 12 +++----
 18 files changed, 161 insertions(+), 165 deletions(-)

diff --git a/docs/_data/sidebars/mydoc_sidebar.yml b/docs/_data/sidebars/mydoc_sidebar.yml
index 9e4ec1e..040c4c4 100644
--- a/docs/_data/sidebars/mydoc_sidebar.yml
+++ b/docs/_data/sidebars/mydoc_sidebar.yml
@@ -2,8 +2,8 @@
 
 entries:
 - title: sidebar
-  product: Latest Version
-  version: 0.5.0-incubating
+  product: Version
+  version: (0.5.0-incubating)
   folders:
 
   - title: Getting Started
diff --git a/docs/admin_guide.cn.md b/docs/admin_guide.cn.md
index 2ba04a5..9e4f542 100644
--- a/docs/admin_guide.cn.md
+++ b/docs/admin_guide.cn.md
@@ -23,7 +23,7 @@ Hudi库使用.hoodie子文件夹跟踪所有元数据,从而有效地在内部
 
 初始化hudi表,可使用如下命令。
 
-```
+```Java
 18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
 ============================================
 *                                          *
@@ -44,7 +44,7 @@ hudi->create --path /user/hive/warehouse/table1 --tableName hoodie_table_1 --tab
 
 To see the description of hudi table, use the command:
 
-```
+```Java
 hoodie:hoodie_table_1->desc
 18/09/06 15:57:19 INFO timeline.HoodieActiveTimeline: Loaded instants []
     _________________________________________________________
@@ -60,7 +60,7 @@ hoodie:hoodie_table_1->desc
 
 以下是连接到包含uber trips的Hudi数据集的示例命令。
 
-```
+```Java
 hoodie:trips->connect --path /app/uber/trips
 
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: Attempting to load the commits under /app/uber/trips/.hoodie with suffix .commit
@@ -73,7 +73,7 @@ hoodie:trips->
 连接到数据集后,便可使用许多其他命令。该shell程序具有上下文自动完成帮助(按TAB键),下面是所有命令的列表,本节中对其中的一些命令进行了详细示例。
 
 
-```
+```Java
 hoodie:trips->help
 * ! - Allows execution of operating system (OS) commands
 * // - Inline comment markers (start of line only)
@@ -114,7 +114,7 @@ hoodie:trips->
 查看有关最近10次提交的一些基本信息,
 
 
-```
+```Java
 hoodie:trips->commits show --sortBy "Total Bytes Written" --desc true --limit 10
     ________________________________________________________________________________________________________________________________________________________________________
     | CommitTime    | Total Bytes Written| Total Files Added| Total Files Updated| Total Partitions Written| Total Records Written| Total Update Records Written| Total Errors|
@@ -127,7 +127,7 @@ hoodie:trips->
 
 在每次写入开始时,Hudi还将.inflight提交写入.hoodie文件夹。您可以使用那里的时间戳来估计正在进行的提交已经花费的时间
 
-```
+```Java
 $ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
 -rw-r--r--   3 vinoth supergroup     321984 2016-10-05 23:18 /app/uber/trips/.hoodie/20161005225920.inflight
 ```
@@ -138,7 +138,7 @@ $ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
 了解写入如何分散到特定分区,
 
 
-```
+```Java
 hoodie:trips->commit showpartitions --commit 20161005165855 --sortBy "Total Bytes Written" --desc true --limit 10
     __________________________________________________________________________________________________________________________________________
     | Partition Path| Total Files Added| Total Files Updated| Total Records Inserted| Total Records Updated| Total Bytes Written| Total Errors|
@@ -149,7 +149,7 @@ hoodie:trips->commit showpartitions --commit 20161005165855 --sortBy "Total Byte
 
 如果您需要文件级粒度,我们可以执行以下操作
 
-```
+```Java
 hoodie:trips->commit showfiles --commit 20161005165855 --sortBy "Partition Path"
     ________________________________________________________________________________________________________________________________________________________
     | Partition Path| File ID                             | Previous Commit| Total Records Updated| Total Records Written| Total Bytes Written| Total Errors|
@@ -163,7 +163,7 @@ hoodie:trips->commit showfiles --commit 20161005165855 --sortBy "Partition Path"
 
 Hudi将每个分区视为文件组的集合,每个文件组包含按提交顺序排列的文件切片列表(请参阅概念)。以下命令允许用户查看数据集的文件切片。
 
-```
+```Java
  hoodie:stock_ticks_mor->show fsview all
  ....
   _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
@@ -189,7 +189,7 @@ Hudi将每个分区视为文件组的集合,每个文件组包含按提交顺
 由于Hudi直接管理DFS数据集的文件大小,这些信息会帮助你全面了解Hudi的运行状况
 
 
-```
+```Java
 hoodie:trips->stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc true --limit 10
     ________________________________________________________________________________________________
     | CommitTime    | Min     | 10th    | 50th    | avg     | 95th    | Max     | NumFiles| StdDev  |
@@ -201,7 +201,7 @@ hoodie:trips->stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc
 
 如果Hudi写入花费的时间更长,那么可以通过观察写放大指标来发现任何异常
 
-```
+```Java
 hoodie:trips->stats wa
     __________________________________________________________________________
     | CommitTime    | Total Upserted| Total Written| Write Amplifiation Factor|
@@ -220,7 +220,7 @@ hoodie:trips->stats wa
 
 要了解压缩和写程序之间的时滞,请使用以下命令列出所有待处理的压缩。
 
-```
+```Java
 hoodie:trips->compactions show all
      ___________________________________________________________________
     | Compaction Instant Time| State    | Total FileIds to be Compacted|
@@ -231,7 +231,7 @@ hoodie:trips->compactions show all
 
 要检查特定的压缩计划,请使用
 
-```
+```Java
 hoodie:trips->compaction show --instant <INSTANT_1>
     _________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
     | Partition Path| File Id | Base Instant  | Data File Path                                    | Total Delta Files| getMetrics                                                                                                                    |
@@ -243,7 +243,7 @@ hoodie:trips->compaction show --instant <INSTANT_1>
 要手动调度或运行压缩,请使用以下命令。该命令使用spark启动器执行压缩操作。
 注意:确保没有其他应用程序正在同时调度此数据集的压缩
 
-```
+```Java
 hoodie:trips->help compaction schedule
 Keyword:                   compaction schedule
 Description:               Schedule Compaction
@@ -256,7 +256,7 @@ Description:               Schedule Compaction
 * compaction schedule - Schedule Compaction
 ```
 
-```
+```Java
 hoodie:trips->help compaction run
 Keyword:                   compaction run
 Description:               Run Compaction for given instant time
@@ -303,7 +303,7 @@ Description:               Run Compaction for given instant time
 
 验证压缩计划:检查压缩所需的所有文件是否都存在且有效
 
-```
+```Java
 hoodie:stock_ticks_mor->compaction validate --instant 20181005222611
 ...
 
@@ -336,7 +336,7 @@ hoodie:stock_ticks_mor->compaction validate --instant 20181005222601
 
 ##### 取消调度压缩
 
-```
+```Java
 hoodie:trips->compaction unscheduleFileId --fileId <FileUUID>
 ....
 No File renames needed to unschedule file from pending compaction. Operation successful.
@@ -344,7 +344,7 @@ No File renames needed to unschedule file from pending compaction. Operation suc
 
 在其他情况下,需要撤销整个压缩计划。以下CLI支持此功能
 
-```
+```Java
 hoodie:trips->compaction unschedule --compactionInstant <compactionInstant>
 .....
 No File renames needed to unschedule pending compaction. Operation successful.
@@ -357,7 +357,7 @@ No File renames needed to unschedule pending compaction. Operation successful.
 当您运行`压缩验证`时,您会注意到无效的压缩操作(如果有的话)。
 在这种情况下,修复命令将立即执行,它将重新排列文件切片,以使文件不丢失,并且文件切片与压缩计划一致
 
-```
+```Java
 hoodie:stock_ticks_mor->compaction repair --instant 20181005222611
 ......
 Compaction successfully repaired
diff --git a/docs/admin_guide.md b/docs/admin_guide.md
index 96ff639..4d267cb 100644
--- a/docs/admin_guide.md
+++ b/docs/admin_guide.md
@@ -23,7 +23,7 @@ Hudi library effectively manages this dataset internally, using .hoodie subfolde
 
 To initialize a hudi table, use the following command.
 
-```
+```Java
 18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
 ============================================
 *                                          *
@@ -44,7 +44,7 @@ hudi->create --path /user/hive/warehouse/table1 --tableName hoodie_table_1 --tab
 
 To see the description of hudi table, use the command:
 
-```
+```Java
 hoodie:hoodie_table_1->desc
 18/09/06 15:57:19 INFO timeline.HoodieActiveTimeline: Loaded instants []
     _________________________________________________________
@@ -60,7 +60,7 @@ hoodie:hoodie_table_1->desc
 
 Following is a sample command to connect to a Hudi dataset containing uber trips.
 
-```
+```Java
 hoodie:trips->connect --path /app/uber/trips
 
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: Attempting to load the commits under /app/uber/trips/.hoodie with suffix .commit
@@ -74,7 +74,7 @@ Once connected to the dataset, a lot of other commands become available. The she
 are reviewed
 
 
-```
+```Java
 hoodie:trips->help
 * ! - Allows execution of operating system (OS) commands
 * // - Inline comment markers (start of line only)
@@ -115,7 +115,7 @@ Each commit has a monotonically increasing string/number called the **commit num
 To view some basic information about the last 10 commits,
 
 
-```
+```Java
 hoodie:trips->commits show --sortBy "Total Bytes Written" --desc true --limit 10
     ________________________________________________________________________________________________________________________________________________________________________
     | CommitTime    | Total Bytes Written| Total Files Added| Total Files Updated| Total Partitions Written| Total Records Written| Total Update Records Written| Total Errors|
@@ -129,7 +129,7 @@ hoodie:trips->
 At the start of each write, Hudi also writes a .inflight commit to the .hoodie folder. You can use the timestamp there to estimate how long the commit has been inflight
 
 
-```
+```Java
 $ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
 -rw-r--r--   3 vinoth supergroup     321984 2016-10-05 23:18 /app/uber/trips/.hoodie/20161005225920.inflight
 ```
@@ -140,7 +140,7 @@ $ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
 To understand how the writes spread across specific partitions,
 
 
-```
+```Java
 hoodie:trips->commit showpartitions --commit 20161005165855 --sortBy "Total Bytes Written" --desc true --limit 10
     __________________________________________________________________________________________________________________________________________
     | Partition Path| Total Files Added| Total Files Updated| Total Records Inserted| Total Records Updated| Total Bytes Written| Total Errors|
@@ -152,7 +152,7 @@ hoodie:trips->commit showpartitions --commit 20161005165855 --sortBy "Total Byte
 If you need file-level granularity, we can do the following
 
 
-```
+```Java
 hoodie:trips->commit showfiles --commit 20161005165855 --sortBy "Partition Path"
     ________________________________________________________________________________________________________________________________________________________
     | Partition Path| File ID                             | Previous Commit| Total Records Updated| Total Records Written| Total Bytes Written| Total Errors|
@@ -167,7 +167,7 @@ hoodie:trips->commit showfiles --commit 20161005165855 --sortBy "Partition Path"
 Hudi views each partition as a collection of file-groups with each file-group containing a list of file-slices in commit
 order (See Concepts). The below commands allow users to view the file-slices for a data-set.
 
-```
+```Java
  hoodie:stock_ticks_mor->show fsview all
  ....
   _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
@@ -193,7 +193,7 @@ order (See Concepts). The below commands allow users to view the file-slices for
 Since Hudi directly manages file sizes for the DFS dataset, it might be good to get an overall picture
 
 
-```
+```Java
 hoodie:trips->stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc true --limit 10
     ________________________________________________________________________________________________
     | CommitTime    | Min     | 10th    | 50th    | avg     | 95th    | Max     | NumFiles| StdDev  |
@@ -206,7 +206,7 @@ hoodie:trips->stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc
 In case a Hudi write is taking much longer, it might be good to check the write amplification for any sudden increases
 
 
-```
+```Java
 hoodie:trips->stats wa
     __________________________________________________________________________
     | CommitTime    | Total Upserted| Total Written| Write Amplifiation Factor|
@@ -227,7 +227,7 @@ This is a sequence file that contains a mapping from commitNumber => json with r
 To get an idea of the lag between compaction and writer applications, use the below command to list down all
 pending compactions.
 
-```
+```Java
 hoodie:trips->compactions show all
      ___________________________________________________________________
     | Compaction Instant Time| State    | Total FileIds to be Compacted|
@@ -238,7 +238,7 @@ hoodie:trips->compactions show all
 
 To inspect a specific compaction plan, use
 
-```
+```Java
 hoodie:trips->compaction show --instant <INSTANT_1>
     _________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
     | Partition Path| File Id | Base Instant  | Data File Path                                    | Total Delta Files| getMetrics                                                                                                                    |
@@ -250,7 +250,7 @@ hoodie:trips->compaction show --instant <INSTANT_1>
 To manually schedule or run a compaction, use the below command. This command uses spark launcher to perform compaction
 operations. NOTE : Make sure no other application is scheduling compaction for this dataset concurrently
 
-```
+```Java
 hoodie:trips->help compaction schedule
 Keyword:                   compaction schedule
 Description:               Schedule Compaction
@@ -263,7 +263,7 @@ Description:               Schedule Compaction
 * compaction schedule - Schedule Compaction
 ```
 
-```
+```Java
 hoodie:trips->help compaction run
 Keyword:                   compaction run
 Description:               Run Compaction for given instant time
@@ -310,7 +310,7 @@ Description:               Run Compaction for given instant time
 
 Validating a compaction plan: Check if all the files necessary for compaction are present and valid
 
-```
+```Java
 hoodie:stock_ticks_mor->compaction validate --instant 20181005222611
 ...
 
@@ -344,7 +344,7 @@ so that are preserved. Hudi provides the following CLI to support it
 
 ##### UnScheduling Compaction
 
-```
+```Java
 hoodie:trips->compaction unscheduleFileId --fileId <FileUUID>
 ....
 No File renames needed to unschedule file from pending compaction. Operation successful.
@@ -352,7 +352,7 @@ No File renames needed to unschedule file from pending compaction. Operation suc
 
 In other cases, an entire compaction plan needs to be reverted. This is supported by the following CLI
 
-```
+```Java
 hoodie:trips->compaction unschedule --compactionInstant <compactionInstant>
 .....
 No File renames needed to unschedule pending compaction. Operation successful.
@@ -366,7 +366,7 @@ partial failures, the compaction operation could become inconsistent with the st
 command comes to the rescue, it will rearrange the file-slices so that there is no loss and the file-slices are
 consistent with the compaction plan
 
-```
+```Java
 hoodie:stock_ticks_mor->compaction repair --instant 20181005222611
 ......
 Compaction successfully repaired
diff --git a/docs/configurations.cn.md b/docs/configurations.cn.md
index 8dcb34a..7b7397d 100644
--- a/docs/configurations.cn.md
+++ b/docs/configurations.cn.md
@@ -37,7 +37,7 @@ summary: 在这里,我们列出了所有可能的配置及其含义。
 
 另外,您可以使用`options()`或`option(k,v)`方法直接传递任何WriteClient级别的配置。
 
-```
+```Java
 inputDF.write()
 .format("org.apache.hudi")
 .options(clientOpts) // 任何Hudi客户端选项都可以传入
@@ -159,7 +159,7 @@ inputDF.write()
 直接使用RDD级别api进行编程的Jobs可以构建一个`HoodieWriteConfig`对象,并将其传递给`HoodieWriteClient`构造函数。
 HoodieWriteConfig可以使用以下构建器模式构建。
 
-```
+```Java
 HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
         .withPath(basePath)
         .forTable(tableName)
diff --git a/docs/configurations.md b/docs/configurations.md
index 3f16e3b..3e303c1 100644
--- a/docs/configurations.md
+++ b/docs/configurations.md
@@ -39,7 +39,7 @@ The actual datasource level configs are listed below.
 
 Additionally, you can pass down any of the WriteClient level configs directly using `options()` or `option(k,v)` methods.
 
-```
+```Java
 inputDF.write()
 .format("org.apache.hudi")
 .options(clientOpts) // any of the Hudi client opts can be passed in as well
@@ -164,7 +164,7 @@ Property: `hoodie.datasource.read.end.instanttime`, Default: latest instant (i.e
 Jobs programming directly against the RDD level apis can build a `HoodieWriteConfig` object and pass it in to the `HoodieWriteClient` constructor. 
 HoodieWriteConfig can be built using a builder pattern as below. 
 
-```
+```Java
 HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
         .withPath(basePath)
         .forTable(tableName)
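
Note: the hunk above cuts the builder off after `forTable`. As a point of reference, a minimal sketch of how such a `HoodieWriteConfig` is typically completed and handed to a write client is shown below; the builder methods beyond `withPath`/`forTable` and the package names are assumptions, not part of this commit.

```Java
// Sketch only: builder methods beyond withPath/forTable and the package layout are
// assumptions; see the configurations page for the authoritative list of options.
import org.apache.hudi.HoodieWriteClient;
import org.apache.hudi.config.HoodieWriteConfig;
import org.apache.spark.api.java.JavaSparkContext;

public class WriteClientExample {
  public static HoodieWriteClient buildClient(JavaSparkContext jsc, String basePath,
                                              String tableName, String schemaStr) {
    HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
        .withPath(basePath)      // base path of the Hudi dataset on DFS
        .forTable(tableName)     // table name tracked under .hoodie metadata
        .withSchema(schemaStr)   // Avro schema of the records being written
        .withParallelism(2, 2)   // insert/upsert shuffle parallelism
        .build();
    return new HoodieWriteClient(jsc, cfg); // client used for upsert/insert calls
  }
}
```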
diff --git a/docs/css/lavish-bootstrap.css b/docs/css/lavish-bootstrap.css
index a050c9a..6a0f52f 100644
--- a/docs/css/lavish-bootstrap.css
+++ b/docs/css/lavish-bootstrap.css
@@ -600,7 +600,7 @@ code {
   padding: 2px 4px;
   font-size: 90%;
   color: #444;
-  background-color: #f0f0f0;
+  background-color: #04b3f90d;
   white-space: nowrap;
   border-radius: 4px;
 }
@@ -613,8 +613,8 @@ pre {
   line-height: 1.428571429;
   word-break: break-all;
   word-wrap: break-word;
-  color: #77777a;
-  background-color: #f5f5f5;
+  color: #000000;
+  background-color: #04b3f90d;
   border: 1px solid #cccccc;
   border-radius: 4px;
 }
@@ -3730,6 +3730,7 @@ textarea.input-group-sm > .input-group-btn > .btn {
   }
   .navbar-right {
     float: right !important;
+    background-color: white;
   }
 }
 .navbar-form {
diff --git a/docs/docker_demo.cn.md b/docs/docker_demo.cn.md
index 6f3d72b..83868fb 100644
--- a/docs/docker_demo.cn.md
+++ b/docs/docker_demo.cn.md
@@ -23,7 +23,7 @@ The steps have been tested on a Mac laptop
   * /etc/hosts : The demo references many services running in containers by hostname. Add the following settings to /etc/hosts
 
 
-```
+```Java
    127.0.0.1 adhoc-1
    127.0.0.1 adhoc-2
    127.0.0.1 namenode
@@ -44,7 +44,7 @@ Also, this has not been tested on some environments like Docker on Windows.
 #### Build Hudi
 
 The first step is to build hudi
-```
+```Java
 cd <HUDI_WORKSPACE>
 mvn package -DskipTests
 ```
@@ -54,7 +54,7 @@ mvn package -DskipTests
 The next step is to run the docker compose script and set up configs for bringing up the cluster.
 This should pull the docker images from docker hub and set up the docker cluster.
 
-```
+```Java
 cd docker
 ./setup_demo.sh
 ....
@@ -107,7 +107,7 @@ The batches are windowed intentionally so that the second batch contains updates
 
 Upload the first batch to Kafka topic 'stock ticks'
 
-```
+```Java
 cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P
 
 To check if the new topic shows up, use
@@ -158,7 +158,7 @@ pull changes and apply to Hudi dataset using upsert/insert primitives. Here, we
 json data from kafka topic and ingest to both COW and MOR tables we initialized in the previous step. This tool
 automatically initializes the datasets in the file-system if they do not exist yet.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 
 # Run the following spark-submit command to execute the delta-streamer and ingest to stock_ticks_cow dataset in HDFS
@@ -198,7 +198,7 @@ There will be a similar setup when you browse the MOR dataset
 At this step, the datasets are available in HDFS. We need to sync with Hive to create new Hive tables and add partitions
 in order to run Hive queries against those datasets.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 
 # This command takes in the HiveServer URL and the COW Hudi dataset location in HDFS and syncs the HDFS state to Hive
@@ -229,7 +229,7 @@ Run a hive query to find the latest timestamp ingested for stock symbol 'GOOG'.
 (for both COW and MOR datasets) and realtime views (for MOR dataset) give the same value "10:29 a.m" as Hudi creates a
 parquet file for the first batch of data.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 # List Tables
@@ -332,7 +332,7 @@ exit
 Hudi supports Spark as a query processor just like Hive. Here are the same Hive queries
 running in spark-sql
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --master local[2] --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 ...
@@ -432,7 +432,7 @@ scala> spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, close
 Upload the second batch of data and ingest this batch using delta-streamer. As this batch does not bring in any new
 partitions, there is no need to run hive-sync
 
-```
+```Java
 cat docker/demo/data/batch_2.json | kafkacat -b kafkabroker -t stock_ticks -P
 
 # Within Docker container, run the ingestion command
@@ -464,7 +464,7 @@ This is the time, when ReadOptimized and Realtime views will provide different r
 return "10:29 am" as it will only read from the Parquet file. Realtime View will do on-the-fly merge and return
 latest committed data which is "10:59 a.m".
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
@@ -535,7 +535,7 @@ exit
 
 Running the same queries in Spark-SQL:
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
@@ -605,7 +605,7 @@ With 2 batches of data ingested, lets showcase the support for incremental queri
 
 Let's take the same projection query example
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
@@ -629,7 +629,7 @@ the commit time of the first batch (20180924064621) and run incremental query
 Hudi incremental mode provides efficient scanning for incremental queries by filtering out files that do not have any
 candidate rows using hudi-managed metadata.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 0: jdbc:hive2://hiveserver:10000> set hoodie.stock_ticks_cow.consume.mode=INCREMENTAL;
@@ -642,7 +642,7 @@ No rows affected (0.009 seconds)
 With the above setting, file-ids that do not have any updates from the commit 20180924065039 are filtered out without scanning.
 Here is the incremental query :
 
-```
+```Java
 0: jdbc:hive2://hiveserver:10000>
 0: jdbc:hive2://hiveserver:10000> select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG' and `_hoodie_commit_time` > '20180924064621';
 +----------------------+---------+----------------------+---------+------------+-----------+--+
@@ -655,7 +655,7 @@ Here is the incremental query :
 ```
 
 ##### Incremental Query with Spark SQL:
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 Welcome to
@@ -697,7 +697,7 @@ scala> spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, close
 Let's schedule and run a compaction to create a new version of the columnar file so that read-optimized readers will see fresher data.
 Again, you can use the Hudi CLI to manually schedule and run compaction
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 root@adhoc-1:/opt#   /var/hoodie/ws/hudi-cli/hudi-cli.sh
 ============================================
@@ -790,7 +790,7 @@ Lets also run the incremental query for MOR table.
 From looking at the below query output, it will be clear that the first commit time for the MOR table is 20180924064636
 and the second commit time is 20180924070031
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
@@ -851,7 +851,7 @@ exit
 
 ##### Read Optimized and Realtime Views for MOR with Spark-SQL after compaction
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
@@ -895,7 +895,7 @@ This brings the demo to an end.
 ## Testing Hudi in Local Docker environment
 
 You can bring up a hadoop docker environment containing Hadoop, Hive and Spark services with support for hudi.
-```
+```Java
 $ mvn pre-integration-test -DskipTests
 ```
 The above command builds docker images for all the services with
@@ -903,13 +903,13 @@ current Hudi source installed at /var/hoodie/ws and also brings up the services
 currently use Hadoop (v2.8.4), Hive (v2.3.3) and Spark (v2.3.1) in docker images.
 
 To bring down the containers
-```
+```Java
 $ cd hudi-integ-test
 $ mvn docker-compose:down
 ```
 
 If you want to bring up the docker containers, use
-```
+```Java
 $ cd hudi-integ-test
 $  mvn docker-compose:up -DdetachedMode=true
 ```
@@ -937,7 +937,7 @@ run the script
 
 Here are the commands:
 
-```
+```Java
 cd docker
 ./build_local_docker_images.sh
 .....
diff --git a/docs/docker_demo.md b/docs/docker_demo.md
index 5628e5b..ef80794 100644
--- a/docs/docker_demo.md
+++ b/docs/docker_demo.md
@@ -23,7 +23,7 @@ The steps have been tested on a Mac laptop
   * /etc/hosts : The demo references many services running in containers by hostname. Add the following settings to /etc/hosts
 
 
-```
+```Java
    127.0.0.1 adhoc-1
    127.0.0.1 adhoc-2
    127.0.0.1 namenode
@@ -44,7 +44,7 @@ Also, this has not been tested on some environments like Docker on Windows.
 #### Build Hudi
 
 The first step is to build hudi
-```
+```Java
 cd <HUDI_WORKSPACE>
 mvn package -DskipTests
 ```
@@ -54,7 +54,7 @@ mvn package -DskipTests
 The next step is to run the docker compose script and set up configs for bringing up the cluster.
 This should pull the docker images from docker hub and set up the docker cluster.
 
-```
+```Java
 cd docker
 ./setup_demo.sh
 ....
@@ -84,7 +84,7 @@ Creating spark-worker-1            ... done
 Copying spark default config and setting up configs
 Copying spark default config and setting up configs
 Copying spark default config and setting up configs
-varadarb-C02SG7Q3G8WP:docker varadarb$ docker ps
+$ docker ps
 ```
 
 At this point, the docker cluster will be up and running. The demo cluster brings up the following services
@@ -107,12 +107,10 @@ The batches are windowed intentionally so that the second batch contains updates
 
 #### Step 1 : Publish the first batch to Kafka
 
-Upload the first batch to Kafka topic 'stock ticks'
-
-```
-cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P
+Upload the first batch to Kafka topic 'stock ticks' `cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P`
 
 To check if the new topic shows up, use
+```Java
 kafkacat -b kafkabroker -L -J | jq .
 {
   "originating_broker": {
@@ -160,24 +158,16 @@ pull changes and apply to Hudi dataset using upsert/insert primitives. Here, we
 json data from kafka topic and ingest to both COW and MOR tables we initialized in the previous step. This tool
 automatically initializes the datasets in the file-system if they do not exist yet.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 
 # Run the following spark-submit command to execute the delta-streamer and ingest to stock_ticks_cow dataset in HDFS
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type COPY_ON_WRITE --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_cow --target-table stock_ticks_cow --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
-....
-....
-2018-09-24 22:20:00 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
-2018-09-24 22:20:00 INFO  SparkContext:54 - Successfully stopped SparkContext
-
 
 
 # Run the following spark-submit command to execute the delta-streamer and ingest to stock_ticks_mor dataset in HDFS
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type MERGE_ON_READ --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_mor --target-table stock_ticks_mor --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider --disable-compaction
-....
-2018-09-24 22:22:01 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
-2018-09-24 22:22:01 INFO  SparkContext:54 - Successfully stopped SparkContext
-....
+
 
 # As part of the setup (Look at setup_demo.sh), the configs needed for DeltaStreamer is uploaded to HDFS. The configs
 # contain mostly Kafa connectivity settings, the avro-schema to be used for ingesting along with key and partitioning fields.
@@ -200,7 +190,7 @@ There will be a similar setup when you browse the MOR dataset
 At this step, the datasets are available in HDFS. We need to sync with Hive to create new Hive tables and add partitions
 in order to run Hive queries against those datasets.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 
 # This command takes in the HiveServer URL and the COW Hudi dataset location in HDFS and syncs the HDFS state to Hive
@@ -231,7 +221,7 @@ Run a hive query to find the latest timestamp ingested for stock symbol 'GOOG'.
 (for both COW and MOR datasets) and realtime views (for MOR dataset) give the same value "10:29 a.m" as Hudi creates a
 parquet file for the first batch of data.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 # List Tables
@@ -334,7 +324,7 @@ exit
 Hudi supports Spark as a query processor just like Hive. Here are the same Hive queries
 running in spark-sql
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --master local[2] --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 ...
@@ -432,7 +422,7 @@ scala> spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, close
 
 Here are the Presto queries for similar Hive and Spark queries. Currently, Hudi does not support Presto queries on realtime views.
 
-```
+```Java
 docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
 presto> show catalogs;
   Catalog
@@ -524,7 +514,7 @@ presto:default> exit
 Upload the second batch of data and ingest this batch using delta-streamer. As this batch does not bring in any new
 partitions, there is no need to run hive-sync
 
-```
+```Java
 cat docker/demo/data/batch_2.json | kafkacat -b kafkabroker -t stock_ticks -P
 
 # Within Docker container, run the ingestion command
@@ -556,7 +546,7 @@ This is the time, when ReadOptimized and Realtime views will provide different r
 return "10:29 am" as it will only read from the Parquet file. Realtime View will do on-the-fly merge and return
 latest committed data which is "10:59 a.m".
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
@@ -627,7 +617,7 @@ exit
 
 Running the same queries in Spark-SQL:
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
@@ -696,7 +686,7 @@ exit
 Running the same queries on Presto for ReadOptimized views. 
 
 
-```
+```Java
 docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
 presto> use hive.default;
 USE
@@ -761,7 +751,7 @@ With 2 batches of data ingested, lets showcase the support for incremental queri
 
 Let's take the same projection query example
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
@@ -785,7 +775,7 @@ the commit time of the first batch (20180924064621) and run incremental query
 Hudi incremental mode provides efficient scanning for incremental queries by filtering out files that do not have any
 candidate rows using hudi-managed metadata.
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 0: jdbc:hive2://hiveserver:10000> set hoodie.stock_ticks_cow.consume.mode=INCREMENTAL;
@@ -798,7 +788,7 @@ No rows affected (0.009 seconds)
 With the above setting, file-ids that do not have any updates from the commit 20180924065039 are filtered out without scanning.
 Here is the incremental query :
 
-```
+```Java
 0: jdbc:hive2://hiveserver:10000>
 0: jdbc:hive2://hiveserver:10000> select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG' and `_hoodie_commit_time` > '20180924064621';
 +----------------------+---------+----------------------+---------+------------+-----------+--+
@@ -811,7 +801,7 @@ Here is the incremental query :
 ```
 
 ##### Incremental Query with Spark SQL:
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 Welcome to
@@ -853,7 +843,7 @@ scala> spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, close
 Let's schedule and run a compaction to create a new version of the columnar file so that read-optimized readers will see fresher data.
 Again, you can use the Hudi CLI to manually schedule and run compaction
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 root@adhoc-1:/opt#   /var/hoodie/ws/hudi-cli/hudi-cli.sh
 ============================================
@@ -946,7 +936,7 @@ Lets also run the incremental query for MOR table.
 From looking at the below query output, it will be clear that the first commit time for the MOR table is 20180924064636
 and the second commit time is 20180924070031
 
-```
+```Java
 docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
@@ -1007,7 +997,7 @@ exit
 
 ##### Step 10: Read Optimized and Realtime Views for MOR with Spark-SQL after compaction
 
-```
+```Java
 docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
@@ -1047,7 +1037,7 @@ scala> spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, close
 
 ##### Step 11:  Presto queries over Read Optimized View on MOR dataset after compaction
 
-```
+```Java
 docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
 presto> use hive.default;
 USE
@@ -1084,7 +1074,7 @@ This brings the demo to an end.
 ## Testing Hudi in Local Docker environment
 
 You can bring up a hadoop docker environment containing Hadoop, Hive and Spark services with support for hudi.
-```
+```Java
 $ mvn pre-integration-test -DskipTests
 ```
 The above command builds docker images for all the services with
@@ -1092,13 +1082,13 @@ current Hudi source installed at /var/hoodie/ws and also brings up the services
 currently use Hadoop (v2.8.4), Hive (v2.3.3) and Spark (v2.3.1) in docker images.
 
 To bring down the containers
-```
+```Java
 $ cd hudi-integ-test
 $ mvn docker-compose:down
 ```
 
 If you want to bring up the docker containers, use
-```
+```Java
 $ cd hudi-integ-test
 $  mvn docker-compose:up -DdetachedMode=true
 ```
@@ -1126,7 +1116,7 @@ run the script
 
 Here are the commands:
 
-```
+```Java
 cd docker
 ./build_local_docker_images.sh
 .....
diff --git a/docs/migration_guide.cn.md b/docs/migration_guide.cn.md
index 6f3ed59..ba46781 100644
--- a/docs/migration_guide.cn.md
+++ b/docs/migration_guide.cn.md
@@ -42,16 +42,19 @@ Use the HDFSParquetImporter tool. As the name suggests, this only works if your
 This tool essentially starts a Spark Job to read the existing parquet dataset and converts it into a HUDI managed dataset by re-writing all the data.
 
 #### Option 2
-For huge datasets, this could be as simple as : for partition in [list of partitions in source dataset] {
+For huge datasets, this could be as simple as : 
+```java
+for partition in [list of partitions in source dataset] {
         val inputDF = spark.read.format("any_input_format").load("partition_path")
         inputDF.write.format("org.apache.hudi").option()....save("basePath")
-        }      
+}
+```      
 
 #### Option 3
 Write your own custom logic of how to load an existing dataset into a Hudi managed one. Please read about the RDD API
  [here](quickstart.html).
 
-```
+```Java
 Using the HDFSParquetImporter Tool. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
 fired up via `cd hudi-cli && ./hudi-cli.sh`.
 
diff --git a/docs/migration_guide.md b/docs/migration_guide.md
index 6f3ed59..75b65ae 100644
--- a/docs/migration_guide.md
+++ b/docs/migration_guide.md
@@ -42,19 +42,22 @@ Use the HDFSParquetImporter tool. As the name suggests, this only works if your
 This tool essentially starts a Spark Job to read the existing parquet dataset and converts it into a HUDI managed dataset by re-writing all the data.
 
 #### Option 2
-For huge datasets, this could be as simple as : for partition in [list of partitions in source dataset] {
+For huge datasets, this could be as simple as : 
+```java
+for partition in [list of partitions in source dataset] {
         val inputDF = spark.read.format("any_input_format").load("partition_path")
         inputDF.write.format("org.apache.hudi").option()....save("basePath")
-        }      
+}
+```  
 
 #### Option 3
 Write your own custom logic of how to load an existing dataset into a Hudi managed one. Please read about the RDD API
- [here](quickstart.html).
-
-```
-Using the HDFSParquetImporter Tool. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
+ [here](quickstart.html). Using the HDFSParquetImporter Tool. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
 fired up via `cd hudi-cli && ./hudi-cli.sh`.
 
+```Java
+
+
 hudi->hdfsparquetimport
         --upsert false
         --srcPath /user/parquet/dataset/basepath
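
Note: Option 2 in the migration guide hunks above is only pseudocode. A hypothetical Java sketch of the same per-partition re-write loop is shown below; the paths, field names and option keys are illustrative assumptions, not part of this commit.

```Java
// Hypothetical sketch of "Option 2": re-write one source partition at a time into a
// Hudi-managed dataset. Paths, field names and option values are placeholders.
import java.util.Arrays;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class PartitionByPartitionMigration {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().appName("hudi-migration").getOrCreate();
    // Partitions present in the source dataset (placeholder values)
    List<String> partitions = Arrays.asList("2019/11/01", "2019/11/02");
    for (String partition : partitions) {
      Dataset<Row> inputDF = spark.read().format("parquet").load("/source/table/" + partition);
      inputDF.write().format("org.apache.hudi")
          .option("hoodie.datasource.write.recordkey.field", "uuid")              // record key column
          .option("hoodie.datasource.write.partitionpath.field", "partitionpath") // partition path column
          .option("hoodie.datasource.write.precombine.field", "ts")               // field used to pick latest record on upsert
          .option("hoodie.table.name", "target_table")
          .mode(SaveMode.Append)
          .save("/target/hudi_table");
    }
  }
}
```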
diff --git a/docs/querying_data.cn.md b/docs/querying_data.cn.md
index c690385..6d12f3a 100644
--- a/docs/querying_data.cn.md
+++ b/docs/querying_data.cn.md
@@ -92,13 +92,13 @@ Spark可将Hudi jars和捆绑包轻松部署和管理到作业/笔记本中。
 要使用SparkSQL将RO表读取为Hive表,只需按如下所示将路径过滤器推入sparkContext。
 对于Hudi表,该方法保留了Spark内置的读取Parquet文件的优化功能,例如进行矢量化读取。
 
-```
+```Scala
 spark.sparkContext.hadoopConfiguration.setClass("mapreduce.input.pathFilter.class", classOf[org.apache.hudi.hadoop.HoodieROTablePathFilter], classOf[org.apache.hadoop.fs.PathFilter]);
 ```
 
 如果您希望通过数据源在DFS上使用全局路径,则只需执行以下类似操作即可得到Spark数据帧。
 
-```
+```Scala
 Dataset<Row> hoodieROViewDF = spark.read().format("org.apache.hudi")
 // pass any path glob, can include hudi & non-hudi datasets
 .load("/glob/path/pattern");
@@ -108,7 +108,7 @@ Dataset<Row> hoodieROViewDF = spark.read().format("org.apache.hudi")
 当前,实时表只能在Spark中作为Hive表进行查询。为了做到这一点,设置`spark.sql.hive.convertMetastoreParquet = false`,
 迫使Spark回退到使用Hive Serde读取数据(计划/执行仍然是Spark)。
 
-```
+```Scala
 $ spark-shell --jars hudi-spark-bundle-x.y.z-SNAPSHOT.jar --driver-class-path /etc/hive/conf  --packages com.databricks:spark-avro_2.11:4.0.0 --conf spark.sql.hive.convertMetastoreParquet=false --num-executors 10 --driver-memory 7g --executor-memory 2g  --master yarn-client
 
 scala> sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'").show()
@@ -118,7 +118,7 @@ scala> sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'
 `hudi-spark`模块提供了DataSource API,这是一种从Hudi数据集中提取数据并通过Spark处理数据的更优雅的方法。
 如下所示是一个示例增量拉取,它将获取自`beginInstantTime`以来写入的所有记录。
 
-```
+```Java
  Dataset<Row> hoodieIncViewDF = spark.read()
      .format("org.apache.hudi")
      .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY(),
diff --git a/docs/querying_data.md b/docs/querying_data.md
index 1653b08..91836ac 100644
--- a/docs/querying_data.md
+++ b/docs/querying_data.md
@@ -92,13 +92,13 @@ Spark provides much easier deployment & management of Hudi jars and bundles into
 To read the RO table as a Hive table using SparkSQL, simply push a path filter into sparkContext as follows. 
 This method retains Spark built-in optimizations for reading Parquet files like vectorized reading on Hudi tables.
 
-```
+```Scala
 spark.sparkContext.hadoopConfiguration.setClass("mapreduce.input.pathFilter.class", classOf[org.apache.hudi.hadoop.HoodieROTablePathFilter], classOf[org.apache.hadoop.fs.PathFilter]);
 ```
 
 If you prefer to glob paths on DFS via the datasource, you can simply do something like below to get a Spark dataframe to work with. 
 
-```
+```Java
 Dataset<Row> hoodieROViewDF = spark.read().format("org.apache.hudi")
 // pass any path glob, can include hudi & non-hudi datasets
 .load("/glob/path/pattern");
@@ -108,7 +108,7 @@ Dataset<Row> hoodieROViewDF = spark.read().format("org.apache.hudi")
 Currently, the realtime table can only be queried as a Hive table in Spark. In order to do this, set `spark.sql.hive.convertMetastoreParquet=false`, forcing Spark to fall back 
 to using the Hive Serde to read the data (planning/execution is still Spark). 
 
-```
+```Java
 $ spark-shell --jars hudi-spark-bundle-x.y.z-SNAPSHOT.jar --driver-class-path /etc/hive/conf  --packages com.databricks:spark-avro_2.11:4.0.0 --conf spark.sql.hive.convertMetastoreParquet=false --num-executors 10 --driver-memory 7g --executor-memory 2g  --master yarn-client
 
 scala> sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'").show()
@@ -118,7 +118,7 @@ scala> sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'
 The `hudi-spark` module offers the DataSource API, a more elegant way to pull data from a Hudi dataset and process it via Spark.
 A sample incremental pull that will obtain all records written since `beginInstantTime` looks like below.
 
-```
+```Java
  Dataset<Row> hoodieIncViewDF = spark.read()
      .format("org.apache.hudi")
      .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY(),
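
Note: the incremental-pull snippet in the hunk above is truncated at the view-type option. A hedged sketch of a complete incremental read through the DataSource API is shown below; the option constants and the begin-instant value are assumptions based on the 0.5.0-era API, not part of this commit.

```Java
// Sketch of an incremental pull; the option constants are assumed from the 0.5.0-era
// DataSource API and the begin instant time is a placeholder commit timestamp.
import org.apache.hudi.DataSourceReadOptions;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class IncrementalPullExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().appName("hudi-incremental-pull").getOrCreate();
    Dataset<Row> hoodieIncViewDF = spark.read()
        .format("org.apache.hudi")
        .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY(),
                DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL())
        .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(),
                "20191114000000")        // pull records written after this (placeholder) instant
        .load("/path/to/hudi/dataset");  // base path of the dataset (not a glob) for incremental view
    hoodieIncViewDF.createOrReplaceTempView("hudi_incr");
    spark.sql("select count(*) from hudi_incr").show();
  }
}
```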
diff --git a/docs/quickstart.cn.md b/docs/quickstart.cn.md
index 410dd24..e614d57 100644
--- a/docs/quickstart.cn.md
+++ b/docs/quickstart.cn.md
@@ -10,22 +10,17 @@ permalink: quickstart.html
 本指南通过使用spark-shell简要介绍了Hudi功能。使用Spark数据源,我们将通过代码段展示如何插入和更新的Hudi默认存储类型数据集:
 [写时复制](https://hudi.apache.org/concepts.html#copy-on-write-storage)。每次写操作之后,我们还将展示如何读取快照和增量读取数据。
 
-**注意:**
-您也可以通过[自己构建hudi](https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi)来快速入门,
-并在spark-shell命令中使用`--jars <path to hudi_code>/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar`,
-而不是`--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating`
-
 ## 设置spark-shell
 Hudi适用于Spark-2.x版本。您可以按照[此处](https://spark.apache.org/downloads.html)的说明设置spark。
 在提取的目录中,使用spark-shell运行Hudi:
 
-```
+```Scala
 bin/spark-shell --packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
 ```
 
 设置表名、基本路径和数据生成器来为本指南生成记录。
 
-```
+```Java
 import org.apache.hudi.QuickstartUtils._
 import scala.collection.JavaConversions._
 import org.apache.spark.sql.SaveMode._
@@ -45,7 +40,7 @@ val dataGen = new DataGenerator
 ## 插入数据 {#inserts}
 生成一些新的行程样本,将其加载到DataFrame中,然后将DataFrame写入Hudi数据集中,如下所示。
 
-```
+```Java
 val inserts = convertToStringList(dataGen.generateInserts(10))
 val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
 df.write.format("org.apache.hudi").
@@ -71,7 +66,7 @@ df.write.format("org.apache.hudi").
 
 将数据文件加载到数据帧中。
 
-```
+```Java
 val roViewDF = spark.
     read.
     format("org.apache.hudi").
@@ -89,7 +84,7 @@ spark.sql("select _hoodie_commit_time, _hoodie_record_key, _hoodie_partition_pat
 
 这类似于插入新数据。使用数据生成器生成对现有行程的更新,加载到数据帧并将数据帧写入hudi数据集。
 
-```
+```Java
 val updates = convertToStringList(dataGen.generateUpdates(10))
 val df = spark.read.json(spark.sparkContext.parallelize(updates, 2));
 df.write.format("org.apache.hudi").
@@ -112,7 +107,7 @@ Hudi还提供了获取给定提交时间戳以来已更改的记录流的功能
 这可以通过使用Hudi的增量视图并提供所需更改的开始时间来实现。
 如果我们需要给定提交之后的所有更改(这是常见的情况),则无需指定结束时间。
 
-```
+```Java
 val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime").map(k => k.getString(0)).take(50)
 val beginTime = commits(commits.length - 2) // commit time we are interested in
 
@@ -133,7 +128,7 @@ spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hu
 
 让我们看一下如何查询特定时间的数据。可以通过将结束时间指向特定的提交时间,将开始时间指向"000"(表示最早的提交时间)来表示特定时间。
 
-```
+```Java
 val beginTime = "000" // Represents all commits > this time.
 val endTime = commits(commits.length - 2) // commit time we are interested in
 
@@ -149,6 +144,11 @@ spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hu
 
 ## 从这开始下一步?
 
+您也可以通过[自己构建hudi](https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi)来快速入门,
+并在spark-shell命令中使用`--jars <path to hudi_code>/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar`,
+而不是`--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating`
+
+
 这里我们使用Spark演示了Hudi的功能。但是,Hudi可以支持多种存储类型/视图,并且可以从Hive,Spark,Presto等查询引擎中查询Hudi数据集。
 我们制作了一个基于Docker设置、所有依赖系统都在本地运行的[演示视频](https://www.youtube.com/watch?v=VhNgUsxdrD0),
 我们建议您复制相同的设置然后按照[这里](docker_demo.html)的步骤自己运行这个演示。
diff --git a/docs/quickstart.md b/docs/quickstart.md
index 121009e..3a17b83 100644
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -12,24 +12,19 @@ code snippets that allows you to insert and update a Hudi dataset of default sto
 [Copy on Write](https://hudi.apache.org/concepts.html#copy-on-write-storage). 
 After each write operation we will also show how to read the data, both as a snapshot and incrementally.
 
-**NOTE:**
-You can also do the quickstart by [building hudi yourself](https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi), 
-and using `--jars <path to hudi_code>/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar` in the spark-shell command
-instead of `--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating`
-
 ## Setup spark-shell
 Hudi works with Spark-2.x versions. You can follow instructions [here](https://spark.apache.org/downloads.html) for 
 setting up spark. 
 
 From the extracted directory run spark-shell with Hudi as:
 
-```
+```Scala
 bin/spark-shell --packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
 ```
 
 Setup table name, base path and a data generator to generate records for this guide.
 
-```
+```Scala
 import org.apache.hudi.QuickstartUtils._
 import scala.collection.JavaConversions._
 import org.apache.spark.sql.SaveMode._
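The hunk above shows only the imports; the rest of that setup snippet falls outside the diff context. For reference, a minimal sketch of the missing lines, assuming the `tableName`/`basePath`/`dataGen` names used later in this guide (the concrete path is just a placeholder):

```Scala
import org.apache.hudi.QuickstartUtils._

val tableName = "hudi_cow_table"              // any table name works for this guide
val basePath = "file:///tmp/hudi_cow_table"   // placeholder local path; any DFS path works
val dataGen = new DataGenerator               // generates the sample trip records used below
```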
@@ -50,7 +45,7 @@ can generate sample inserts and updates based on the sample trip schema
 ## Insert data {#inserts}
 Generate some new trips, load them into a DataFrame and write the DataFrame into the Hudi dataset as below.
 
-```
+```Scala
 val inserts = convertToStringList(dataGen.generateInserts(10))
 val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
 df.write.format("org.apache.hudi").
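The insert call is truncated at the context boundary above; the full snippet chains the quickstart write configs plus the record key, partition path, precombine and table name options before saving. A sketch of what the complete block looks like, assuming `DataSourceWriteOptions._` and `HoodieWriteConfig._` are imported as in the full quickstart:

```Scala
df.write.format("org.apache.hudi").
  options(getQuickstartWriteConfigs).                    // small-batch friendly defaults from QuickstartUtils
  option(PRECOMBINE_FIELD_OPT_KEY, "ts").                // field used to pick the latest among duplicate keys
  option(RECORDKEY_FIELD_OPT_KEY, "uuid").               // record key field in the sample schema
  option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").  // partition path field in the sample schema
  option(TABLE_NAME, tableName).
  mode(Overwrite).                                       // first write, so overwrite the base path
  save(basePath)
```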
@@ -75,7 +70,7 @@ Here we are using the default write operation : `upsert`. If you have a workload
  
 ## Query data {#query}
 Load the data files into a DataFrame.
-```
+```Scala
 val roViewDF = spark.
     read.
     format("org.apache.hudi").
@@ -92,7 +87,7 @@ Refer to [Storage Types and Views](https://hudi.apache.org/concepts.html#storage
 This is similar to inserting new data. Generate updates to existing trips using the data generator, load into a DataFrame 
 and write DataFrame into the hudi dataset.
 
-```
+```Scala
 val updates = convertToStringList(dataGen.generateUpdates(10))
 val df = spark.read.json(spark.sparkContext.parallelize(updates, 2));
 df.write.format("org.apache.hudi").
@@ -115,7 +110,7 @@ Hudi also provides capability to obtain a stream of records that changed since g
 This can be achieved using Hudi's incremental view and providing a begin time from which changes need to be streamed. 
 We do not need to specify endTime, if we want all changes after the given commit (as is the common case). 
 
-```
+```Scala
 val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime").map(k => k.getString(0)).take(50)
 val beginTime = commits(commits.length - 2) // commit time we are interested in
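The incremental read that follows these two lines is also outside the diff context; it points the datasource at the incremental view and passes the chosen begin instant. A sketch, assuming the `DataSourceReadOptions._` constants from the 0.5.0 release:

```Scala
val incViewDF = spark.
    read.
    format("org.apache.hudi").
    option(VIEW_TYPE_OPT_KEY, VIEW_TYPE_INCREMENTAL_OPT_VAL).  // switch from read-optimized to incremental view
    option(BEGIN_INSTANTTIME_OPT_KEY, beginTime).              // only records committed after this instant
    load(basePath)                                             // incremental view takes the base path, not a glob
incViewDF.registerTempTable("hudi_incr_table")
spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from hudi_incr_table where fare > 20.0").show()
```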
 
@@ -136,7 +131,7 @@ feature is that it now lets you author streaming pipelines on batch data.
 Let's look at how to query data as of a specific time. The specific time can be represented by pointing endTime to a 
 specific commit time and beginTime to "000" (denoting earliest possible commit time). 
 
-```
+```Scala
 val beginTime = "000" // Represents all commits > this time.
 val endTime = commits(commits.length - 2) // commit time we are interested in
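For the point-in-time variant, the only change over the incremental read above is an end instant, which bounds the pull to a commit window; a short sketch under the same assumptions:

```Scala
val pointInTimeDF = spark.
    read.
    format("org.apache.hudi").
    option(VIEW_TYPE_OPT_KEY, VIEW_TYPE_INCREMENTAL_OPT_VAL).
    option(BEGIN_INSTANTTIME_OPT_KEY, beginTime).   // "000" here, i.e. from the earliest commit
    option(END_INSTANTTIME_OPT_KEY, endTime).       // stop at the commit we are interested in
    load(basePath)
```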
 
@@ -151,7 +146,11 @@ spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hu
 ``` 
 
 ## Where to go from here?
-Here, we used Spark to show case the capabilities of Hudi. However, Hudi can support multiple storage types/views and 
+You can also do the quickstart by [building hudi yourself](https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi), 
+and using `--jars <path to hudi_code>/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar` in the spark-shell command above
+instead of `--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating`
+
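Concretely, that swap just replaces the `--packages` coordinate in the spark-shell command from the setup section with a `--jars` flag pointing at the locally built bundle; a sketch, with the jar path left as the placeholder used above:

```Scala
bin/spark-shell \
  --jars <path to hudi_code>/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
```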
+Also, we used Spark here to showcase the capabilities of Hudi. However, Hudi can support multiple storage types/views and 
 Hudi datasets can be queried from query engines like Hive, Spark, Presto and much more. We have put together a 
 [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases all of this on a docker based setup with all 
 dependent systems running locally. We recommend you replicate the same setup and run the demo yourself, by following 
diff --git a/docs/s3_filesystem.cn.md b/docs/s3_filesystem.cn.md
index fe9a442..f662bda 100644
--- a/docs/s3_filesystem.cn.md
+++ b/docs/s3_filesystem.cn.md
@@ -21,7 +21,7 @@ Simplest way to use Hudi with S3, is to configure your `SparkSession` or `SparkC
 
 Alternatively, add the required configs in your core-site.xml from where Hudi can fetch them. Replace the `fs.defaultFS` with your S3 bucket name and Hudi should be able to read/write from the bucket.
 
-```
+```xml
   <property>
       <name>fs.defaultFS</name>
       <value>s3://ysharma</value>
@@ -57,7 +57,7 @@ Alternatively, add the required configs in your core-site.xml from where Hudi ca
 Utilities such as hudi-cli or the deltastreamer tool can pick up S3 credentials via environment variables prefixed with `HOODIE_ENV_`. For example, below is a bash snippet to set up
 such variables and then have the CLI work on datasets stored in S3.
 
-```
+```Java
 export HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key=$accessKey
 export HOODIE_ENV_fs_DOT_s3a_DOT_secret_DOT_key=$secretKey
 export HOODIE_ENV_fs_DOT_s3_DOT_awsAccessKeyId=$accessKey
diff --git a/docs/s3_filesystem.md b/docs/s3_filesystem.md
index fe9a442..f662bda 100644
--- a/docs/s3_filesystem.md
+++ b/docs/s3_filesystem.md
@@ -21,7 +21,7 @@ Simplest way to use Hudi with S3, is to configure your `SparkSession` or `SparkC
 
 Alternatively, add the required configs in your core-site.xml from where Hudi can fetch them. Replace the `fs.defaultFS` with your S3 bucket name and Hudi should be able to read/write from the bucket.
 
-```
+```xml
   <property>
       <name>fs.defaultFS</name>
       <value>s3://ysharma</value>
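The hunk header above calls configuring the `SparkSession`/`SparkContext` with S3 credentials the simplest route; for reference, a minimal spark-shell sketch of that approach, assuming the standard Hadoop S3A keys and credentials supplied via environment variables (the variable names are placeholders):

```Scala
val accessKey = sys.env("AWS_ACCESS_KEY_ID")     // placeholder source for the credentials
val secretKey = sys.env("AWS_SECRET_ACCESS_KEY")

spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", accessKey)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", secretKey)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
```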
@@ -57,7 +57,7 @@ Alternatively, add the required configs in your core-site.xml from where Hudi ca
 Utilities such as hudi-cli or the deltastreamer tool can pick up S3 credentials via environment variables prefixed with `HOODIE_ENV_`. For example, below is a bash snippet to set up
 such variables and then have the CLI work on datasets stored in S3.
 
-```
+```Java
 export HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key=$accessKey
 export HOODIE_ENV_fs_DOT_s3a_DOT_secret_DOT_key=$secretKey
 export HOODIE_ENV_fs_DOT_s3_DOT_awsAccessKeyId=$accessKey
diff --git a/docs/writing_data.cn.md b/docs/writing_data.cn.md
index 58b6c99..bd7f646 100644
--- a/docs/writing_data.cn.md
+++ b/docs/writing_data.cn.md
@@ -39,7 +39,7 @@ summary: 这一页里,我们将讨论一些可用的工具,这些工具可
 
 命令行选项更详细地描述了这些功能:
 
-```
+```Java
 [hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` --help
 Usage: <main class> [options]
   Options:
@@ -118,13 +118,13 @@ Usage: <main class> [options]
 ([impressions.avro](https://docs.confluent.io/current/ksql/docs/tutorials/generate-custom-test-data.html),
 由schema-registry代码库提供)
 
-```
+```Java
 [confluent-5.0.0]$ bin/ksql-datagen schema=../impressions.avro format=avro topic=impressions key=impressionid
 ```
 
 然后用如下命令摄取这些数据。
 
-```
+```Java
 [hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` \
   --props file://${PWD}/hudi-utilities/src/test/resources/delta-streamer-config/kafka-source.properties \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
@@ -142,7 +142,7 @@ Usage: <main class> [options]
 以下是在指定需要使用的字段名称的之后,如何插入更新数据帧的方法,这些字段包括
 `recordKey => _row_key`、`partitionPath => partition`和`precombineKey => timestamp`
 
-```
+```Java
 inputDF.write()
        .format("org.apache.hudi")
        .options(clientOpts) // 可以传入任何Hudi客户端参数
@@ -160,7 +160,7 @@ inputDF.write()
 如果需要从命令行或在独立的JVM中运行它,Hudi提供了一个`HiveSyncTool`,
 在构建了hudi-hive模块之后,可以按以下方式调用它。
 
-```
+```Java
 cd hudi-hive
 ./run_sync_tool.sh
  [hudi-hive]$ ./run_sync_tool.sh --help
@@ -192,7 +192,7 @@ Usage: <main class> [options]
  这可以通过触发一个带有自定义负载实现的插入更新来实现,这种实现可以使用总是返回Optional.Empty作为组合值的DataSource或DeltaStreamer。 
  Hudi附带了一个内置的`org.apache.hudi.EmptyHoodieRecordPayload`类,它就是实现了这一功能。
  
-```
+```Java
  deleteDF // 仅包含要删除的记录的数据帧
    .write().format("org.apache.hudi")
    .option(...) // 根据设置需要添加HUDI参数,例如记录键、分区路径和其他参数
diff --git a/docs/writing_data.md b/docs/writing_data.md
index 37bc0c9..5199382 100644
--- a/docs/writing_data.md
+++ b/docs/writing_data.md
@@ -40,7 +40,7 @@ The `HoodieDeltaStreamer` utility (part of hudi-utilities-bundle) provides the w
 
 Command line options describe capabilities in more detail
 
-```
+```Java
 [hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` --help
 Usage: <main class> [options]
   Options:
@@ -117,13 +117,13 @@ provided under `hudi-utilities/src/test/resources/delta-streamer-config`.
 
 For example: once you have Confluent Kafka and the Schema Registry up & running, produce some test data using [impressions.avro](https://docs.confluent.io/current/ksql/docs/tutorials/generate-custom-test-data.html) (provided by the schema-registry repo)
 
-```
+```Java
 [confluent-5.0.0]$ bin/ksql-datagen schema=../impressions.avro format=avro topic=impressions key=impressionid
 ```
 
 and then ingest it as follows.
 
-```
+```Java
 [hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` \
   --props file://${PWD}/hudi-utilities/src/test/resources/delta-streamer-config/kafka-source.properties \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
@@ -142,7 +142,7 @@ Following is how we can upsert a dataframe, while specifying the field names tha
 for `recordKey => _row_key`, `partitionPath => partition` and `precombineKey => timestamp`
 
 
-```
+```Java
 inputDF.write()
        .format("org.apache.hudi")
        .options(clientOpts) // any of the Hudi client opts can be passed in as well
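The upsert snippet above is truncated at the context boundary; the remaining options wire in the three field names called out in the preceding paragraph plus a table name before an `Append` save. A sketch in spark-shell Scala (the original snippet is Java-style), assuming the 0.5.0 `DataSourceWriteOptions`/`HoodieWriteConfig` constants and treating `inputDF`, `clientOpts`, `tableName` and `basePath` as placeholders:

```Scala
import org.apache.hudi.DataSourceWriteOptions
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.spark.sql.SaveMode

inputDF.write.format("org.apache.hudi").
  options(clientOpts).                                                     // any Hudi client opts can be passed in
  option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "_row_key").
  option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "partition").
  option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "timestamp").
  option(HoodieWriteConfig.TABLE_NAME, tableName).
  mode(SaveMode.Append).
  save(basePath)
```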
@@ -160,7 +160,7 @@ Both tools above support syncing of the dataset's latest schema to Hive metastor
 In case it's preferable to run this from the command line or in an independent JVM, Hudi provides a `HiveSyncTool`, which can be invoked as below, 
 once you have built the hudi-hive module.
 
-```
+```Java
 cd hudi-hive
 ./run_sync_tool.sh
  [hudi-hive]$ ./run_sync_tool.sh --help
@@ -193,7 +193,7 @@ Hudi supports implementing two types of deletes on data stored in Hudi datasets,
  - **Hard Deletes** : A stronger form of delete is to physically remove any trace of the record from the dataset. This can be achieved by issuing an upsert with a custom payload implementation
  via either DataSource or DeltaStreamer which always returns Optional.Empty as the combined value. Hudi ships with a built-in `org.apache.hudi.EmptyHoodieRecordPayload` class that does exactly this.
  
-```
+```Java
  deleteDF // dataframe containing just records to be deleted
    .write().format("org.apache.hudi")
    .option(...) // Add HUDI options like record-key, partition-path and others as needed for your setup
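The hard-delete snippet likewise stops at the option placeholder; the piece that turns an upsert into a delete is pointing the payload class at `EmptyHoodieRecordPayload`, as the bullet above describes. A sketch under the same assumptions as the upsert example (again spark-shell Scala, with `deleteDF` and the field names as placeholders):

```Scala
import org.apache.hudi.DataSourceWriteOptions
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.spark.sql.SaveMode

deleteDF.write.format("org.apache.hudi").                                  // dataframe containing just records to be deleted
  option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "_row_key").
  option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "partition").
  option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "timestamp").
  option(HoodieWriteConfig.TABLE_NAME, tableName).
  // combining with EmptyHoodieRecordPayload always yields Optional.empty, so matching records are removed
  option(DataSourceWriteOptions.PAYLOAD_CLASS_OPT_KEY, "org.apache.hudi.EmptyHoodieRecordPayload").
  mode(SaveMode.Append).
  save(basePath)
```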


[incubator-hudi] 02/02: [DOCS] Updating site with latest doc changes

Posted by bh...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit 98591ddde9a8e6e3351889b0dcf521e82c5983c9
Author: vinothchandar <vi...@apache.org>
AuthorDate: Thu Nov 14 06:38:25 2019 -0800

    [DOCS] Updating site with latest doc changes
---
 content/404.html                                   |   4 +-
 content/admin_guide.html                           |  80 ++--
 content/cn/404.html                                |   2 +-
 content/cn/admin_guide.html                        |  78 ++--
 content/cn/community.html                          |   2 +-
 content/cn/comparison.html                         |   2 +-
 content/cn/concepts.html                           | 242 +++++-----
 content/cn/configurations.html                     | 486 +++++++++++----------
 content/cn/contributing.html                       |   2 +-
 content/cn/docker_demo.html                        |  98 ++---
 content/cn/events/2016-12-30-strata-talk-2017.html |  10 +-
 content/cn/events/2019-01-18-asf-incubation.html   |  10 +-
 content/cn/gcs_hoodie.html                         |   2 +-
 content/cn/index.html                              |  10 +-
 content/cn/migration_guide.html                    |  16 +-
 content/cn/news.html                               |  29 +-
 content/cn/news_archive.html                       |   2 +-
 content/cn/performance.html                        |  51 +--
 content/cn/powered_by.html                         |   2 +-
 content/cn/privacy.html                            |   2 +-
 content/cn/querying_data.html                      | 186 ++++----
 content/cn/quickstart.html                         |  54 +--
 content/cn/s3_hoodie.html                          |  64 +--
 content/cn/use_cases.html                          |   2 +-
 content/cn/writing_data.html                       |  26 +-
 content/community.html                             |   4 +-
 content/comparison.html                            |   4 +-
 content/concepts.html                              |   4 +-
 content/configurations.html                        |  12 +-
 content/contributing.html                          |   4 +-
 content/css/lavish-bootstrap.css                   |   7 +-
 content/docker_demo.html                           | 130 +++---
 content/events/2016-12-30-strata-talk-2017.html    |  12 +-
 content/events/2019-01-18-asf-incubation.html      |  12 +-
 content/feed.xml                                   |   8 +-
 content/gcs_hoodie.html                            |   4 +-
 content/index.html                                 |  12 +-
 content/js/mydoc_scroll.html                       |  10 +-
 content/migration_guide.html                       |  22 +-
 content/news.html                                  |  29 +-
 content/news_archive.html                          |   5 +-
 content/performance.html                           |   4 +-
 content/powered_by.html                            |   4 +-
 content/privacy.html                               |   4 +-
 content/querying_data.html                         |  20 +-
 content/quickstart.html                            |  60 +--
 content/releases.html                              |   2 +-
 content/s3_hoodie.html                             |  66 +--
 content/search.json                                |  58 ++-
 content/sitemap.xml                                |  92 ++--
 content/use_cases.html                             |   4 +-
 content/writing_data.html                          |  28 +-
 52 files changed, 993 insertions(+), 1090 deletions(-)

diff --git a/content/404.html b/content/404.html
index 98b31a0..075eb4b 100644
--- a/content/404.html
+++ b/content/404.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/admin_guide.html b/content/admin_guide.html
index 6e57366..bd55f28 100644
--- a/content/admin_guide.html
+++ b/content/admin_guide.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -359,7 +359,7 @@ Hudi library effectively manages this dataset internally, using .hoodie subfolde
 
 <p>To initialize a hudi table, use the following command.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
+<pre><code class="language-Java">18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
 ============================================
 *                                          *
 *     _    _           _   _               *
@@ -375,11 +375,11 @@ Welcome to Hoodie CLI. Please type help if you are looking for help.
 hudi-&gt;create --path /user/hive/warehouse/table1 --tableName hoodie_table_1 --tableType COPY_ON_WRITE
 .....
 18/09/06 15:57:15 INFO table.HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE from ...
-</code></pre></div></div>
+</code></pre>
 
 <p>To see the description of hudi table, use the command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:hoodie_table_1-&gt;desc
+<pre><code class="language-Java">hoodie:hoodie_table_1-&gt;desc
 18/09/06 15:57:19 INFO timeline.HoodieActiveTimeline: Loaded instants []
     _________________________________________________________
     | Property                | Value                        |
@@ -390,23 +390,23 @@ hudi-&gt;create --path /user/hive/warehouse/table1 --tableName hoodie_table_1 --
     | hoodie.table.name       | hoodie_table_1               |
     | hoodie.table.type       | COPY_ON_WRITE                |
     | hoodie.archivelog.folder|                              |
-</code></pre></div></div>
+</code></pre>
 
 <p>Following is a sample command to connect to a Hudi dataset containing uber trips.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;connect --path /app/uber/trips
+<pre><code class="language-Java">hoodie:trips-&gt;connect --path /app/uber/trips
 
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: Attempting to load the commits under /app/uber/trips/.hoodie with suffix .commit
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: Attempting to load the commits under /app/uber/trips/.hoodie with suffix .inflight
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: All commits :HoodieCommits{commitList=[20161002045850, 20161002052915, 20161002055918, 20161002065317, 20161002075932, 20161002082904, 20161002085949, 20161002092936, 20161002105903, 20161002112938, 20161002123005, 20161002133002, 20161002155940, 20161002165924, 20161002172907, 20161002175905, 20161002190016, 20161002192954, 20161002195925, 20161002205935, 20161002215928, 20161002222938, 20161002225915, 20161002232906, 20161003003028, 201 [...]
 Metadata for table trips loaded
 hoodie:trips-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <p>Once connected to the dataset, a lot of other commands become available. The shell has contextual autocomplete help (press TAB) and below is a list of all commands, a few of which
 are reviewed in this section</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;help
+<pre><code class="language-Java">hoodie:trips-&gt;help
 * ! - Allows execution of operating system (OS) commands
 * // - Inline comment markers (start of line only)
 * ; - Inline comment markers (start of line only)
@@ -435,7 +435,7 @@ are reviewed</p>
 * version - Displays shell version
 
 hoodie:trips-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="inspecting-commits">Inspecting Commits</h4>
 
@@ -444,7 +444,7 @@ Each commit has a monotonically increasing string/number called the <strong>comm
 
 <p>To view some basic information about the last 10 commits,</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;commits show --sortBy "Total Bytes Written" --desc true --limit 10
+<pre><code class="language-Java">hoodie:trips-&gt;commits show --sortBy "Total Bytes Written" --desc true --limit 10
     ________________________________________________________________________________________________________________________________________________________________________
     | CommitTime    | Total Bytes Written| Total Files Added| Total Files Updated| Total Partitions Written| Total Records Written| Total Update Records Written| Total Errors|
     |=======================================================================================================================================================================|
@@ -452,42 +452,42 @@ Each commit has a monotonically increasing string/number called the <strong>comm
     ....
     ....
 hoodie:trips-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <p>At the start of each write, Hudi also writes a .inflight commit to the .hoodie folder. You can use the timestamp there to estimate how long the commit has been inflight</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
+<pre><code class="language-Java">$ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
 -rw-r--r--   3 vinoth supergroup     321984 2016-10-05 23:18 /app/uber/trips/.hoodie/20161005225920.inflight
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="drilling-down-to-a-specific-commit">Drilling Down to a specific Commit</h4>
 
 <p>To understand how the writes spread across specific partitions,</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;commit showpartitions --commit 20161005165855 --sortBy "Total Bytes Written" --desc true --limit 10
+<pre><code class="language-Java">hoodie:trips-&gt;commit showpartitions --commit 20161005165855 --sortBy "Total Bytes Written" --desc true --limit 10
     __________________________________________________________________________________________________________________________________________
     | Partition Path| Total Files Added| Total Files Updated| Total Records Inserted| Total Records Updated| Total Bytes Written| Total Errors|
     |=========================================================================================================================================|
      ....
      ....
-</code></pre></div></div>
+</code></pre>
 
 <p>If you need file-level granularity, we can do the following</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;commit showfiles --commit 20161005165855 --sortBy "Partition Path"
+<pre><code class="language-Java">hoodie:trips-&gt;commit showfiles --commit 20161005165855 --sortBy "Partition Path"
     ________________________________________________________________________________________________________________________________________________________
     | Partition Path| File ID                             | Previous Commit| Total Records Updated| Total Records Written| Total Bytes Written| Total Errors|
     |=======================================================================================================================================================|
     ....
     ....
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="filesystem-view">FileSystem View</h4>
 
 <p>Hudi views each partition as a collection of file-groups with each file-group containing a list of file-slices in commit
 order (See Concepts). The below commands allow users to view the file-slices for a data-set.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> hoodie:stock_ticks_mor-&gt;show fsview all
+<pre><code class="language-Java"> hoodie:stock_ticks_mor-&gt;show fsview all
  ....
   _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
  | Partition | FileId | Base-Instant | Data-File | Data-File Size| Num Delta Files| Total Delta File Size| Delta Files |
@@ -504,30 +504,30 @@ order (See Concepts). The below commands allow users to view the file-slices for
  | 2018/08/31| 111415c3-f26d-4639-86c8-f9956f245ac3| 20181002180759| hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/2018/08/31/111415c3-f26d-4639-86c8-f9956f245ac3_0_20181002180759.parquet| 432.5 KB | 1 | 20.8 KB | 20.8 KB | 0.0 B | 0.0 B | 0.0 B | [HoodieLogFile {hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/2018/08/31/.111415c3-f26d-4639-86c8-f9956f245ac3_20181002180759.log.1}]| [] |
 
  hoodie:stock_ticks_mor-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="statistics">Statistics</h4>
 
 <p>Since Hudi directly manages file sizes for datasets on DFS, it might be good to get an overall picture</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc true --limit 10
+<pre><code class="language-Java">hoodie:trips-&gt;stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc true --limit 10
     ________________________________________________________________________________________________
     | CommitTime    | Min     | 10th    | 50th    | avg     | 95th    | Max     | NumFiles| StdDev  |
     |===============================================================================================|
     | &lt;COMMIT_ID&gt;   | 93.9 MB | 93.9 MB | 93.9 MB | 93.9 MB | 93.9 MB | 93.9 MB | 2       | 2.3 KB  |
     ....
     ....
-</code></pre></div></div>
+</code></pre>
 
 <p>If a Hudi write is taking much longer than usual, it might be good to check the write amplification for any sudden increases</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;stats wa
+<pre><code class="language-Java">hoodie:trips-&gt;stats wa
     __________________________________________________________________________
     | CommitTime    | Total Upserted| Total Written| Write Amplifiation Factor|
     |=========================================================================|
     ....
     ....
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="archived-commits">Archived Commits</h4>
 
@@ -539,28 +539,28 @@ This is a sequence file that contains a mapping from commitNumber =&gt; json wit
 <p>To get an idea of the lag between compaction and writer applications, use the below command to list down all
 pending compactions.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compactions show all
+<pre><code class="language-Java">hoodie:trips-&gt;compactions show all
      ___________________________________________________________________
     | Compaction Instant Time| State    | Total FileIds to be Compacted|
     |==================================================================|
     | &lt;INSTANT_1&gt;            | REQUESTED| 35                           |
     | &lt;INSTANT_2&gt;            | INFLIGHT | 27                           |
-</code></pre></div></div>
+</code></pre>
 
 <p>To inspect a specific compaction plan, use</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compaction show --instant &lt;INSTANT_1&gt;
+<pre><code class="language-Java">hoodie:trips-&gt;compaction show --instant &lt;INSTANT_1&gt;
     _________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
     | Partition Path| File Id | Base Instant  | Data File Path                                    | Total Delta Files| getMetrics                                                                                                                    |
     |================================================================================================================================================================================================================================================
     | 2018/07/17    | &lt;UUID&gt;  | &lt;INSTANT_1&gt;   | viewfs://ns-default/.../../UUID_&lt;INSTANT&gt;.parquet | 1                | {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=1230.0, TOTAL_LOG_FILES_SIZE=2.51255751E8, TOTAL_IO_WRITE_MB=991.0, TOTAL_IO_MB=2221.0}|
 
-</code></pre></div></div>
+</code></pre>
 
 <p>To manually schedule or run a compaction, use the below command. This command uses spark launcher to perform compaction
 operations. NOTE : Make sure no other application is scheduling compaction for this dataset concurrently</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;help compaction schedule
+<pre><code class="language-Java">hoodie:trips-&gt;help compaction schedule
 Keyword:                   compaction schedule
 Description:               Schedule Compaction
  Keyword:                  sparkMemory
@@ -570,9 +570,9 @@ Description:               Schedule Compaction
    Default if unspecified: '1G'
 
 * compaction schedule - Schedule Compaction
-</code></pre></div></div>
+</code></pre>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;help compaction run
+<pre><code class="language-Java">hoodie:trips-&gt;help compaction run
 Keyword:                   compaction run
 Description:               Run Compaction for given instant time
  Keyword:                  tableName
@@ -612,13 +612,13 @@ Description:               Run Compaction for given instant time
    Default if unspecified: '__NULL__'
 
 * compaction run - Run Compaction for given instant time
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="validate-compaction">Validate Compaction</h5>
 
 <p>Validating a compaction plan : Check if all the files necessary for compactions are present and are valid</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222611
+<pre><code class="language-Java">hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222611
 ...
 
    COMPACTION PLAN VALID
@@ -638,7 +638,7 @@ hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222601
     | File Id                             | Base Instant Time| Base Data File                                                                                                                   | Num Delta Files| Valid| Error                                                                           |
     |=====================================================================================================================================================================================================================================================================================================|
     | 05320e98-9a57-4c38-b809-a6beaaeb36bd| 20181005222445   | hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/2018/08/31/05320e98-9a57-4c38-b809-a6beaaeb36bd_0_20181005222445.parquet| 1              | false| All log files specified in compaction operation is not present. Missing ....    |
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="note">NOTE</h5>
 
@@ -650,17 +650,17 @@ so that are preserved. Hudi provides the following CLI to support it</p>
 
 <h5 id="unscheduling-compaction">UnScheduling Compaction</h5>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compaction unscheduleFileId --fileId &lt;FileUUID&gt;
+<pre><code class="language-Java">hoodie:trips-&gt;compaction unscheduleFileId --fileId &lt;FileUUID&gt;
 ....
 No File renames needed to unschedule file from pending compaction. Operation successful.
-</code></pre></div></div>
+</code></pre>
 
 <p>In other cases, an entire compaction plan needs to be reverted. This is supported by the following CLI</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compaction unschedule --compactionInstant &lt;compactionInstant&gt;
+<pre><code class="language-Java">hoodie:trips-&gt;compaction unschedule --compactionInstant &lt;compactionInstant&gt;
 .....
 No File renames needed to unschedule pending compaction. Operation successful.
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="repair-compaction">Repair Compaction</h5>
 
@@ -670,11 +670,11 @@ partial failures, the compaction operation could become inconsistent with the st
 command comes to the rescue, it will rearrange the file-slices so that there is no loss and the file-slices are
 consistent with the compaction plan</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:stock_ticks_mor-&gt;compaction repair --instant 20181005222611
+<pre><code class="language-Java">hoodie:stock_ticks_mor-&gt;compaction repair --instant 20181005222611
 ......
 Compaction successfully repaired
 .....
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="metrics">Metrics</h2>
 
diff --git a/content/cn/404.html b/content/cn/404.html
index 926fc69..5806e32 100644
--- a/content/cn/404.html
+++ b/content/cn/404.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/admin_guide.html b/content/cn/admin_guide.html
index b15add3..d4016b0 100644
--- a/content/cn/admin_guide.html
+++ b/content/cn/admin_guide.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -358,7 +358,7 @@ Hudi库使用.hoodie子文件夹跟踪所有元数据,从而有效地在内部
 
 <p>初始化hudi表,可使用如下命令。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
+<pre><code class="language-Java">18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
 ============================================
 *                                          *
 *     _    _           _   _               *
@@ -374,11 +374,11 @@ Welcome to Hoodie CLI. Please type help if you are looking for help.
 hudi-&gt;create --path /user/hive/warehouse/table1 --tableName hoodie_table_1 --tableType COPY_ON_WRITE
 .....
 18/09/06 15:57:15 INFO table.HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE from ...
-</code></pre></div></div>
+</code></pre>
 
 <p>To see the description of hudi table, use the command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:hoodie_table_1-&gt;desc
+<pre><code class="language-Java">hoodie:hoodie_table_1-&gt;desc
 18/09/06 15:57:19 INFO timeline.HoodieActiveTimeline: Loaded instants []
     _________________________________________________________
     | Property                | Value                        |
@@ -389,22 +389,22 @@ hudi-&gt;create --path /user/hive/warehouse/table1 --tableName hoodie_table_1 --
     | hoodie.table.name       | hoodie_table_1               |
     | hoodie.table.type       | COPY_ON_WRITE                |
     | hoodie.archivelog.folder|                              |
-</code></pre></div></div>
+</code></pre>
 
 <p>以下是连接到包含uber trips的Hudi数据集的示例命令。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;connect --path /app/uber/trips
+<pre><code class="language-Java">hoodie:trips-&gt;connect --path /app/uber/trips
 
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: Attempting to load the commits under /app/uber/trips/.hoodie with suffix .commit
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: Attempting to load the commits under /app/uber/trips/.hoodie with suffix .inflight
 16/10/05 23:20:37 INFO model.HoodieTableMetadata: All commits :HoodieCommits{commitList=[20161002045850, 20161002052915, 20161002055918, 20161002065317, 20161002075932, 20161002082904, 20161002085949, 20161002092936, 20161002105903, 20161002112938, 20161002123005, 20161002133002, 20161002155940, 20161002165924, 20161002172907, 20161002175905, 20161002190016, 20161002192954, 20161002195925, 20161002205935, 20161002215928, 20161002222938, 20161002225915, 20161002232906, 20161003003028, 201 [...]
 Metadata for table trips loaded
 hoodie:trips-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <p>连接到数据集后,便可使用许多其他命令。该shell程序具有上下文自动完成帮助(按TAB键),下面是所有命令的列表,本节中对其中的一些命令进行了详细示例。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;help
+<pre><code class="language-Java">hoodie:trips-&gt;help
 * ! - Allows execution of operating system (OS) commands
 * // - Inline comment markers (start of line only)
 * ; - Inline comment markers (start of line only)
@@ -433,7 +433,7 @@ hoodie:trips-&gt;
 * version - Displays shell version
 
 hoodie:trips-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="检查提交">检查提交</h4>
 
@@ -442,7 +442,7 @@ hoodie:trips-&gt;
 
 <p>查看有关最近10次提交的一些基本信息,</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;commits show --sortBy "Total Bytes Written" --desc true --limit 10
+<pre><code class="language-Java">hoodie:trips-&gt;commits show --sortBy "Total Bytes Written" --desc true --limit 10
     ________________________________________________________________________________________________________________________________________________________________________
     | CommitTime    | Total Bytes Written| Total Files Added| Total Files Updated| Total Partitions Written| Total Records Written| Total Update Records Written| Total Errors|
     |=======================================================================================================================================================================|
@@ -450,41 +450,41 @@ hoodie:trips-&gt;
     ....
     ....
 hoodie:trips-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <p>在每次写入开始时,Hudi还将.inflight提交写入.hoodie文件夹。您可以使用那里的时间戳来估计正在进行的提交已经花费的时间</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
+<pre><code class="language-Java">$ hdfs dfs -ls /app/uber/trips/.hoodie/*.inflight
 -rw-r--r--   3 vinoth supergroup     321984 2016-10-05 23:18 /app/uber/trips/.hoodie/20161005225920.inflight
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="深入到特定的提交">深入到特定的提交</h4>
 
 <p>了解写入如何分散到特定分区,</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;commit showpartitions --commit 20161005165855 --sortBy "Total Bytes Written" --desc true --limit 10
+<pre><code class="language-Java">hoodie:trips-&gt;commit showpartitions --commit 20161005165855 --sortBy "Total Bytes Written" --desc true --limit 10
     __________________________________________________________________________________________________________________________________________
     | Partition Path| Total Files Added| Total Files Updated| Total Records Inserted| Total Records Updated| Total Bytes Written| Total Errors|
     |=========================================================================================================================================|
      ....
      ....
-</code></pre></div></div>
+</code></pre>
 
 <p>如果您需要文件级粒度,我们可以执行以下操作</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;commit showfiles --commit 20161005165855 --sortBy "Partition Path"
+<pre><code class="language-Java">hoodie:trips-&gt;commit showfiles --commit 20161005165855 --sortBy "Partition Path"
     ________________________________________________________________________________________________________________________________________________________
     | Partition Path| File ID                             | Previous Commit| Total Records Updated| Total Records Written| Total Bytes Written| Total Errors|
     |=======================================================================================================================================================|
     ....
     ....
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="文件系统视图">文件系统视图</h4>
 
 <p>Hudi将每个分区视为文件组的集合,每个文件组包含按提交顺序排列的文件切片列表(请参阅概念)。以下命令允许用户查看数据集的文件切片。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> hoodie:stock_ticks_mor-&gt;show fsview all
+<pre><code class="language-Java"> hoodie:stock_ticks_mor-&gt;show fsview all
  ....
   _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
  | Partition | FileId | Base-Instant | Data-File | Data-File Size| Num Delta Files| Total Delta File Size| Delta Files |
@@ -501,30 +501,30 @@ hoodie:trips-&gt;
  | 2018/08/31| 111415c3-f26d-4639-86c8-f9956f245ac3| 20181002180759| hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/2018/08/31/111415c3-f26d-4639-86c8-f9956f245ac3_0_20181002180759.parquet| 432.5 KB | 1 | 20.8 KB | 20.8 KB | 0.0 B | 0.0 B | 0.0 B | [HoodieLogFile {hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/2018/08/31/.111415c3-f26d-4639-86c8-f9956f245ac3_20181002180759.log.1}]| [] |
 
  hoodie:stock_ticks_mor-&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="统计信息">统计信息</h4>
 
 <p>由于Hudi直接管理DFS数据集的文件大小,这些信息会帮助你全面了解Hudi的运行状况</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc true --limit 10
+<pre><code class="language-Java">hoodie:trips-&gt;stats filesizes --partitionPath 2016/09/01 --sortBy "95th" --desc true --limit 10
     ________________________________________________________________________________________________
     | CommitTime    | Min     | 10th    | 50th    | avg     | 95th    | Max     | NumFiles| StdDev  |
     |===============================================================================================|
     | &lt;COMMIT_ID&gt;   | 93.9 MB | 93.9 MB | 93.9 MB | 93.9 MB | 93.9 MB | 93.9 MB | 2       | 2.3 KB  |
     ....
     ....
-</code></pre></div></div>
+</code></pre>
 
 <p>如果Hudi写入花费的时间更长,那么可以通过观察写放大指标来发现任何异常</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;stats wa
+<pre><code class="language-Java">hoodie:trips-&gt;stats wa
     __________________________________________________________________________
     | CommitTime    | Total Upserted| Total Written| Write Amplifiation Factor|
     |=========================================================================|
     ....
     ....
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="归档的提交">归档的提交</h4>
 
@@ -535,28 +535,28 @@ hoodie:trips-&gt;
 
 <p>要了解压缩和写程序之间的时滞,请使用以下命令列出所有待处理的压缩。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compactions show all
+<pre><code class="language-Java">hoodie:trips-&gt;compactions show all
      ___________________________________________________________________
     | Compaction Instant Time| State    | Total FileIds to be Compacted|
     |==================================================================|
     | &lt;INSTANT_1&gt;            | REQUESTED| 35                           |
     | &lt;INSTANT_2&gt;            | INFLIGHT | 27                           |
-</code></pre></div></div>
+</code></pre>
 
 <p>要检查特定的压缩计划,请使用</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compaction show --instant &lt;INSTANT_1&gt;
+<pre><code class="language-Java">hoodie:trips-&gt;compaction show --instant &lt;INSTANT_1&gt;
     _________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
     | Partition Path| File Id | Base Instant  | Data File Path                                    | Total Delta Files| getMetrics                                                                                                                    |
     |================================================================================================================================================================================================================================================
     | 2018/07/17    | &lt;UUID&gt;  | &lt;INSTANT_1&gt;   | viewfs://ns-default/.../../UUID_&lt;INSTANT&gt;.parquet | 1                | {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=1230.0, TOTAL_LOG_FILES_SIZE=2.51255751E8, TOTAL_IO_WRITE_MB=991.0, TOTAL_IO_MB=2221.0}|
 
-</code></pre></div></div>
+</code></pre>
 
 <p>要手动调度或运行压缩,请使用以下命令。该命令使用spark启动器执行压缩操作。
 注意:确保没有其他应用程序正在同时调度此数据集的压缩</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;help compaction schedule
+<pre><code class="language-Java">hoodie:trips-&gt;help compaction schedule
 Keyword:                   compaction schedule
 Description:               Schedule Compaction
  Keyword:                  sparkMemory
@@ -566,9 +566,9 @@ Description:               Schedule Compaction
    Default if unspecified: '1G'
 
 * compaction schedule - Schedule Compaction
-</code></pre></div></div>
+</code></pre>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;help compaction run
+<pre><code class="language-Java">hoodie:trips-&gt;help compaction run
 Keyword:                   compaction run
 Description:               Run Compaction for given instant time
  Keyword:                  tableName
@@ -608,13 +608,13 @@ Description:               Run Compaction for given instant time
    Default if unspecified: '__NULL__'
 
 * compaction run - Run Compaction for given instant time
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="验证压缩">验证压缩</h5>
 
 <p>验证压缩计划:检查压缩所需的所有文件是否都存在且有效</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222611
+<pre><code class="language-Java">hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222611
 ...
 
    COMPACTION PLAN VALID
@@ -634,7 +634,7 @@ hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222601
     | File Id                             | Base Instant Time| Base Data File                                                                                                                   | Num Delta Files| Valid| Error                                                                           |
     |=====================================================================================================================================================================================================================================================================================================|
     | 05320e98-9a57-4c38-b809-a6beaaeb36bd| 20181005222445   | hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/2018/08/31/05320e98-9a57-4c38-b809-a6beaaeb36bd_0_20181005222445.parquet| 1              | false| All log files specified in compaction operation is not present. Missing ....    |
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="注意">注意</h5>
 
@@ -645,17 +645,17 @@ hoodie:stock_ticks_mor-&gt;compaction validate --instant 20181005222601
 
 <h5 id="取消调度压缩">取消调度压缩</h5>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compaction unscheduleFileId --fileId &lt;FileUUID&gt;
+<pre><code class="language-Java">hoodie:trips-&gt;compaction unscheduleFileId --fileId &lt;FileUUID&gt;
 ....
 No File renames needed to unschedule file from pending compaction. Operation successful.
-</code></pre></div></div>
+</code></pre>
 
 <p>在其他情况下,需要撤销整个压缩计划。以下CLI支持此功能</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:trips-&gt;compaction unschedule --compactionInstant &lt;compactionInstant&gt;
+<pre><code class="language-Java">hoodie:trips-&gt;compaction unschedule --compactionInstant &lt;compactionInstant&gt;
 .....
 No File renames needed to unschedule pending compaction. Operation successful.
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="修复压缩">修复压缩</h5>
 
@@ -664,11 +664,11 @@ No File renames needed to unschedule pending compaction. Operation successful.
 当您运行<code class="highlighter-rouge">压缩验证</code>时,您会注意到无效的压缩操作(如果有的话)。
 在这种情况下,修复命令将立即执行,它将重新排列文件切片,以使文件不丢失,并且文件切片与压缩计划一致</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie:stock_ticks_mor-&gt;compaction repair --instant 20181005222611
+<pre><code class="language-Java">hoodie:stock_ticks_mor-&gt;compaction repair --instant 20181005222611
 ......
 Compaction successfully repaired
 .....
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="metrics">指标</h2>
 
diff --git a/content/cn/community.html b/content/cn/community.html
index c9dc27c..1c4e21c 100644
--- a/content/cn/community.html
+++ b/content/cn/community.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/comparison.html b/content/cn/comparison.html
index 4d9595d..5dddc97 100644
--- a/content/cn/comparison.html
+++ b/content/cn/comparison.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/concepts.html b/content/cn/concepts.html
index 1400ee7..ced9326 100644
--- a/content/cn/concepts.html
+++ b/content/cn/concepts.html
@@ -3,7 +3,7 @@
     <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1">
-<meta name="description" content="Here we introduce some basic concepts & give a broad technical overview of Hudi">
+<meta name="description" content="这里我们将介绍Hudi的一些基本概念并提供关于Hudi的技术概述">
 <meta name="keywords" content="hudi, design, storage, views, timeline">
 <title>Concepts | Hudi</title>
 <link rel="stylesheet" href="/css/syntax.css">
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -332,7 +332,7 @@
 <div class="post-content">
 
    
-    <div class="summary">Here we introduce some basic concepts & give a broad technical overview of Hudi</div>
+    <div class="summary">这里我们将介绍Hudi的一些基本概念并提供关于Hudi的技术概述</div>
    
 
     
@@ -340,234 +340,222 @@
 
     
 
-  <p>Apache Hudi (pronounced “Hudi”) provides the following streaming primitives over datasets on DFS</p>
+  <p>Apache Hudi(发音为“Hudi”)在DFS的数据集上提供以下流原语</p>
 
 <ul>
-  <li>Upsert                     (how do I change the dataset?)</li>
-  <li>Incremental pull           (how do I fetch data that changed?)</li>
+  <li>插入更新           (如何改变数据集?)</li>
+  <li>增量拉取           (如何获取变更的数据?)</li>
 </ul>
 
-<p>In this section, we will discuss key concepts &amp; terminologies that are important to understand, to be able to effectively use these primitives.</p>
+<p>在本节中,我们将讨论重要的概念和术语,这些概念和术语有助于理解并有效使用这些原语。</p>
 
-<h2 id="timeline">Timeline</h2>
-<p>At its core, Hudi maintains a <code class="highlighter-rouge">timeline</code> of all actions performed on the dataset at different <code class="highlighter-rouge">instants</code> of time that helps provide instantaneous views of the dataset,
-while also efficiently supporting retrieval of data in the order of arrival. A Hudi instant consists of the following components</p>
+<h2 id="时间轴">时间轴</h2>
+<p>在它的核心,Hudi维护一条包含在不同的<code class="highlighter-rouge">即时</code>时间所有对数据集操作的<code class="highlighter-rouge">时间轴</code>,从而提供,从不同时间点出发得到不同的视图下的数据集。Hudi即时包含以下组件</p>
 
 <ul>
-  <li><code class="highlighter-rouge">Action type</code> : Type of action performed on the dataset</li>
-  <li><code class="highlighter-rouge">Instant time</code> : Instant time is typically a timestamp (e.g: 20190117010349), which monotonically increases in the order of action’s begin time.</li>
-  <li><code class="highlighter-rouge">state</code> : current state of the instant</li>
+  <li><code class="highlighter-rouge">操作类型</code> : 对数据集执行的操作类型</li>
+  <li><code class="highlighter-rouge">即时时间</code> : 即时时间通常是一个时间戳(例如:20190117010349),该时间戳按操作开始时间的顺序单调增加。</li>
+  <li><code class="highlighter-rouge">状态</code> : 即时的状态</li>
 </ul>
 
-<p>Hudi guarantees that the actions performed on the timeline are atomic &amp; timeline consistent based on the instant time.</p>
+<p>Hudi保证在时间轴上执行的操作的原子性和基于即时时间的时间轴一致性。</p>
 
-<p>Key actions performed include</p>
+<p>执行的关键操作包括</p>
 
 <ul>
-  <li><code class="highlighter-rouge">COMMITS</code> - A commit denotes an <strong>atomic write</strong> of a batch of records into a dataset.</li>
-  <li><code class="highlighter-rouge">CLEANS</code> - Background activity that gets rid of older versions of files in the dataset, that are no longer needed.</li>
-  <li><code class="highlighter-rouge">DELTA_COMMIT</code> - A delta commit refers to an <strong>atomic write</strong> of a batch of records into a  MergeOnRead storage type of dataset, where some/all of the data could be just written to delta logs.</li>
-  <li><code class="highlighter-rouge">COMPACTION</code> - Background activity to reconcile differential data structures within Hudi e.g: moving updates from row based log files to columnar formats. Internally, compaction manifests as a special commit on the timeline</li>
-  <li><code class="highlighter-rouge">ROLLBACK</code> - Indicates that a commit/delta commit was unsuccessful &amp; rolled back, removing any partial files produced during such a write</li>
-  <li><code class="highlighter-rouge">SAVEPOINT</code> - Marks certain file groups as “saved”, such that cleaner will not delete them. It helps restore the dataset to a point on the timeline, in case of disaster/data recovery scenarios.</li>
+  <li><code class="highlighter-rouge">COMMITS</code> - 一次提交表示将一组记录<strong>原子写入</strong>到数据集中。</li>
+  <li><code class="highlighter-rouge">CLEANS</code> - 删除数据集中不再需要的旧文件版本的后台活动。</li>
+  <li><code class="highlighter-rouge">DELTA_COMMIT</code> - 增量提交是指将一批记录<strong>原子写入</strong>到MergeOnRead存储类型的数据集中,其中一些/所有数据都可以只写到增量日志中。</li>
+  <li><code class="highlighter-rouge">COMPACTION</code> - 协调Hudi中差异数据结构的后台活动,例如:将更新从基于行的日志文件变成列格式。在内部,压缩表现为时间轴上的特殊提交。</li>
+  <li><code class="highlighter-rouge">ROLLBACK</code> - 表示提交/增量提交不成功且已回滚,删除在写入过程中产生的所有部分文件。</li>
+  <li><code class="highlighter-rouge">SAVEPOINT</code> - 将某些文件组标记为”已保存”,以便清理程序不会将其删除。在发生灾难/数据恢复的情况下,它有助于将数据集还原到时间轴上的某个点。</li>
 </ul>
 
-<p>Any given instant can be 
-in one of the following states</p>
+<p>任何给定的即时都可以处于以下状态之一</p>
 
 <ul>
-  <li><code class="highlighter-rouge">REQUESTED</code> - Denotes an action has been scheduled, but has not initiated</li>
-  <li><code class="highlighter-rouge">INFLIGHT</code> - Denotes that the action is currently being performed</li>
-  <li><code class="highlighter-rouge">COMPLETED</code> - Denotes completion of an action on the timeline</li>
+  <li><code class="highlighter-rouge">REQUESTED</code> - 表示已调度但尚未启动的操作。</li>
+  <li><code class="highlighter-rouge">INFLIGHT</code> - 表示当前正在执行该操作。</li>
+  <li><code class="highlighter-rouge">COMPLETED</code> - 表示在时间轴上完成了该操作。</li>
 </ul>
 
 <figure>
     <img class="docimage" src="/images/hudi_timeline.png" alt="hudi_timeline.png" />
 </figure>
 
-<p>Example above shows upserts happenings between 10:00 and 10:20 on a Hudi dataset, roughly every 5 mins, leaving commit metadata on the Hudi timeline, along
-with other background cleaning/compactions. One key observation to make is that the commit time indicates the <code class="highlighter-rouge">arrival time</code> of the data (10:20AM), while the actual data
-organization reflects the actual time or <code class="highlighter-rouge">event time</code>, the data was intended for (hourly buckets from 07:00). These are two key concepts when reasoning about tradeoffs between latency and completeness of data.</p>
+<p>上面的示例显示了在10:00到10:20之间、大约每5分钟一次发生在Hudi数据集上的更新插入(upsert),它们连同其他后台的清理/压缩操作一起,在Hudi时间轴上留下了提交元数据。
+观察的关键点是:提交时间表示数据的<code class="highlighter-rouge">到达时间</code>(上午10:20),而实际的数据组织则反映了数据本身所属的实际时间或<code class="highlighter-rouge">事件时间</code>(从07:00开始的每小时时段)。在权衡数据延迟和完整性时,这是两个关键概念。</p>
 
-<p>When there is late arriving data (data intended for 9:00 arriving &gt;1 hr late at 10:20), we can see the upsert producing new data into even older time buckets/folders.
-With the help of the timeline, an incremental query attempting to get all new data that was committed successfully since 10:00 hours, is able to very efficiently consume
-only the changed files without say scanning all the time buckets &gt; 07:00.</p>
+<p>如果有延迟到达的数据(事件时间为9:00的数据在10:20才到达,延迟 &gt;1 小时),我们可以看到upsert将新数据写入到了更旧的时间段/文件夹中。
+在时间轴的帮助下,尝试获取10:00以后成功提交的所有新数据的增量查询,能够非常高效地只消费发生变更的文件,而无需扫描例如07:00以后的所有时间段。</p>
 
-<h2 id="file-management">File management</h2>
-<p>Hudi organizes a datasets into a directory structure under a <code class="highlighter-rouge">basepath</code> on DFS. Dataset is broken up into partitions, which are folders containing data files for that partition,
-very similar to Hive tables. Each partition is uniquely identified by its <code class="highlighter-rouge">partitionpath</code>, which is relative to the basepath.</p>
+<h2 id="文件组织">文件组织</h2>
+<p>Hudi将DFS上的数据集组织到<code class="highlighter-rouge">基本路径</code>下的目录结构中。数据集被分成多个分区,每个分区是包含该分区数据文件的文件夹,这与Hive表非常相似。
+每个分区由其相对于基本路径的<code class="highlighter-rouge">分区路径</code>唯一标识。</p>
 
-<p>Within each partition, files are organized into <code class="highlighter-rouge">file groups</code>, uniquely identified by a <code class="highlighter-rouge">file id</code>. Each file group contains several
-<code class="highlighter-rouge">file slices</code>, where each slice contains a base columnar file (<code class="highlighter-rouge">*.parquet</code>) produced at a certain commit/compaction instant time,
- along with set of log files (<code class="highlighter-rouge">*.log.*</code>) that contain inserts/updates to the base file since the base file was produced. 
-Hudi adopts a MVCC design, where compaction action merges logs and base files to produce new file slices and cleaning action gets rid of 
-unused/older file slices to reclaim space on DFS.</p>
+<p>在每个分区内,文件被组织为<code class="highlighter-rouge">文件组</code>,由<code class="highlighter-rouge">文件id</code>唯一标识。
+每个文件组包含多个<code class="highlighter-rouge">文件切片</code>,其中每个切片包含在某个提交/压缩即时时间生成的基本列文件(<code class="highlighter-rouge">*.parquet</code>),以及一组日志文件(<code class="highlighter-rouge">*.log.*</code>),这些日志文件记录了自基本文件生成以来对它的插入/更新。
+Hudi采用MVCC设计,其中压缩操作将日志和基本文件合并以产生新的文件切片,而清理操作则删除未使用的/较旧的文件切片,以回收DFS上的空间。</p>
 
-<p>Hudi provides efficient upserts, by mapping a given hoodie key (record key + partition path) consistently to a file group, via an indexing mechanism. 
-This mapping between record key and file group/file id, never changes once the first version of a record has been written to a file. In short, the 
-mapped file group contains all versions of a group of records.</p>
+<p>Hudi通过索引机制将给定的hoodie键(记录键+分区路径)映射到文件组,从而提供了高效的Upsert。
+一旦将记录的第一个版本写入文件,记录键和文件组/文件id之间的映射就永远不会改变。 简而言之,映射的文件组包含一组记录的所有版本。</p>
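+
+<p>作为示意,下面的代码草图(并非官方示例)展示了上文所述的hoodie键的构成方式:记录键与分区路径共同组成<code class="highlighter-rouge">HoodieKey</code>,索引再据此将记录一致地定位到某个文件组;其中的键值仅为假设。</p>
+
+<pre><code class="language-Java">import org.apache.hudi.common.model.HoodieKey;
+
+// 示意:记录键 + 分区路径 组成 HoodieKey(取值仅为假设)
+HoodieKey key = new HoodieKey("uuid-0001", "2019/01/17");
+String recordKey = key.getRecordKey();         // "uuid-0001"
+String partitionPath = key.getPartitionPath(); // "2019/01/17"
+// 索引会将该键一致地映射到某个文件组,该映射在记录首次写入后不再改变
+</code></pre>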
 
-<h2 id="storage-types--views">Storage Types &amp; Views</h2>
-<p>Hudi storage types define how data is indexed &amp; laid out on the DFS and how the above primitives and timeline activities are implemented on top of such organization (i.e how data is written). 
-In turn, <code class="highlighter-rouge">views</code> define how the underlying data is exposed to the queries (i.e how data is read).</p>
+<h2 id="存储类型和视图">存储类型和视图</h2>
+<p>Hudi存储类型定义了如何在DFS上对数据进行索引和布局以及如何在这种组织之上实现上述原语和时间轴活动(即如何写入数据)。
+反过来,<code class="highlighter-rouge">视图</code>定义了基础数据如何暴露给查询(即如何读取数据)。</p>
 
 <table>
   <thead>
     <tr>
-      <th>Storage Type</th>
-      <th>Supported Views</th>
+      <th>存储类型</th>
+      <th>支持的视图</th>
     </tr>
   </thead>
   <tbody>
     <tr>
-      <td>Copy On Write</td>
-      <td>Read Optimized + Incremental</td>
+      <td>写时复制</td>
+      <td>读优化 + 增量</td>
     </tr>
     <tr>
-      <td>Merge On Read</td>
-      <td>Read Optimized + Incremental + Near Real-time</td>
+      <td>读时合并</td>
+      <td>读优化 + 增量 + 近实时</td>
     </tr>
   </tbody>
 </table>
 
-<h3 id="storage-types">Storage Types</h3>
-<p>Hudi supports the following storage types.</p>
+<h3 id="存储类型">存储类型</h3>
+<p>Hudi支持以下存储类型。</p>
 
 <ul>
-  <li><a href="#copy-on-write-storage">Copy On Write</a> : Stores data using exclusively columnar file formats (e.g parquet). Updates simply version &amp; rewrite the files by performing a synchronous merge during write.</li>
-  <li><a href="#merge-on-read-storage">Merge On Read</a> : Stores data using a combination of columnar (e.g parquet) + row based (e.g avro) file formats. Updates are logged to delta files &amp; later compacted to produce new versions of columnar files synchronously or asynchronously.</li>
+  <li>
+    <p><a href="#copy-on-write-storage">写时复制</a> : 仅使用列文件格式(例如parquet)存储数据。通过在写入过程中执行同步合并以更新版本并重写文件。</p>
+  </li>
+  <li>
+    <p><a href="#merge-on-read-storage">读时合并</a> : 使用列式(例如parquet)+ 基于行(例如avro)的文件格式组合来存储数据。 更新记录到增量文件中,然后进行同步或异步压缩以生成列文件的新版本。</p>
+  </li>
 </ul>
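+
+<p>作为对上述两种存储类型的补充,下面给出一个示意性的写入草图(并非官方示例),展示如何通过数据源选项在两者之间选择;其中inputDF、tableName、basePath等变量为假设,且存储类型在两次写入之间不能更改。</p>
+
+<pre><code class="language-Java">// 示意:通过 hoodie.datasource.write.storage.type 选择存储类型
+inputDF.write()
+  .format("org.apache.hudi")
+  .option("hoodie.datasource.write.storage.type", "MERGE_ON_READ") // 默认为 COPY_ON_WRITE
+  .option("hoodie.datasource.write.recordkey.field", "_row_key")
+  .option("hoodie.datasource.write.partitionpath.field", "partition")
+  .option("hoodie.datasource.write.precombine.field", "timestamp")
+  .option(HoodieWriteConfig.TABLE_NAME, tableName)
+  .mode(SaveMode.Append)
+  .save(basePath);
+</code></pre>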
 
-<p>Following table summarizes the trade-offs between these two storage types</p>
+<p>下表总结了这两种存储类型之间的权衡</p>
 
 <table>
   <thead>
     <tr>
-      <th>Trade-off</th>
-      <th>CopyOnWrite</th>
-      <th>MergeOnRead</th>
+      <th>权衡</th>
+      <th>写时复制</th>
+      <th>读时合并</th>
     </tr>
   </thead>
   <tbody>
     <tr>
-      <td>Data Latency</td>
-      <td>Higher</td>
-      <td>Lower</td>
+      <td>数据延迟</td>
+      <td>更高</td>
+      <td>更低</td>
     </tr>
     <tr>
-      <td>Update cost (I/O)</td>
-      <td>Higher (rewrite entire parquet)</td>
-      <td>Lower (append to delta log)</td>
+      <td>更新代价(I/O)</td>
+      <td>更高(重写整个parquet文件)</td>
+      <td>更低(追加到增量日志)</td>
     </tr>
     <tr>
-      <td>Parquet File Size</td>
-      <td>Smaller (high update(I/0) cost)</td>
-      <td>Larger (low update cost)</td>
+      <td>Parquet文件大小</td>
+      <td>更小(高更新代价(I/o))</td>
+      <td>更大(低更新代价)</td>
     </tr>
     <tr>
-      <td>Write Amplification</td>
-      <td>Higher</td>
-      <td>Lower (depending on compaction strategy)</td>
+      <td>写放大</td>
+      <td>更高</td>
+      <td>更低(取决于压缩策略)</td>
     </tr>
   </tbody>
 </table>
 
-<h3 id="views">Views</h3>
-<p>Hudi supports the following views of stored data</p>
+<h3 id="视图">视图</h3>
+<p>Hudi支持以下存储数据的视图</p>
 
 <ul>
-  <li><strong>Read Optimized View</strong> : Queries on this view see the latest snapshot of the dataset as of a given commit or compaction action. 
- This view exposes only the base/columnar files in latest file slices to the queries and guarantees the same columnar query performance compared to a non-hudi columnar dataset.</li>
-  <li><strong>Incremental View</strong> : Queries on this view only see new data written to the dataset, since a given commit/compaction. This view effectively provides change streams to enable incremental data pipelines.</li>
-  <li><strong>Realtime View</strong> : Queries on this view see the latest snapshot of dataset as of a given delta commit action. This view provides near-real time datasets (few mins)
-  by merging the base and delta files of the latest file slice on-the-fly.</li>
+  <li><strong>读优化视图</strong> : 在此视图上的查询将查看给定提交或压缩操作中数据集的最新快照。
+ 该视图仅将最新文件切片中的基本/列文件暴露给查询,并保证与非Hudi列式数据集相比,具有相同的列式查询性能。</li>
+  <li><strong>增量视图</strong> : 对该视图的查询只能看到从某个提交/压缩后写入数据集的新数据。该视图有效地提供了更改流,来支持增量数据管道。</li>
+  <li><strong>实时视图</strong> : 在此视图上的查询将查看某个增量提交操作中数据集的最新快照。该视图通过动态合并最新的基本文件(例如parquet)和增量文件(例如avro)来提供近实时数据集(几分钟的延迟)。</li>
 </ul>
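+
+<p>下面是一个示意性的读取草图(并非官方示例),展示如何通过数据源的<code class="highlighter-rouge">hoodie.datasource.view.type</code>选项查询读优化视图和增量视图;其中spark、basePath、beginTime为假设的变量,详细选项见配置页。对于读时合并存储,实时视图通常通过同步到Hive中的RT表来查询。</p>
+
+<pre><code class="language-Java">// 读优化视图:只读取最新文件切片中的列式基本文件
+Dataset&lt;Row&gt; roDF = spark.read().format("org.apache.hudi")
+  .option("hoodie.datasource.view.type", "read_optimized")
+  .load(basePath + "/*/*"); // 路径通配符取决于分区层级,此处仅为示意
+
+// 增量视图:只返回给定即时时间之后成功提交的新数据
+Dataset&lt;Row&gt; incDF = spark.read().format("org.apache.hudi")
+  .option("hoodie.datasource.view.type", "incremental")
+  .option("hoodie.datasource.read.begin.instanttime", beginTime)
+  .load(basePath);
+</code></pre>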
 
-<p>Following table summarizes the trade-offs between the different views.</p>
+<p>下表总结了不同视图之间的权衡。</p>
 
 <table>
   <thead>
     <tr>
-      <th>Trade-off</th>
-      <th>ReadOptimized</th>
-      <th>RealTime</th>
+      <th>权衡</th>
+      <th>读优化</th>
+      <th>实时</th>
     </tr>
   </thead>
   <tbody>
     <tr>
-      <td>Data Latency</td>
-      <td>Higher</td>
-      <td>Lower</td>
+      <td>数据延迟</td>
+      <td>更高</td>
+      <td>更低</td>
     </tr>
     <tr>
-      <td>Query Latency</td>
-      <td>Lower (raw columnar performance)</td>
-      <td>Higher (merge columnar + row based delta)</td>
+      <td>查询延迟</td>
+      <td>更低(原始列式性能)</td>
+      <td>更高(合并列式 + 基于行的增量)</td>
     </tr>
   </tbody>
 </table>
 
-<h2 id="copy-on-write-storage">Copy On Write Storage</h2>
+<h2 id="写时复制存储">写时复制存储</h2>
 
-<p>File slices in Copy-On-Write storage only contain the base/columnar file and each commit produces new versions of base files. 
-In other words, we implicitly compact on every commit, such that only columnar data exists. As a result, the write amplification 
-(number of bytes written for 1 byte of incoming data) is much higher, where read amplification is zero. 
-This is a much desired property for analytical workloads, which is predominantly read-heavy.</p>
+<p>写时复制存储中的文件切片仅包含基本/列文件,并且每次提交都会生成新版本的基本文件。
+换句话说,每次提交都相当于做了一次隐式压缩,因此数据始终只以列式文件的形式存在。这样一来,写放大(每写入1字节数据所实际写出的字节数)会高得多,而读放大则为零。
+对于以读取为主的分析型工作负载来说,这是非常理想的特性。</p>
 
-<p>Following illustrates how this works conceptually, when  data written into copy-on-write storage  and two queries running on top of it.</p>
+<p>以下内容说明了将数据写入写时复制存储并在其上运行两个查询时,它是如何工作的。</p>
 
 <figure>
     <img class="docimage" src="/images/hudi_cow.png" alt="hudi_cow.png" />
 </figure>
 
-<p>As data gets written, updates to existing file groups produce a new slice for that file group stamped with the commit instant time, 
-while inserts allocate a new file group and write its first slice for that file group. These file slices and their commit instant times are color coded above.
-SQL queries running against such a dataset (eg: <code class="highlighter-rouge">select count(*)</code> counting the total records in that partition), first checks the timeline for the latest commit
-and filters all but latest file slices of each file group. As you can see, an old query does not see the current inflight commit’s files color coded in pink,
-but a new query starting after the commit picks up the new data. Thus queries are immune to any write failures/partial writes and only run on committed data.</p>
+<p>随着数据的写入,对现有文件组的更新将为该文件组生成一个带有提交即时时间标记的新切片,而插入分配一个新文件组并写入该文件组的第一个切片。
+这些文件切片及其提交即时时间在上面用颜色编码。
+针对这样的数据集运行SQL查询(例如:<code class="highlighter-rouge">select count(*)</code>统计该分区中的记录数目),首先检查时间轴上的最新提交并过滤每个文件组中除最新文件片以外的所有文件片。
+如您所见,旧查询不会看到以粉红色标记的当前进行中的提交的文件,但是在该提交后的新查询会获取新数据。因此,查询不受任何写入失败/部分写入的影响,仅运行在已提交数据上。</p>
 
-<p>The intention of copy on write storage, is to fundamentally improve how datasets are managed today through</p>
+<p>写时复制存储的目的是从根本上改善当前管理数据集的方式,通过以下方法来实现</p>
 
 <ul>
-  <li>First class support for atomically updating data at file-level, instead of rewriting whole tables/partitions</li>
-  <li>Ability to incremental consume changes, as opposed to wasteful scans or fumbling with heuristics</li>
-  <li>Tight control file sizes to keep query performance excellent (small files hurt query performance considerably).</li>
+  <li>原生支持在文件级原子地更新数据,而无需重写整个表/分区</li>
+  <li>能够增量地消费变更,而不是进行低效的全量扫描或依靠启发式方法去寻找变更</li>
+  <li>严格控制文件大小以保持出色的查询性能(小文件会严重损害查询性能)。</li>
 </ul>
 
-<h2 id="merge-on-read-storage">Merge On Read Storage</h2>
+<h2 id="读时合并存储">读时合并存储</h2>
 
-<p>Merge on read storage is a superset of copy on write, in the sense it still provides a read optimized view of the dataset via the Read Optmized table.
-Additionally, it stores incoming upserts for each file group, onto a row based delta log, that enables providing near real-time data to the queries
- by applying the delta log, onto the latest version of each file id on-the-fly during query time. Thus, this storage type attempts to balance read and write amplication intelligently, to provide near real-time queries.
-The most significant change here, would be to the compactor, which now carefully chooses which delta logs need to be compacted onto
-their columnar base file, to keep the query performance in check (larger delta logs would incur longer merge times with merge data on query side)</p>
+<p>读时合并存储是写时复制的超集,从某种意义上说,它仍然可以通过读优化表提供数据集的读优化视图(写时复制的功能)。
+此外,它会将每个文件组新到达的更新插入存储到基于行的增量日志中,并在查询时将增量日志动态地应用到每个文件id的最新版本基本文件上,从而提供近实时的数据。因此,这种存储类型试图智能地平衡读放大和写放大,以提供近实时的查询。
+这里最关键的变化是压缩器:它会仔细挑选哪些增量日志需要压缩到对应的列式基本文件中,以保证查询性能可控(增量日志越大,查询端合并数据所需的时间就越长,查询延迟也随之增加)。</p>
 
-<p>Following illustrates how the storage works, and shows queries on both near-real time table and read optimized table.</p>
+<p>以下内容说明了存储的工作方式,并显示了对近实时表和读优化表的查询。</p>
 
 <figure>
     <img class="docimage" src="/images/hudi_mor.png" alt="hudi_mor.png" style="max-width: 1000px" />
 </figure>
 
-<p>There are lot of interesting things happening in this example, which bring out the subtleties in the approach.</p>
+<p>此示例中发生了很多有趣的事情,这些带出了该方法的微妙之处。</p>
 
 <ul>
-  <li>We now have commits every 1 minute or so, something we could not do in the other storage type.</li>
-  <li>Within each file id group, now there is an delta log, which holds incoming updates to records in the base columnar files. In the example, the delta logs hold
- all the data from 10:05 to 10:10. The base columnar files are still versioned with the commit, as before.
- Thus, if one were to simply look at base files alone, then the storage layout looks exactly like a copy on write table.</li>
-  <li>A periodic compaction process reconciles these changes from the delta log and produces a new version of base file, just like what happened at 10:05 in the example.</li>
-  <li>There are two ways of querying the same underlying storage: ReadOptimized (RO) Table and Near-Realtime (RT) table, depending on whether we chose query performance or freshness of data.</li>
-  <li>The semantics around when data from a commit is available to a query changes in a subtle way for the RO table. Note, that such a query
- running at 10:10, wont see data after 10:05 above, while a query on the RT table always sees the freshest data.</li>
-  <li>When we trigger compaction &amp; what it decides to compact hold all the key to solving these hard problems. By implementing a compacting
- strategy, where we aggressively compact the latest partitions compared to older partitions, we could ensure the RO Table sees data
- published within X minutes in a consistent fashion.</li>
+  <li>现在,我们每1分钟左右就有一次提交,这是其他存储类型无法做到的。</li>
+  <li>现在,在每个文件id组中,都有一个增量日志,其中包含对基础列文件中记录的更新。
+ 在示例中,增量日志包含10:05至10:10的所有数据。与以前一样,基本列式文件仍使用提交进行版本控制。
+ 因此,如果只看基本文件本身,那么存储布局看起来与写时复制表完全一样。</li>
+  <li>定期压缩过程会从增量日志中合并这些更改,并生成基础文件的新版本,就像示例中10:05发生的情况一样。</li>
+  <li>有两种查询同一存储的方式:读优化(RO)表和近实时(RT)表,具体取决于我们选择查询性能还是数据新鲜度。</li>
+  <li>对于RO表来说,提交数据在何时可用于查询将有些许不同。 请注意,以10:10运行的(在RO表上的)此类查询将不会看到10:05之后的数据,而在RT表上的查询总会看到最新的数据。</li>
+  <li>何时触发压缩以及压缩什么是解决这些难题的关键。
+ 通过实施压缩策略,在该策略中,与较旧的分区相比,我们会积极地压缩最新的分区,从而确保RO表能够以一致的方式看到几分钟内发布的数据。</li>
 </ul>
 
-<p>The intention of merge on read storage is to enable near real-time processing directly on top of DFS, as opposed to copying
-data out to specialized systems, which may not be able to handle the data volume. There are also a few secondary side benefits to 
-this storage such as reduced write amplification by avoiding synchronous merge of data, i.e, the amount of data written per 1 bytes of data in a batch</p>
-
+<p>读时合并存储的目的是直接在DFS上实现近实时处理,而不是将数据复制到可能无法处理海量数据的专用系统中。
+这种存储还有一些其他好处,例如通过避免数据的同步合并来减少写放大(即批量中每1字节输入数据所需写出的数据量)。</p>
 
 
     <div class="tags">
diff --git a/content/cn/configurations.html b/content/cn/configurations.html
index 97bf307..f1d8850 100644
--- a/content/cn/configurations.html
+++ b/content/cn/configurations.html
@@ -3,9 +3,9 @@
     <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1">
-<meta name="description" content="Here we list all possible configurations and what they mean">
+<meta name="description" content="在这里,我们列出了所有可能的配置及其含义。">
 <meta name="keywords" content="garbage collection, hudi, jvm, configs, tuning">
-<title>Configurations | Hudi</title>
+<title>配置 | Hudi</title>
 <link rel="stylesheet" href="/css/syntax.css">
 
 
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -164,7 +164,7 @@
 
 
 
-  <a class="email" title="Submit feedback" href="#" onclick="javascript:window.location='mailto:dev@hudi.apache.org?subject=Hudi Documentation feedback&body=I have some feedback about the Configurations page: ' + window.location.href;"><i class="fa fa-envelope-o"></i> Feedback</a>
+  <a class="email" title="Submit feedback" href="#" onclick="javascript:window.location='mailto:dev@hudi.apache.org?subject=Hudi Documentation feedback&body=I have some feedback about the 配置 page: ' + window.location.href;"><i class="fa fa-envelope-o"></i> Feedback</a>
 
 <li>
 
@@ -187,7 +187,7 @@
                                 searchInput: document.getElementById('search-input'),
                                 resultsContainer: document.getElementById('results-container'),
                                 dataSource: '/search.json',
-                                searchResultTemplate: '<li><a href="{url}" title="Configurations">{title}</a></li>',
+                                searchResultTemplate: '<li><a href="{url}" title="配置">{title}</a></li>',
                     noResultsText: 'No results found.',
                             limit: 10,
                             fuzzy: true,
@@ -324,7 +324,7 @@
     <!-- Content Column -->
     <div class="col-md-9">
         <div class="post-header">
-   <h1 class="post-title-main">Configurations</h1>
+   <h1 class="post-title-main">配置</h1>
 </div>
 
 
@@ -332,7 +332,7 @@
 <div class="post-content">
 
    
-    <div class="summary">Here we list all possible configurations and what they mean</div>
+    <div class="summary">在这里,我们列出了所有可能的配置及其含义。</div>
    
 
     
@@ -363,427 +363,463 @@ $('#toc').on('click', 'a', function() {
 
     
 
-  <p>This page covers the different ways of configuring your job to write/read Hudi datasets. 
-At a high level, you can control behaviour at few levels.</p>
+  <p>该页面介绍了配置写入或读取Hudi数据集的作业的几种方式。
+总体而言,您可以在以下几个级别上控制其行为。</p>
 
 <ul>
-  <li><strong><a href="#spark-datasource">Spark Datasource Configs</a></strong> : These configs control the Hudi Spark Datasource, providing ability to define keys/partitioning, pick out the write operation, specify how to merge records or choosing view type to read.</li>
-  <li><strong><a href="#writeclient-configs">WriteClient Configs</a></strong> : Internally, the Hudi datasource uses a RDD based <code class="highlighter-rouge">HoodieWriteClient</code> api to actually perform writes to storage. These configs provide deep control over lower level aspects like 
- file sizing, compression, parallelism, compaction, write schema, cleaning etc. Although Hudi provides sane defaults, from time-time these configs may need to be tweaked to optimize for specific workloads.</li>
-  <li><strong><a href="#PAYLOAD_CLASS_OPT_KEY">RecordPayload Config</a></strong> : This is the lowest level of customization offered by Hudi. Record payloads define how to produce new values to upsert based on incoming new record and 
- stored old record. Hudi provides default implementations such as <code class="highlighter-rouge">OverwriteWithLatestAvroPayload</code> which simply update storage with the latest/last-written record. 
- This can be overridden to a custom class extending <code class="highlighter-rouge">HoodieRecordPayload</code> class, on both datasource and WriteClient levels.</li>
+  <li><strong><a href="#spark-datasource">Spark数据源配置</a></strong> : 这些配置控制Hudi Spark数据源,提供如下功能:
+ 定义键和分区、选择写操作、指定如何合并记录或选择要读取的视图类型。</li>
+  <li><strong><a href="#writeclient-configs">WriteClient 配置</a></strong> : 在内部,Hudi数据源使用基于RDD的<code class="highlighter-rouge">HoodieWriteClient</code> API
+ 真正执行对存储的写入。 这些配置可对文件大小、压缩(compression)、并行度、压缩(compaction)、写入模式、清理等底层方面进行完全控制。
+ 尽管Hudi提供了合理的默认设置,但在不同情形下,可能需要对这些配置进行调整以针对特定的工作负载进行优化。</li>
+  <li><strong><a href="#PAYLOAD_CLASS_OPT_KEY">RecordPayload 配置</a></strong> : 这是Hudi提供的最底层的定制。
+ RecordPayload定义了如何根据传入的新记录和存储的旧记录来产生新值以进行插入更新。
+ Hudi提供了诸如<code class="highlighter-rouge">OverwriteWithLatestAvroPayload</code>的默认实现,该实现仅使用最新或最后写入的记录来更新存储。
+ 在数据源和WriteClient级别,都可以将其重写为扩展<code class="highlighter-rouge">HoodieRecordPayload</code>类的自定义类。</li>
 </ul>
 
-<h3 id="talking-to-cloud-storage">Talking to Cloud Storage</h3>
+<h3 id="与云存储连接">与云存储连接</h3>
 
-<p>Immaterial of whether RDD/WriteClient APIs or Datasource is used, the following information helps configure access
-to cloud stores.</p>
+<p>无论使用RDD/WriteClient API还是数据源,以下信息都有助于配置对云存储的访问。</p>
 
 <ul>
   <li><a href="s3_hoodie.html">AWS S3</a> <br />
-Configurations required for S3 and Hudi co-operability.</li>
+S3和Hudi协同工作所需的配置。</li>
   <li><a href="gcs_hoodie.html">Google Cloud Storage</a> <br />
-Configurations required for GCS and Hudi co-operability.</li>
+GCS和Hudi协同工作所需的配置。</li>
 </ul>
 
-<h3 id="spark-datasource">Spark Datasource Configs</h3>
+<h3 id="spark-datasource">Spark数据源配置</h3>
 
-<p>Spark jobs using the datasource can be configured by passing the below options into the <code class="highlighter-rouge">option(k,v)</code> method as usual.
-The actual datasource level configs are listed below.</p>
+<p>可以通过将以下选项传递到<code class="highlighter-rouge">option(k,v)</code>方法中来配置使用数据源的Spark作业。
+实际的数据源级别配置在下面列出。</p>
 
-<h4 id="write-options">Write Options</h4>
+<h4 id="写选项">写选项</h4>
 
-<p>Additionally, you can pass down any of the WriteClient level configs directly using <code class="highlighter-rouge">options()</code> or <code class="highlighter-rouge">option(k,v)</code> methods.</p>
+<p>另外,您可以使用<code class="highlighter-rouge">options()</code>或<code class="highlighter-rouge">option(k,v)</code>方法直接传递任何WriteClient级别的配置。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inputDF.write()
+<pre><code class="language-Java">inputDF.write()
 .format("org.apache.hudi")
-.options(clientOpts) // any of the Hudi client opts can be passed in as well
+.options(clientOpts) // 任何Hudi客户端选项都可以传入
 .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
 .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), "partition")
 .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY(), "timestamp")
 .option(HoodieWriteConfig.TABLE_NAME, tableName)
 .mode(SaveMode.Append)
 .save(basePath);
-</code></pre></div></div>
+</code></pre>
 
-<p>Options useful for writing datasets via <code class="highlighter-rouge">write.format.option(...)</code></p>
+<p>用于通过<code class="highlighter-rouge">write.format.option(...)</code>写入数据集的选项</p>
 
 <h5 id="TABLE_NAME_OPT_KEY">TABLE_NAME_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.table.name</code> [Required]<br />
-  <span style="color:grey">Hive table name, to register the dataset into.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.table.name</code> [必须]<br />
+  <span style="color:grey">Hive表名,用于将数据集注册到其中。</span></p>
 
 <h5 id="OPERATION_OPT_KEY">OPERATION_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.operation</code>, Default: <code class="highlighter-rouge">upsert</code><br />
-  <span style="color:grey">whether to do upsert, insert or bulkinsert for the write operation. Use <code class="highlighter-rouge">bulkinsert</code> to load new data into a table, and there on use <code class="highlighter-rouge">upsert</code>/<code class="highlighter-rouge">insert</code>. 
-  bulk insert uses a disk based write path to scale to load large inputs without need to cache it.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.operation</code>, 默认值:<code class="highlighter-rouge">upsert</code><br />
+  <span style="color:grey">是否为写操作进行插入更新、插入或批量插入。使用<code class="highlighter-rouge">bulkinsert</code>将新数据加载到表中,之后使用<code class="highlighter-rouge">upsert</code>或<code class="highlighter-rouge">insert</code>。
+  批量插入使用基于磁盘的写入路径来扩展以加载大量输入,而无需对其进行缓存。</span></p>
 
 <h5 id="STORAGE_TYPE_OPT_KEY">STORAGE_TYPE_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.storage.type</code>, Default: <code class="highlighter-rouge">COPY_ON_WRITE</code> <br />
-  <span style="color:grey">The storage type for the underlying data, for this write. This can’t change between writes.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.storage.type</code>, 默认值:<code class="highlighter-rouge">COPY_ON_WRITE</code> <br />
+  <span style="color:grey">此写入的基础数据的存储类型。两次写入之间不能改变。</span></p>
 
 <h5 id="PRECOMBINE_FIELD_OPT_KEY">PRECOMBINE_FIELD_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.precombine.field</code>, Default: <code class="highlighter-rouge">ts</code> <br />
-  <span style="color:grey">Field used in preCombining before actual write. When two records have the same key value,
-we will pick the one with the largest value for the precombine field, determined by Object.compareTo(..)</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.precombine.field</code>, 默认值:<code class="highlighter-rouge">ts</code> <br />
+  <span style="color:grey">实际写入之前在preCombining中使用的字段。
+  当两个记录具有相同的键值时,我们将使用Object.compareTo(..)从precombine字段中选择一个值最大的记录。</span></p>
 
 <h5 id="PAYLOAD_CLASS_OPT_KEY">PAYLOAD_CLASS_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.payload.class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.OverwriteWithLatestAvroPayload</code> <br />
-  <span style="color:grey">Payload class used. Override this, if you like to roll your own merge logic, when upserting/inserting. 
-  This will render any value set for <code class="highlighter-rouge">PRECOMBINE_FIELD_OPT_VAL</code> in-effective</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.payload.class</code>, 默认值:<code class="highlighter-rouge">org.apache.hudi.OverwriteWithLatestAvroPayload</code> <br />
+  <span style="color:grey">使用的有效载荷类。如果您想在插入更新或插入时使用自己的合并逻辑,请重写此方法。
+  这将使得<code class="highlighter-rouge">PRECOMBINE_FIELD_OPT_VAL</code>设置的任何值无效</span></p>
 
 <h5 id="RECORDKEY_FIELD_OPT_KEY">RECORDKEY_FIELD_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.recordkey.field</code>, Default: <code class="highlighter-rouge">uuid</code> <br />
-  <span style="color:grey">Record key field. Value to be used as the <code class="highlighter-rouge">recordKey</code> component of <code class="highlighter-rouge">HoodieKey</code>. Actual value
-will be obtained by invoking .toString() on the field value. Nested fields can be specified using
-the dot notation eg: <code class="highlighter-rouge">a.b.c</code></span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.recordkey.field</code>, 默认值:<code class="highlighter-rouge">uuid</code> <br />
+  <span style="color:grey">记录键字段。用作<code class="highlighter-rouge">HoodieKey</code>中<code class="highlighter-rouge">recordKey</code>部分的值。
+  实际值将通过在字段值上调用.toString()来获得。可以使用点符号指定嵌套字段,例如:<code class="highlighter-rouge">a.b.c</code></span></p>
 
 <h5 id="PARTITIONPATH_FIELD_OPT_KEY">PARTITIONPATH_FIELD_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.partitionpath.field</code>, Default: <code class="highlighter-rouge">partitionpath</code> <br />
-  <span style="color:grey">Partition path field. Value to be used at the <code class="highlighter-rouge">partitionPath</code> component of <code class="highlighter-rouge">HoodieKey</code>.
-Actual value ontained by invoking .toString()</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.partitionpath.field</code>, 默认值:<code class="highlighter-rouge">partitionpath</code> <br />
+  <span style="color:grey">分区路径字段。用作<code class="highlighter-rouge">HoodieKey</code>中<code class="highlighter-rouge">partitionPath</code>部分的值。
+  通过调用.toString()获得实际的值</span></p>
 
 <h5 id="KEYGENERATOR_CLASS_OPT_KEY">KEYGENERATOR_CLASS_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.keygenerator.class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.SimpleKeyGenerator</code> <br />
-  <span style="color:grey">Key generator class, that implements will extract the key out of incoming <code class="highlighter-rouge">Row</code> object</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.keygenerator.class</code>, 默认值:<code class="highlighter-rouge">org.apache.hudi.SimpleKeyGenerator</code> <br />
+  <span style="color:grey">键生成器类,实现从输入的<code class="highlighter-rouge">Row</code>对象中提取键</span></p>
 
 <h5 id="COMMIT_METADATA_KEYPREFIX_OPT_KEY">COMMIT_METADATA_KEYPREFIX_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.commitmeta.key.prefix</code>, Default: <code class="highlighter-rouge">_</code> <br />
-  <span style="color:grey">Option keys beginning with this prefix, are automatically added to the commit/deltacommit metadata.
-This is useful to store checkpointing information, in a consistent way with the hudi timeline</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.commitmeta.key.prefix</code>, 默认值:<code class="highlighter-rouge">_</code> <br />
+  <span style="color:grey">以该前缀开头的选项键会自动添加到提交/增量提交的元数据中。
+  这对于以与hudi时间轴保持一致的方式存储检查点信息很有用</span></p>
 
 <h5 id="INSERT_DROP_DUPS_OPT_KEY">INSERT_DROP_DUPS_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>, Default: <code class="highlighter-rouge">false</code> <br />
-  <span style="color:grey">If set to true, filters out all duplicate records from incoming dataframe, during insert operations. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>, 默认值:<code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">如果设置为true,则在插入操作期间从传入数据帧中过滤掉所有重复记录。</span></p>
 
 <h5 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: <code class="highlighter-rouge">false</code> <br />
-  <span style="color:grey">When set to true, register/sync the dataset to Apache Hive metastore</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, 默认值:<code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">设置为true时,将数据集注册并同步到Apache Hive Metastore</span></p>
 
 <h5 id="HIVE_DATABASE_OPT_KEY">HIVE_DATABASE_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.database</code>, Default: <code class="highlighter-rouge">default</code> <br />
-  <span style="color:grey">database to sync to</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.database</code>, 默认值:<code class="highlighter-rouge">default</code> <br />
+  <span style="color:grey">要同步到的数据库</span></p>
 
 <h5 id="HIVE_TABLE_OPT_KEY">HIVE_TABLE_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.table</code>, [Required] <br />
-  <span style="color:grey">table to sync to</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.table</code>, [Required] <br />
+  <span style="color:grey">要同步到的表</span></p>
 
 <h5 id="HIVE_USER_OPT_KEY">HIVE_USER_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.username</code>, Default: <code class="highlighter-rouge">hive</code> <br />
-  <span style="color:grey">hive user name to use</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.username</code>, 默认值:<code class="highlighter-rouge">hive</code> <br />
+  <span style="color:grey">要使用的Hive用户名</span></p>
 
 <h5 id="HIVE_PASS_OPT_KEY">HIVE_PASS_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.password</code>, Default: <code class="highlighter-rouge">hive</code> <br />
-  <span style="color:grey">hive password to use</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.password</code>, 默认值:<code class="highlighter-rouge">hive</code> <br />
+  <span style="color:grey">要使用的Hive密码</span></p>
 
 <h5 id="HIVE_URL_OPT_KEY">HIVE_URL_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.jdbcurl</code>, Default: <code class="highlighter-rouge">jdbc:hive2://localhost:10000</code> <br />
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.jdbcurl</code>, 默认值:<code class="highlighter-rouge">jdbc:hive2://localhost:10000</code> <br />
   <span style="color:grey">Hive metastore url</span></p>
 
 <h5 id="HIVE_PARTITION_FIELDS_OPT_KEY">HIVE_PARTITION_FIELDS_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_fields</code>, Default: ` ` <br />
-  <span style="color:grey">field in the dataset to use for determining hive partition columns.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_fields</code>, 默认值:<code class="highlighter-rouge"> </code> <br />
+  <span style="color:grey">数据集中用于确定Hive分区的字段。</span></p>
 
 <h5 id="HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY">HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_extractor_class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor</code> <br />
-  <span style="color:grey">Class used to extract partition field values into hive partition columns.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_extractor_class</code>, 默认值:<code class="highlighter-rouge">org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor</code> <br />
+  <span style="color:grey">用于将分区字段值提取到Hive分区列中的类。</span></p>
 
 <h5 id="HIVE_ASSUME_DATE_PARTITION_OPT_KEY">HIVE_ASSUME_DATE_PARTITION_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.assume_date_partitioning</code>, Default: <code class="highlighter-rouge">false</code> <br />
-  <span style="color:grey">Assume partitioning is yyyy/mm/dd</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.hive_sync.assume_date_partitioning</code>, 默认值:<code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">假设分区格式是yyyy/mm/dd</span></p>
 
-<h4 id="read-options">Read Options</h4>
+<h4 id="读选项">读选项</h4>
 
-<p>Options useful for reading datasets via <code class="highlighter-rouge">read.format.option(...)</code></p>
+<p>用于通过<code class="highlighter-rouge">read.format.option(...)</code>读取数据集的选项</p>
 
 <h5 id="VIEW_TYPE_OPT_KEY">VIEW_TYPE_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.view.type</code>, Default: <code class="highlighter-rouge">read_optimized</code> <br />
-<span style="color:grey">Whether data needs to be read, in incremental mode (new data since an instantTime)
-(or) Read Optimized mode (obtain latest view, based on columnar data)
-(or) Real time mode (obtain latest view, based on row &amp; columnar data)</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.view.type</code>, 默认值:<code class="highlighter-rouge">read_optimized</code> <br />
+<span style="color:grey">是否需要以某种模式读取数据,增量模式(自InstantTime以来的新数据)
+(或)读优化模式(基于列数据获取最新视图)
+(或)实时模式(基于行和列数据获取最新视图)</span></p>
 
 <h5 id="BEGIN_INSTANTTIME_OPT_KEY">BEGIN_INSTANTTIME_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.read.begin.instanttime</code>, [Required in incremental mode] <br />
-<span style="color:grey">Instant time to start incrementally pulling data from. The instanttime here need not
-necessarily correspond to an instant on the timeline. New data written with an
- <code class="highlighter-rouge">instant_time &gt; BEGIN_INSTANTTIME</code> are fetched out. For e.g: ‘20170901080000’ will get
- all new data written after Sep 1, 2017 08:00AM.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.read.begin.instanttime</code>, [在增量模式下必须] <br />
+<span style="color:grey">开始增量提取数据的即时时间。这里的instanttime不必一定与时间轴上的即时相对应。
+取出以<code class="highlighter-rouge">instant_time &gt; BEGIN_INSTANTTIME</code>写入的新数据。
+例如:’20170901080000’将获取2017年9月1日08:00 AM之后写入的所有新数据。</span></p>
 
 <h5 id="END_INSTANTTIME_OPT_KEY">END_INSTANTTIME_OPT_KEY</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.datasource.read.end.instanttime</code>, Default: latest instant (i.e fetches all new data since begin instant time) <br />
-<span style="color:grey"> Instant time to limit incrementally fetched data to. New data written with an
-<code class="highlighter-rouge">instant_time &lt;= END_INSTANTTIME</code> are fetched out.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.datasource.read.end.instanttime</code>, 默认值:最新即时(即从开始即时获取所有新数据) <br />
+<span style="color:grey">限制增量提取的数据的即时时间。取出以<code class="highlighter-rouge">instant_time &lt;= END_INSTANTTIME</code>写入的新数据。</span></p>
 
-<h3 id="writeclient-configs">WriteClient Configs</h3>
+<h3 id="writeclient-configs">WriteClient 配置</h3>
 
-<p>Jobs programming directly against the RDD level apis can build a <code class="highlighter-rouge">HoodieWriteConfig</code> object and pass it in to the <code class="highlighter-rouge">HoodieWriteClient</code> constructor. 
-HoodieWriteConfig can be built using a builder pattern as below.</p>
+<p>直接针对RDD级别API编程的作业,可以构建一个<code class="highlighter-rouge">HoodieWriteConfig</code>对象,并将其传递给<code class="highlighter-rouge">HoodieWriteClient</code>构造函数。
+HoodieWriteConfig可以使用如下的构建器模式构建。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
+<pre><code class="language-Java">HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
         .withPath(basePath)
         .forTable(tableName)
         .withSchema(schemaStr)
-        .withProps(props) // pass raw k,v pairs from a property file.
+        .withProps(props) // 从属性文件传递原始k、v对。
         .withCompactionConfig(HoodieCompactionConfig.newBuilder().withXXX(...).build())
         .withIndexConfig(HoodieIndexConfig.newBuilder().withXXX(...).build())
         ...
         .build();
-</code></pre></div></div>
+</code></pre>
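+
+<p>下面的草图(基于对该API的一般理解,细节以对应版本的源码为准)展示了如何将上面构建的cfg传给HoodieWriteClient并执行一次插入更新;其中jsc为假设的JavaSparkContext,records为假设的JavaRDD&lt;HoodieRecord&gt;。</p>
+
+<pre><code class="language-Java">HoodieWriteClient&lt;OverwriteWithLatestAvroPayload&gt; client = new HoodieWriteClient&lt;&gt;(jsc, cfg); // jsc: 假设已创建的JavaSparkContext
+String commitTime = client.startCommit();                            // 在时间轴上开启一次新的提交
+JavaRDD&lt;WriteStatus&gt; statuses = client.upsert(records, commitTime); // records: 假设的JavaRDD&lt;HoodieRecord&gt;
+client.commit(commitTime, statuses);                                 // 关闭autoCommit时需要显式提交
+</code></pre>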
 
-<p>Following subsections go over different aspects of write configs, explaining most important configs with their property names, default values.</p>
+<p>以下各节介绍了写配置的不同方面,并解释了最重要的配置及其属性名称和默认值。</p>
 
 <h5 id="withPath">withPath(hoodie_base_path)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.base.path</code> [Required] <br />
-<span style="color:grey">Base DFS path under which all the data partitions are created. Always prefix it explicitly with the storage scheme (e.g hdfs://, s3:// etc). Hudi stores all the main meta-data about commits, savepoints, cleaning audit logs etc in .hoodie directory under the base directory. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.base.path</code> [必须] <br />
+<span style="color:grey">创建所有数据分区所依据的基本DFS路径。
+始终在前缀中明确指明存储方式(例如hdfs://,s3://等)。
+Hudi将有关提交、保存点、清理审核日志等的所有主要元数据存储在基本目录下的.hoodie目录中。</span></p>
 
 <h5 id="withSchema">withSchema(schema_str)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.avro.schema</code> [Required]<br />
-<span style="color:grey">This is the current reader avro schema for the dataset. This is a string of the entire schema. HoodieWriteClient uses this schema to pass on to implementations of HoodieRecordPayload to convert from the source format to avro record. This is also used when re-writing records during an update. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.avro.schema</code> [必须]<br />
+<span style="color:grey">这是数据集的当前读取器的avro模式(schema)。
+这是整个模式的字符串。HoodieWriteClient使用此模式传递到HoodieRecordPayload的实现,以从源格式转换为avro记录。
+在更新过程中重写记录时也使用此模式。</span></p>
 
 <h5 id="forTable">forTable(table_name)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.table.name</code> [Required] <br />
- <span style="color:grey">Table name for the dataset, will be used for registering with Hive. Needs to be same across runs.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.table.name</code> [必须] <br />
+ <span style="color:grey">数据集的表名,将用于在Hive中注册。每次运行需要相同。</span></p>
 
 <h5 id="withBulkInsertParallelism">withBulkInsertParallelism(bulk_insert_parallelism = 1500)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bulkinsert.shuffle.parallelism</code><br />
-<span style="color:grey">Bulk insert is meant to be used for large initial imports and this parallelism determines the initial number of files in your dataset. Tune this to achieve a desired optimal size during initial import.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bulkinsert.shuffle.parallelism</code><br />
+<span style="color:grey">批量插入旨在用于较大的初始导入,而此处的并行度决定了数据集中文件的初始数量。
+调整此值以达到在初始导入期间所需的最佳尺寸。</span></p>
 
 <h5 id="withParallelism">withParallelism(insert_shuffle_parallelism = 1500, upsert_shuffle_parallelism = 1500)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.insert.shuffle.parallelism</code>, <code class="highlighter-rouge">hoodie.upsert.shuffle.parallelism</code><br />
-<span style="color:grey">Once data has been initially imported, this parallelism controls initial parallelism for reading input records. Ensure this value is high enough say: 1 partition for 1 GB of input data</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.insert.shuffle.parallelism</code>, <code class="highlighter-rouge">hoodie.upsert.shuffle.parallelism</code><br />
+<span style="color:grey">最初导入数据后,此并行度将控制用于读取输入记录的初始并行度。
+确保此值足够高,例如:1个分区用于1 GB的输入数据</span></p>
 
 <h5 id="combineInput">combineInput(on_insert = false, on_update=true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.combine.before.insert</code>, <code class="highlighter-rouge">hoodie.combine.before.upsert</code><br />
-<span style="color:grey">Flag which first combines the input RDD and merges multiple partial records into a single record before inserting or updating in DFS</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.combine.before.insert</code>, <code class="highlighter-rouge">hoodie.combine.before.upsert</code><br />
+<span style="color:grey">在DFS中插入或更新之前先组合输入RDD并将多个部分记录合并为单个记录的标志</span></p>
 
 <h5 id="withWriteStatusStorageLevel">withWriteStatusStorageLevel(level = MEMORY_AND_DISK_SER)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.write.status.storage.level</code><br />
-<span style="color:grey">HoodieWriteClient.insert and HoodieWriteClient.upsert returns a persisted RDD[WriteStatus], this is because the Client can choose to inspect the WriteStatus and choose and commit or not based on the failures. This is a configuration for the storage level for this RDD </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.write.status.storage.level</code><br />
+<span style="color:grey">HoodieWriteClient.insert和HoodieWriteClient.upsert返回一个持久的RDD[WriteStatus],
+这是因为客户端可以选择检查WriteStatus并根据失败选择是否提交。这是此RDD的存储级别的配置</span></p>
 
 <h5 id="withAutoCommit">withAutoCommit(autoCommit = true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.auto.commit</code><br />
-<span style="color:grey">Should HoodieWriteClient autoCommit after insert and upsert. The client can choose to turn off auto-commit and commit on a “defined success condition”</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.auto.commit</code><br />
+<span style="color:grey">插入和插入更新后,HoodieWriteClient是否应该自动提交。
+客户端可以选择关闭自动提交,并在“定义的成功条件”下提交</span></p>
 
 <h5 id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning = false)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
-<span style="color:grey">Should HoodieWriteClient assume the data is partitioned by dates, i.e three levels from base path. This is a stop-gap to support tables created by versions &lt; 0.3.1. Will be removed eventually </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
+<span style="color:grey">HoodieWriteClient是否应该假设数据按日期划分,即从基本路径划分为三个级别。
+这是支持&lt;0.3.1版本创建的表的一个补丁。最终将被删除</span></p>
 
 <h5 id="withConsistencyCheckEnabled">withConsistencyCheckEnabled(enabled = false)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.consistency.check.enabled</code><br />
-<span style="color:grey">Should HoodieWriteClient perform additional checks to ensure written files’ are listable on the underlying filesystem/storage. Set this to true, to workaround S3’s eventual consistency model and ensure all data written as a part of a commit is faithfully available for queries. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.consistency.check.enabled</code><br />
+<span style="color:grey">HoodieWriteClient是否应该执行其他检查,以确保写入的文件在基础文件系统/存储上可列出。
+将其设置为true可以解决S3的最终一致性模型,并确保作为提交的一部分写入的所有数据均能准确地用于查询。</span></p>
 
-<h4 id="index-configs">Index configs</h4>
-<p>Following configs control indexing behavior, which tags incoming records as either inserts or updates to older records.</p>
+<h4 id="索引配置">索引配置</h4>
+<p>以下配置控制索引行为,该行为将传入记录标记为对较旧记录的插入或更新。</p>
 
 <p><a href="#withIndexConfig">withIndexConfig</a> (HoodieIndexConfig) <br />
-<span style="color:grey">This is pluggable to have a external index (HBase) or use the default bloom filter stored in the Parquet files</span></p>
+<span style="color:grey">可插入以具有外部索引(HBase)或使用存储在Parquet文件中的默认布隆过滤器(bloom filter)</span></p>
 
 <h5 id="withIndexType">withIndexType(indexType = BLOOM)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.type</code> <br />
-<span style="color:grey">Type of index to use. Default is Bloom filter. Possible options are [BLOOM | HBASE | INMEMORY]. Bloom filters removes the dependency on a external system and is stored in the footer of the Parquet Data Files</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.index.type</code> <br />
+<span style="color:grey">要使用的索引类型。默认为布隆过滤器。可能的选项是[BLOOM | HBASE | INMEMORY]。
+布隆过滤器消除了对外部系统的依赖,并存储在Parquet数据文件的页脚中</span></p>
 
 <h5 id="bloomFilterNumEntries">bloomFilterNumEntries(numEntries = 60000)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.bloom.num_entries</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br />This is the number of entries to be stored in the bloom filter. We assume the maxParquetFileSize is 128MB and averageRecordSize is 1024B and hence we approx a total of 130K records in a file. The default (60000) is roughly half of this approximation. <a href="https://issues.apache.org/jira/browse/HUDI-56">HUDI-56</a> tracks computing this dynamically. Warning: Setting this very low, will generate a lot of false positives [...]
+<p>属性:<code class="highlighter-rouge">hoodie.index.bloom.num_entries</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />这是要存储在布隆过滤器中的条目数。
+我们假设maxParquetFileSize为128MB,averageRecordSize为1024B,因此,一个文件中的记录总数约为130K。
+默认值(60000)大约是此近似值的一半。<a href="https://issues.apache.org/jira/browse/HUDI-56">HUDI-56</a>
+正在追踪动态计算此值的工作。
+警告:如果将此值设置得太低,将产生很多误报,索引查找将不得不扫描比实际所需更多的文件;如果设置得非常高,则会线性增加每个数据文件的大小(每50000个条目大约4KB)。</span></p>
 
 <h5 id="bloomFilterFPP">bloomFilterFPP(fpp = 0.000000001)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.bloom.fpp</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br /> Error rate allowed given the number of entries. This is used to calculate how many bits should be assigned for the bloom filter and the number of hash functions. This is usually set very low (default: 0.000000001), we like to tradeoff disk space for lower false positives</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.index.bloom.fpp</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />根据条目数允许的错误率。
+这用于计算应为布隆过滤器分配多少位以及哈希函数的数量。通常将此值设置得很低(默认值:0.000000001),我们希望在磁盘空间上进行权衡以降低误报率</span></p>
 
 <h5 id="bloomIndexPruneByRanges">bloomIndexPruneByRanges(pruneRanges = true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.prune.by.ranges</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, range information from files to leveraged speed up index lookups. Particularly helpful, if the key has a monotonously increasing prefix, such as timestamp.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bloom.index.prune.by.ranges</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />为true时,从文件框定信息,可以加快索引查找的速度。 如果键具有单调递增的前缀,例如时间戳,则特别有用。</span></p>
 
 <h5 id="bloomIndexUseCaching">bloomIndexUseCaching(useCaching = true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.use.caching</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, the input RDD will cached to speed up index lookup by reducing IO for computing parallelism or affected partitions</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bloom.index.use.caching</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />为true时,将通过减少用于计算并行度或受影响分区的IO来缓存输入的RDD以加快索引查找</span></p>
 
 <h5 id="bloomIndexTreebasedFilter">bloomIndexTreebasedFilter(useTreeFilter = true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.use.treebased.filter</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, interval tree based file pruning optimization is enabled. This mode speeds-up file-pruning based on key ranges when compared with the brute-force mode</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bloom.index.use.treebased.filter</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />为true时,启用基于间隔树的文件过滤优化。与暴力模式相比,此模式可根据键范围加快文件过滤速度</span></p>
 
 <h5 id="bloomIndexBucketizedChecking">bloomIndexBucketizedChecking(bucketizedChecking = true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.bucketized.checking</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, bucketized bloom filtering is enabled. This reduces skew seen in sort based bloom index lookup</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bloom.index.bucketized.checking</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />为true时,启用了桶式布隆过滤。这减少了在基于排序的布隆索引查找中看到的偏差</span></p>
 
 <h5 id="bloomIndexKeysPerBucket">bloomIndexKeysPerBucket(keysPerBucket = 10000000)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.keys.per.bucket</code> <br />
-<span style="color:grey">Only applies if bloomIndexBucketizedChecking is enabled and index type is bloom. <br /> This configuration controls the “bucket” size which tracks the number of record-key checks made against a single file and is the unit of work allocated to each partition performing bloom filter lookup. A higher value would amortize the fixed cost of reading a bloom filter to memory. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bloom.index.keys.per.bucket</code> <br />
+<span style="color:grey">仅在启用bloomIndexBucketizedChecking并且索引类型为bloom的情况下适用。<br />
+此配置控制“存储桶”的大小,该大小可跟踪对单个文件进行的记录键检查的次数,并且是分配给执行布隆过滤器查找的每个分区的工作单位。
+较高的值将分摊将布隆过滤器读取到内存的固定成本。</span></p>
 
 <h5 id="bloomIndexParallelism">bloomIndexParallelism(0)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.parallelism</code> <br />
-<span style="color:grey">Only applies if index type is BLOOM. <br /> This is the amount of parallelism for index lookup, which involves a Spark Shuffle. By default, this is auto computed based on input workload characteristics</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.bloom.index.parallelism</code> <br />
+<span style="color:grey">仅在索引类型为BLOOM时适用。<br />这是索引查找的并行度,其中涉及Spark Shuffle。 默认情况下,这是根据输入的工作负载特征自动计算的</span></p>
 
-<h5 id="hbaseZkQuorum">hbaseZkQuorum(zkString) [Required]</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zkquorum</code> <br />
-<span style="color:grey">Only applies if index type is HBASE. HBase ZK Quorum url to connect to.</span></p>
+<h5 id="hbaseZkQuorum">hbaseZkQuorum(zkString) [必须]</h5>
+<p>属性:<code class="highlighter-rouge">hoodie.index.hbase.zkquorum</code> <br />
+<span style="color:grey">仅在索引类型为HBASE时适用。要连接的HBase ZK Quorum URL。</span></p>
 
-<h5 id="hbaseZkPort">hbaseZkPort(port) [Required]</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zkport</code> <br />
-<span style="color:grey">Only applies if index type is HBASE. HBase ZK Quorum port to connect to.</span></p>
+<h5 id="hbaseZkPort">hbaseZkPort(port) [必须]</h5>
+<p>属性:<code class="highlighter-rouge">hoodie.index.hbase.zkport</code> <br />
+<span style="color:grey">仅在索引类型为HBASE时适用。要连接的HBase ZK Quorum端口。</span></p>
 
-<h5 id="hbaseTableName">hbaseZkZnodeParent(zkZnodeParent)  [Required]</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zknode.path</code> <br />
-<span style="color:grey">Only applies if index type is HBASE. This is the root znode that will contain all the znodes created/used by HBase.</span></p>
+<h5 id="hbaseTableName">hbaseZkZnodeParent(zkZnodeParent)  [必须]</h5>
+<p>属性:<code class="highlighter-rouge">hoodie.index.hbase.zknode.path</code> <br />
+<span style="color:grey">仅在索引类型为HBASE时适用。这是根znode,它将包含HBase创建及使用的所有znode。</span></p>
 
-<h5 id="hbaseTableName">hbaseTableName(tableName)  [Required]</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.table</code> <br />
-<span style="color:grey">Only applies if index type is HBASE. HBase Table name to use as the index. Hudi stores the row_key and [partition_path, fileID, commitTime] mapping in the table.</span></p>
+<h5 id="hbaseTableName">hbaseTableName(tableName)  [必须]</h5>
+<p>属性:<code class="highlighter-rouge">hoodie.index.hbase.table</code> <br />
+<span style="color:grey">仅在索引类型为HBASE时适用。HBase表名称,用作索引。Hudi将row_key和[partition_path, fileID, commitTime]映射存储在表中。</span></p>
 
-<h4 id="storage-configs">Storage configs</h4>
-<p>Controls aspects around sizing parquet and log files.</p>
+<h4 id="存储选项">存储选项</h4>
+<p>控制有关调整parquet和日志文件大小的方面。</p>
 
 <p><a href="#withStorageConfig">withStorageConfig</a> (HoodieStorageConfig) <br /></p>
 
 <h5 id="limitFileSize">limitFileSize (size = 120MB)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.max.file.size</code> <br />
-<span style="color:grey">Target size for parquet files produced by Hudi write phases. For DFS, this needs to be aligned with the underlying filesystem block size for optimal performance. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.max.file.size</code> <br />
+<span style="color:grey">Hudi写阶段生成的parquet文件的目标大小。对于DFS,这需要与基础文件系统块大小保持一致,以实现最佳性能。</span></p>
 
 <h5 id="parquetBlockSize">parquetBlockSize(rowgroupsize = 120MB)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.block.size</code> <br />
-<span style="color:grey">Parquet RowGroup size. Its better this is same as the file size, so that a single column within a file is stored continuously on disk</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.block.size</code> <br />
+<span style="color:grey">Parquet行组大小。最好与文件大小相同,以便将文件中的单个列连续存储在磁盘上</span></p>
 
 <h5 id="parquetPageSize">parquetPageSize(pagesize = 1MB)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.page.size</code> <br />
-<span style="color:grey">Parquet page size. Page is the unit of read within a parquet file. Within a block, pages are compressed seperately. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.page.size</code> <br />
+<span style="color:grey">Parquet页面大小。页面是parquet文件中的读取单位。 在一个块内,页面被分别压缩。</span></p>
 
 <h5 id="parquetCompressionRatio">parquetCompressionRatio(parquetCompressionRatio = 0.1)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.ratio</code> <br />
-<span style="color:grey">Expected compression of parquet data used by Hudi, when it tries to size new parquet files. Increase this value, if bulk_insert is producing smaller than expected sized files</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.compression.ratio</code> <br />
+<span style="color:grey">当Hudi尝试调整新parquet文件的大小时,预期对parquet数据进行压缩的比例。
+如果bulk_insert生成的文件小于预期大小,请增加此值</span></p>
 
 <h5 id="parquetCompressionCodec">parquetCompressionCodec(parquetCompressionCodec = gzip)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.codec</code> <br />
-<span style="color:grey">Parquet compression codec name. Default is gzip. Possible options are [gzip | snappy | uncompressed | lzo]</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.compression.codec</code> <br />
+<span style="color:grey">Parquet压缩编解码方式名称。默认值为gzip。可能的选项是[gzip | snappy | uncompressed | lzo]</span></p>
 
 <h5 id="logFileMaxSize">logFileMaxSize(logFileSize = 1GB)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.logfile.max.size</code> <br />
-<span style="color:grey">LogFile max size. This is the maximum size allowed for a log file before it is rolled over to the next version. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.logfile.max.size</code> <br />
+<span style="color:grey">LogFile的最大大小。这是在将日志文件移到下一个版本之前允许的最大大小。</span></p>
 
 <h5 id="logFileDataBlockMaxSize">logFileDataBlockMaxSize(dataBlockSize = 256MB)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.logfile.data.block.max.size</code> <br />
-<span style="color:grey">LogFile Data block max size. This is the maximum size allowed for a single data block to be appended to a log file. This helps to make sure the data appended to the log file is broken up into sizable blocks to prevent from OOM errors. This size should be greater than the JVM memory. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.logfile.data.block.max.size</code> <br />
+<span style="color:grey">LogFile数据块的最大大小。这是允许将单个数据块附加到日志文件的最大大小。
+这有助于确保附加到日志文件的数据被分解为大小合适的块,以防止发生OOM错误。此大小应大于JVM内存。</span></p>
 
 <h5 id="logFileToParquetCompressionRatio">logFileToParquetCompressionRatio(logFileToParquetCompressionRatio = 0.35)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.logfile.to.parquet.compression.ratio</code> <br />
-<span style="color:grey">Expected additional compression as records move from log files to parquet. Used for merge_on_read storage to send inserts into log files &amp; control the size of compacted parquet file.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.logfile.to.parquet.compression.ratio</code> <br />
+<span style="color:grey">随着记录从日志文件移动到parquet,预期会进行额外压缩的比例。
+用于merge_on_read存储,以将插入内容发送到日志文件中并控制压缩parquet文件的大小。</span></p>
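+
+<p>For reference, the same storage knobs can also be set programmatically through the builder methods listed above. The sketch below only illustrates the API shape: the package and class names (<code class="highlighter-rouge">org.apache.hudi.config.HoodieWriteConfig</code>), the <code class="highlighter-rouge">newBuilder()</code>/<code class="highlighter-rouge">build()</code> calls, <code class="highlighter-rouge">withPath</code> and the literal sizes are assumptions for the example, not values taken from this page.</p>
+
+<pre><code class="language-scala">import org.apache.hudi.config.{HoodieStorageConfig, HoodieWriteConfig}
+
+// Hypothetical sketch; builder method names follow the headings above.
+val writeConfig = HoodieWriteConfig.newBuilder()
+  .withPath("/path/to/hudi_tbl")
+  .withStorageConfig(HoodieStorageConfig.newBuilder()
+    .limitFileSize(120 * 1024 * 1024L)    // hoodie.parquet.max.file.size
+    .parquetBlockSize(120 * 1024 * 1024)  // hoodie.parquet.block.size
+    .parquetPageSize(1024 * 1024)         // hoodie.parquet.page.size
+    .build())
+  .build()
+</code></pre>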
 
 <h5 id="parquetCompressionCodec">parquetCompressionCodec(parquetCompressionCodec = gzip)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.codec</code> <br />
-<span style="color:grey">Compression Codec for parquet files </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.compression.codec</code> <br />
+<span style="color:grey">Parquet文件的压缩编解码方式</span></p>
 
-<h4 id="compaction-configs">Compaction configs</h4>
-<p>Configs that control compaction (merging of log files onto a new parquet base file), cleaning (reclamation of older/unused file groups).
+<h4 id="压缩compaction配置">压缩(Compaction)配置</h4>
+<p>压缩配置用于控制压缩(将日志文件合并到新的parquet基本文件中)、清理(回收较旧及未使用的文件组)。
 <a href="#withCompactionConfig">withCompactionConfig</a> (HoodieCompactionConfig) <br /></p>
 
 <h5 id="withCleanerPolicy">withCleanerPolicy(policy = KEEP_LATEST_COMMITS)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.cleaner.policy</code> <br />
-<span style="color:grey"> Cleaning policy to be used. Hudi will delete older versions of parquet files to re-claim space. Any Query/Computation referring to this version of the file will fail. It is good to make sure that the data is retained for more than the maximum query execution time.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.cleaner.policy</code> <br />
+<span style="color:grey">要使用的清理政策。Hudi将删除旧版本的parquet文件以回收空间。
+任何引用此版本文件的查询和计算都将失败。最好确保数据保留的时间超过最大查询执行时间。</span></p>
 
 <h5 id="retainCommits">retainCommits(no_of_commits_to_retain = 24)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.cleaner.commits.retained</code> <br />
-<span style="color:grey">Number of commits to retain. So data will be retained for num_of_commits * time_between_commits (scheduled). This also directly translates into how much you can incrementally pull on this dataset</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.cleaner.commits.retained</code> <br />
+<span style="color:grey">保留的提交数。因此,数据将保留为num_of_commits * time_between_commits(计划的)。
+这也直接转化为您可以逐步提取此数据集的数量</span></p>
 
 <h5 id="archiveCommitsWith">archiveCommitsWith(minCommits = 96, maxCommits = 128)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.keep.min.commits</code>, <code class="highlighter-rouge">hoodie.keep.max.commits</code> <br />
-<span style="color:grey">Each commit is a small file in the <code class="highlighter-rouge">.hoodie</code> directory. Since DFS typically does not favor lots of small files, Hudi archives older commits into a sequential log. A commit is published atomically by a rename of the commit file.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.keep.min.commits</code>, <code class="highlighter-rouge">hoodie.keep.max.commits</code> <br />
+<span style="color:grey">每个提交都是<code class="highlighter-rouge">.hoodie</code>目录中的一个小文件。由于DFS通常不支持大量小文件,因此Hudi将较早的提交归档到顺序日志中。
+提交通过重命名提交文件以原子方式发布。</span></p>
 
 <h5 id="withCommitsArchivalBatchSize">withCommitsArchivalBatchSize(batch = 10)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.commits.archival.batch</code> <br />
-<span style="color:grey">This controls the number of commit instants read in memory as a batch and archived together.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.commits.archival.batch</code> <br />
+<span style="color:grey">这控制着批量读取并一起归档的提交即时的数量。</span></p>
 
 <h5 id="compactionSmallFileSize">compactionSmallFileSize(size = 0)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.parquet.small.file.limit</code> <br />
-<span style="color:grey">This should be less &lt; maxFileSize and setting it to 0, turns off this feature. Small files can always happen because of the number of insert records in a partition in a batch. Hudi has an option to auto-resolve small files by masking inserts into this partition as updates to existing small files. The size here is the minimum file size considered as a “small file size”.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.small.file.limit</code> <br />
+<span style="color:grey">该值应小于maxFileSize,如果将其设置为0,会关闭此功能。
+由于一个批次中写入各分区的插入记录数不定,总会产生小文件。
+Hudi提供了一个选项,可以将发往该分区的插入当作对现有小文件的更新来处理,从而自动解决小文件的问题。
+此处的大小是被视为“小文件大小”的最小文件大小。</span></p>
 
 <h5 id="insertSplitSize">insertSplitSize(size = 500000)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.insert.split.size</code> <br />
-<span style="color:grey">Insert Write Parallelism. Number of inserts grouped for a single partition. Writing out 100MB files, with atleast 1kb records, means 100K records per file. Default is to overprovision to 500K. To improve insert latency, tune this to match the number of records in a single file. Setting this to a low number, will result in small files (particularly when compactionSmallFileSize is 0)</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.copyonwrite.insert.split.size</code> <br />
+<span style="color:grey">插入写入并行度。为单个分区的总共插入次数。
+写出100MB的文件,至少1kb大小的记录,意味着每个文件有100K记录。默认值是超额配置为500K。
+为了改善插入延迟,请对其进行调整以匹配单个文件中的记录数。
+将此值设置为较小的值将导致文件变小(尤其是当compactionSmallFileSize为0时)</span></p>
 
 <h5 id="autoTuneInsertSplits">autoTuneInsertSplits(true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.insert.auto.split</code> <br />
-<span style="color:grey">Should hudi dynamically compute the insertSplitSize based on the last 24 commit’s metadata. Turned off by default. </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.copyonwrite.insert.auto.split</code> <br />
+<span style="color:grey">Hudi是否应该基于最后24个提交的元数据动态计算insertSplitSize。默认关闭。</span></p>
 
 <h5 id="approxRecordSize">approxRecordSize(size = 1024)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.record.size.estimate</code> <br />
-<span style="color:grey">The average record size. If specified, hudi will use this and not compute dynamically based on the last 24 commit’s metadata. No value set as default. This is critical in computing the insert parallelism and bin-packing inserts into small files. See above.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.copyonwrite.record.size.estimate</code> <br />
+<span style="color:grey">平均记录大小。如果指定,hudi将使用它,并且不会基于最后24个提交的元数据动态地计算。
+没有默认值设置。这对于计算插入并行度以及将插入打包到小文件中至关重要。如上所述。</span></p>
 
 <h5 id="withInlineCompaction">withInlineCompaction(inlineCompaction = false)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compact.inline</code> <br />
-<span style="color:grey">When set to true, compaction is triggered by the ingestion itself, right after a commit/deltacommit action as part of insert/upsert/bulk_insert</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compact.inline</code> <br />
+<span style="color:grey">当设置为true时,紧接在插入或插入更新或批量插入的提交或增量提交操作之后由摄取本身触发压缩</span></p>
 
 <h5 id="withMaxNumDeltaCommitsBeforeCompaction">withMaxNumDeltaCommitsBeforeCompaction(maxNumDeltaCommitsBeforeCompaction = 10)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compact.inline.max.delta.commits</code> <br />
-<span style="color:grey">Number of max delta commits to keep before triggering an inline compaction</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compact.inline.max.delta.commits</code> <br />
+<span style="color:grey">触发内联压缩之前要保留的最大增量提交数</span></p>
 
 <h5 id="withCompactionLazyBlockReadEnabled">withCompactionLazyBlockReadEnabled(true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compaction.lazy.block.read</code> <br />
-<span style="color:grey">When a CompactedLogScanner merges all log files, this config helps to choose whether the logblocks should be read lazily or not. Choose true to use I/O intensive lazy block reading (low memory usage) or false for Memory intensive immediate block read (high memory usage)</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compaction.lazy.block.read</code> <br />
+<span style="color:grey">当CompactedLogScanner合并所有日志文件时,此配置有助于选择是否应延迟读取日志块。
+选择true以使用I/O密集型延迟块读取(低内存使用),或者为false来使用内存密集型立即块读取(高内存使用)</span></p>
 
 <h5 id="withCompactionReverseLogReadEnabled">withCompactionReverseLogReadEnabled(false)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compaction.reverse.log.read</code> <br />
-<span style="color:grey">HoodieLogFormatReader reads a logfile in the forward direction starting from pos=0 to pos=file_length. If this config is set to true, the Reader reads the logfile in reverse direction, from pos=file_length to pos=0</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compaction.reverse.log.read</code> <br />
+<span style="color:grey">HoodieLogFormatReader会从pos=0到pos=file_length向前读取日志文件。
+如果此配置设置为true,则Reader会从pos=file_length到pos=0反向读取日志文件</span></p>
 
 <h5 id="withCleanerParallelism">withCleanerParallelism(cleanerParallelism = 200)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.cleaner.parallelism</code> <br />
-<span style="color:grey">Increase this if cleaning becomes slow.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.cleaner.parallelism</code> <br />
+<span style="color:grey">如果清理变慢,请增加此值。</span></p>
 
 <h5 id="withCompactionStrategy">withCompactionStrategy(compactionStrategy = org.apache.hudi.io.compact.strategy.LogFileSizeBasedCompactionStrategy)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compaction.strategy</code> <br />
-<span style="color:grey">Compaction strategy decides which file groups are picked up for compaction during each compaction run. By default. Hudi picks the log file with most accumulated unmerged data</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compaction.strategy</code> <br />
+<span style="color:grey">用来决定在每次压缩运行期间选择要压缩的文件组的压缩策略。
+默认情况下,Hudi选择具有累积最多未合并数据的日志文件</span></p>
 
 <h5 id="withTargetIOPerCompactionInMB">withTargetIOPerCompactionInMB(targetIOPerCompactionInMB = 500000)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compaction.target.io</code> <br />
-<span style="color:grey">Amount of MBs to spend during compaction run for the LogFileSizeBasedCompactionStrategy. This value helps bound ingestion latency while compaction is run inline mode.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compaction.target.io</code> <br />
+<span style="color:grey">LogFileSizeBasedCompactionStrategy的压缩运行期间要花费的MB量。当压缩以内联模式运行时,此值有助于限制摄取延迟。</span></p>
 
 <h5 id="withTargetPartitionsPerDayBasedCompaction">withTargetPartitionsPerDayBasedCompaction(targetPartitionsPerCompaction = 10)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compaction.daybased.target</code> <br />
-<span style="color:grey">Used by org.apache.hudi.io.compact.strategy.DayBasedCompactionStrategy to denote the number of latest partitions to compact during a compaction run.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compaction.daybased.target</code> <br />
+<span style="color:grey">由org.apache.hudi.io.compact.strategy.DayBasedCompactionStrategy使用,表示在压缩运行期间要压缩的最新分区数。</span></p>
 
 <h5 id="payloadClassName">withPayloadClass(payloadClassName = org.apache.hudi.common.model.HoodieAvroPayload)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.compaction.payload.class</code> <br />
-<span style="color:grey">This needs to be same as class used during insert/upserts. Just like writing, compaction also uses the record payload class to merge records in the log against each other, merge again with the base file and produce the final record to be written after compaction.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.compaction.payload.class</code> <br />
+<span style="color:grey">这需要与插入/插入更新过程中使用的类相同。
+就像写入一样,压缩也使用记录有效负载类将日志中的记录彼此合并,再次与基本文件合并,并生成压缩后要写入的最终记录。</span></p>
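+
+<p>As a concrete illustration, a few of the compaction and cleaning properties above can be passed straight through the Spark datasource writer. This is only a sketch: the chosen values, the input path and the omitted mandatory write options are assumptions for the example.</p>
+
+<pre><code class="language-scala">// spark-shell style sketch; `spark` is the shell-provided SparkSession.
+val df = spark.read.format("json").load("/path/to/incoming")  // any incoming batch
+
+df.write.format("org.apache.hudi")
+  .option("hoodie.compact.inline", "true")                  // withInlineCompaction
+  .option("hoodie.compact.inline.max.delta.commits", "10")  // withMaxNumDeltaCommitsBeforeCompaction
+  .option("hoodie.cleaner.commits.retained", "24")          // retainCommits
+  .option("hoodie.parquet.small.file.limit", "104857600")   // compactionSmallFileSize (100MB)
+  // plus the usual record key / precombine / table name write options
+  .mode("append")
+  .save("/path/to/hudi_mor_tbl")
+</code></pre>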
 
-<h4 id="metrics-configs">Metrics configs</h4>
-<p>Enables reporting of Hudi metrics to graphite.
+<h4 id="指标配置">指标配置</h4>
+<p>能够将Hudi指标报告给graphite。
 <a href="#withMetricsConfig">withMetricsConfig</a> (HoodieMetricsConfig) <br />
-<span style="color:grey">Hudi publishes metrics on every commit, clean, rollback etc.</span></p>
+<span style="color:grey">Hudi会发布有关每次提交、清理、回滚等的指标。</span></p>
 
 <h5 id="on">on(metricsOn = true)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.metrics.on</code> <br />
-<span style="color:grey">Turn sending metrics on/off. on by default.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.metrics.on</code> <br />
+<span style="color:grey">打开或关闭发送指标。默认情况下处于启用状态。</span></p>
 
 <h5 id="withReporterType">withReporterType(reporterType = GRAPHITE)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.metrics.reporter.type</code> <br />
-<span style="color:grey">Type of metrics reporter. Graphite is the default and the only value suppported.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.metrics.reporter.type</code> <br />
+<span style="color:grey">指标报告者的类型。默认使用graphite,也是唯一支持的类型。</span></p>
 
 <h5 id="toGraphiteHost">toGraphiteHost(host = localhost)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.host</code> <br />
-<span style="color:grey">Graphite host to connect to</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.metrics.graphite.host</code> <br />
+<span style="color:grey">要连接的graphite主机</span></p>
 
 <h5 id="onGraphitePort">onGraphitePort(port = 4756)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.port</code> <br />
-<span style="color:grey">Graphite port to connect to</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.metrics.graphite.port</code> <br />
+<span style="color:grey">要连接的graphite端口</span></p>
 
 <h5 id="usePrefix">usePrefix(prefix = “”)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.metric.prefix</code> <br />
-<span style="color:grey">Standard prefix applied to all metrics. This helps to add datacenter, environment information for e.g</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.metrics.graphite.metric.prefix</code> <br />
+<span style="color:grey">适用于所有指标的标准前缀。这有助于添加如数据中心、环境等信息</span></p>
 
-<h4 id="memory-configs">Memory configs</h4>
-<p>Controls memory usage for compaction and merges, performed internally by Hudi
+<h4 id="内存配置">内存配置</h4>
+<p>控制由Hudi内部执行的压缩和合并的内存使用情况
 <a href="#withMemoryConfig">withMemoryConfig</a> (HoodieMemoryConfig) <br />
-<span style="color:grey">Memory related configs</span></p>
+<span style="color:grey">内存相关配置</span></p>
 
 <h5 id="withMaxMemoryFractionPerPartitionMerge">withMaxMemoryFractionPerPartitionMerge(maxMemoryFractionPerPartitionMerge = 0.6)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.memory.merge.fraction</code> <br />
-<span style="color:grey">This fraction is multiplied with the user memory fraction (1 - spark.memory.fraction) to get a final fraction of heap space to use during merge </span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.memory.merge.fraction</code> <br />
+<span style="color:grey">该比例乘以用户内存比例(1-spark.memory.fraction)以获得合并期间要使用的堆空间的最终比例</span></p>
 
 <h5 id="withMaxMemorySizePerCompactionInBytes">withMaxMemorySizePerCompactionInBytes(maxMemorySizePerCompactionInBytes = 1GB)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.memory.compaction.fraction</code> <br />
-<span style="color:grey">HoodieCompactedLogScanner reads logblocks, converts records to HoodieRecords and then merges these log blocks and records. At any point, the number of entries in a log block can be less than or equal to the number of entries in the corresponding parquet file. This can lead to OOM in the Scanner. Hence, a spillable map helps alleviate the memory pressure. Use this config to set the max allowable inMemory footprint of the spillable map.</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.memory.compaction.fraction</code> <br />
+<span style="color:grey">HoodieCompactedLogScanner读取日志块,将记录转换为HoodieRecords,然后合并这些日志块和记录。
+在任何时候,日志块中的条目数可以小于或等于相应的parquet文件中的条目数。这可能导致Scanner出现OOM。
+因此,可溢出的映射有助于减轻内存压力。使用此配置来设置可溢出映射的最大允许inMemory占用空间。</span></p>
 
 <h5 id="withWriteStatusFailureFraction">withWriteStatusFailureFraction(failureFraction = 0.1)</h5>
-<p>Property: <code class="highlighter-rouge">hoodie.memory.writestatus.failure.fraction</code> <br />
-<span style="color:grey">This property controls what fraction of the failed record, exceptions we report back to driver</span></p>
+<p>属性:<code class="highlighter-rouge">hoodie.memory.writestatus.failure.fraction</code> <br />
+<span style="color:grey">此属性控制报告给驱动程序的失败记录和异常的比例</span></p>
 
 
     <div class="tags">
diff --git a/content/cn/contributing.html b/content/cn/contributing.html
index be92763..a56da22 100644
--- a/content/cn/contributing.html
+++ b/content/cn/contributing.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/docker_demo.html b/content/cn/docker_demo.html
index e135222..da78da6 100644
--- a/content/cn/docker_demo.html
+++ b/content/cn/docker_demo.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -353,7 +353,7 @@ data infrastructure is brought up in a local docker cluster within your computer
   <li>/etc/hosts : The demo references many services running in containers by their hostnames. Add the following settings to /etc/hosts</li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>   127.0.0.1 adhoc-1
+<pre><code class="language-Java">   127.0.0.1 adhoc-1
    127.0.0.1 adhoc-2
    127.0.0.1 namenode
    127.0.0.1 datanode1
@@ -362,7 +362,7 @@ data infrastructure is brought up in a local docker cluster within your computer
    127.0.0.1 kafkabroker
    127.0.0.1 sparkmaster
    127.0.0.1 zookeeper
-</code></pre></div></div>
+</code></pre>
 
 <p>Also, this has not been tested on some environments like Docker on Windows.</p>
 
@@ -371,16 +371,16 @@ data infrastructure is brought up in a local docker cluster within your computer
 <h4 id="build-hudi">Build Hudi</h4>
 
 <p>The first step is to build hudi</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd &lt;HUDI_WORKSPACE&gt;
+<pre><code class="language-Java">cd &lt;HUDI_WORKSPACE&gt;
 mvn package -DskipTests
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="bringing-up-demo-cluster">Bringing up Demo Cluster</h4>
 
 <p>The next step is to run the docker compose script and setup configs for bringing up the cluster.
 This should pull the docker images from docker hub and setup docker cluster.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd docker
+<pre><code class="language-Java">cd docker
 ./setup_demo.sh
 ....
 ....
@@ -408,7 +408,7 @@ Copying spark default config and setting up configs
 Copying spark default config and setting up configs
 Copying spark default config and setting up configs
 varadarb-C02SG7Q3G8WP:docker varadarb$ docker ps
-</code></pre></div></div>
+</code></pre>
 
 <p>At this point, the docker cluster will be up and running. The demo cluster brings up the following services</p>
 
@@ -434,7 +434,7 @@ The batches are windowed intentionally so that the second batch contains updates
 
 <p>Upload the first batch to Kafka topic ‘stock_ticks’</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P
+<pre><code class="language-Java">cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P
 
 To check if the new topic shows up, use
 kafkacat -b kafkabroker -L -J | jq .
@@ -475,7 +475,7 @@ kafkacat -b kafkabroker -L -J | jq .
   ]
 }
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-2-incrementally-ingest-data-from-kafka-topic">Step 2: Incrementally ingest data from Kafka topic</h4>
 
@@ -484,7 +484,7 @@ pull changes and apply to Hudi dataset using upsert/insert primitives. Here, we
 json data from kafka topic and ingest to both COW and MOR tables we initialized in the previous step. This tool
 automatically initializes the datasets in the file-system if they do not exist yet.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 
 # Run the following spark-submit command to execute the delta-streamer and ingest to stock_ticks_cow dataset in HDFS
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type COPY_ON_WRITE --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_cow --target-table stock_ticks_cow --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
@@ -506,7 +506,7 @@ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
 # contain mostly Kafka connectivity settings, the avro-schema to be used for ingesting along with key and partitioning fields.
 
 exit
-</code></pre></div></div>
+</code></pre>
 
 <p>You can use HDFS web-browser to look at the datasets
 <code class="highlighter-rouge">http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_cow</code>.</p>
@@ -522,7 +522,7 @@ file under .hoodie which signals a successful commit.</p>
 <p>At this step, the datasets are available in HDFS. We need to sync with Hive to create new Hive tables and add partitions
 in order to run Hive queries against those datasets.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 
 # This command takes in the HiveServer URL and the COW Hudi dataset location in HDFS and syncs the HDFS state to Hive
 /var/hoodie/ws/hudi-hive/run_sync_tool.sh  --jdbc-url jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by dt --base-path /user/hive/warehouse/stock_ticks_cow --database default --table stock_ticks_cow
@@ -538,7 +538,7 @@ inorder to run Hive queries against those datasets.</p>
 2018-09-24 22:23:09,559 INFO  [main] hive.HiveSyncTool (HiveSyncTool.java:syncHoodieTable(112)) - Sync complete for stock_ticks_mor_rt
 ....
 exit
-</code></pre></div></div>
+</code></pre>
 <p>After executing the above command, you will notice</p>
 
 <ol>
@@ -553,7 +553,7 @@ provides the ReadOptimized view for the Hudi dataset and the later provides the
 (for both COW and MOR datasets) and realtime views (for MOR dataset) give the same value “10:29 a.m” as Hudi creates a
 parquet file for the first batch of data.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 # List Tables
 0: jdbc:hive2://hiveserver:10000&gt; show tables;
@@ -649,13 +649,13 @@ WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the futu
 
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-4-b-run-spark-sql-queries">Step 4 (b): Run Spark-SQL Queries</h4>
 <p>Hudi supports Spark as a query processor, just like Hive. Here are the same Hive queries
 running in spark-sql</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --master local[2] --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 ...
 
@@ -746,14 +746,14 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 |20180924222155     |GOOG  |2018-08-31 10:29:00|3391  |1230.1899|1230.085|
 +-------------------+------+-------------------+------+---------+--------+
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-5-upload-second-batch-to-kafka-and-run-deltastreamer-to-ingest">Step 5: Upload second batch to Kafka and run DeltaStreamer to ingest</h4>
 
 <p>Upload the second batch of data and ingest this batch using delta-streamer. As this batch does not bring in any new
 partitions, there is no need to run hive-sync</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat docker/demo/data/batch_2.json | kafkacat -b kafkabroker -t stock_ticks -P
+<pre><code class="language-Java">cat docker/demo/data/batch_2.json | kafkacat -b kafkabroker -t stock_ticks -P
 
 # Within Docker container, run the ingestion command
 docker exec -it adhoc-2 /bin/bash
@@ -766,7 +766,7 @@ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type MERGE_ON_READ --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_mor --target-table stock_ticks_mor --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider --disable-compaction
 
 exit
-</code></pre></div></div>
+</code></pre>
 
 <p>With Copy-On-Write table, the second ingestion by DeltaStreamer resulted in a new version of Parquet file getting created.
 See <code class="highlighter-rouge">http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_cow/2018/08/31</code></p>
@@ -784,7 +784,7 @@ This is the time, when ReadOptimized and Realtime views will provide different r
 return “10:29 am” as it will only read from the Parquet file. The Realtime view will do an on-the-fly merge and return the
 latest committed data, which is “10:59 a.m”.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
 # Copy On Write Table:
@@ -848,13 +848,13 @@ WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the futu
 
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-6b-run-spark-sql-queries">Step 6(b): Run Spark SQL Queries</h4>
 
 <p>Running the same queries in Spark-SQL:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
 # Copy On Write Table:
@@ -915,7 +915,7 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-7--incremental-query-for-copy-on-write-table">Step 7 : Incremental Query for COPY-ON-WRITE Table</h4>
 
@@ -923,7 +923,7 @@ exit
 
 <p>Let’s take the same projection query example.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
 0: jdbc:hive2://hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG';
@@ -933,7 +933,7 @@ beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache
 | 20180924064621       | GOOG    | 2018-08-31 09:59:00  | 6330    | 1230.5     | 1230.02   |
 | 20180924065039       | GOOG    | 2018-08-31 10:59:00  | 9021    | 1227.1993  | 1227.215  |
 +----------------------+---------+----------------------+---------+------------+-----------+--+
-</code></pre></div></div>
+</code></pre>
 
 <p>As you notice from the above queries, there are 2 commits - 20180924064621 and 20180924065039 in timeline order.
 When you follow the steps, you will be getting different timestamps for commits. Substitute them
@@ -946,19 +946,19 @@ the commit time of the first batch (20180924064621) and run incremental query</p
 <p>Hudi incremental mode provides efficient scanning for incremental queries by filtering out files that do not have any
 candidate rows using hudi-managed metadata.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 0: jdbc:hive2://hiveserver:10000&gt; set hoodie.stock_ticks_cow.consume.mode=INCREMENTAL;
 No rows affected (0.009 seconds)
 0: jdbc:hive2://hiveserver:10000&gt;  set hoodie.stock_ticks_cow.consume.max.commits=3;
 No rows affected (0.009 seconds)
 0: jdbc:hive2://hiveserver:10000&gt; set hoodie.stock_ticks_cow.consume.start.timestamp=20180924064621;
-</code></pre></div></div>
+</code></pre>
 
 <p>With the above setting, file-ids that do not have any updates from the commit 20180924065039 are filtered out without scanning.
 Here is the incremental query :</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0: jdbc:hive2://hiveserver:10000&gt;
+<pre><code class="language-Java">0: jdbc:hive2://hiveserver:10000&gt;
 0: jdbc:hive2://hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG' and `_hoodie_commit_time` &gt; '20180924064621';
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 | _hoodie_commit_time  | symbol  |          ts          | volume  |    open    |   close   |
@@ -967,10 +967,10 @@ Here is the incremental query :</p>
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 1 row selected (0.83 seconds)
 0: jdbc:hive2://hiveserver:10000&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="incremental-query-with-spark-sql">Incremental Query with Spark SQL:</h5>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 Welcome to
       ____              __
@@ -1003,14 +1003,14 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 | 20180924065039       | GOOG    | 2018-08-31 10:59:00  | 9021    | 1227.1993  | 1227.215  |
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-8-schedule-and-run-compaction-for-merge-on-read-dataset">Step 8: Schedule and Run Compaction for Merge-On-Read dataset</h4>
 
 <p>Let’s schedule and run a compaction to create a new version of the columnar file so that read-optimized readers will see fresher data.
 Again, you can use the Hudi CLI to manually schedule and run compaction.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 root@adhoc-1:/opt#   /var/hoodie/ws/hudi-cli/hudi-cli.sh
 ============================================
 *                                          *
@@ -1093,7 +1093,7 @@ hoodie:stock_ticks-&gt;compactions show all
     |==================================================================|
     | 20180924070031         | COMPLETED| 1                            |
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-9-run-hive-queries-including-incremental-queries">Step 9: Run Hive Queries including incremental queries</h4>
 
@@ -1102,7 +1102,7 @@ Lets also run the incremental query for MOR table.
 From looking at the below query output, it will be clear that the first commit time for the MOR table is 20180924064636
 and the second commit time is 20180924070031</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
 # Read Optimized View
@@ -1158,11 +1158,11 @@ No rows affected (0.013 seconds)
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="read-optimized-and-realtime-views-for-mor-with-spark-sql-after-compaction">Read Optimized and Realtime Views for MOR with Spark-SQL after compaction</h5>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
 # Read Optimized View
@@ -1197,28 +1197,28 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 | 20180924064636       | GOOG    | 2018-08-31 09:59:00  | 6330    | 1230.5     | 1230.02   |
 | 20180924070031       | GOOG    | 2018-08-31 10:59:00  | 9021    | 1227.1993  | 1227.215  |
 +----------------------+---------+----------------------+---------+------------+-----------+--+
-</code></pre></div></div>
+</code></pre>
 
 <p>This brings the demo to an end.</p>
 
 <h2 id="testing-hudi-in-local-docker-environment">Testing Hudi in Local Docker environment</h2>
 
 <p>You can bring up a hadoop docker environment containing Hadoop, Hive and Spark services with support for hudi.</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mvn pre-integration-test -DskipTests
-</code></pre></div></div>
+<pre><code class="language-Java">$ mvn pre-integration-test -DskipTests
+</code></pre>
 <p>The above command builds docker images for all the services with
 current Hudi source installed at /var/hoodie/ws and also brings up the services using a compose file. We
 currently use Hadoop (v2.8.4), Hive (v2.3.3) and Spark (v2.3.1) in docker images.</p>
 
 <p>To bring down the containers</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd hudi-integ-test
+<pre><code class="language-Java">$ cd hudi-integ-test
 $ mvn docker-compose:down
-</code></pre></div></div>
+</code></pre>
 
 <p>If you want to bring up the docker containers, use</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd hudi-integ-test
+<pre><code class="language-Java">$ cd hudi-integ-test
 $  mvn docker-compose:up -DdetachedMode=true
-</code></pre></div></div>
+</code></pre>
 
 <p>Hudi is a library that is operated in a broader data analytics/ingestion environment
 involving Hadoop, Hive and Spark. Interoperability with all these systems is a key objective for us. We are
@@ -1244,7 +1244,7 @@ run the script
 
 <p>Here are the commands:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd docker
+<pre><code class="language-Java">cd docker
 ./build_local_docker_images.sh
 .....
 
@@ -1279,19 +1279,11 @@ run the script
 [INFO] Finished at: 2018-09-10T17:47:37-07:00
 [INFO] Final Memory: 236M/1848M
 [INFO] ------------------------------------------------------------------------
-</code></pre></div></div>
+</code></pre>
 
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        
-        
-        
-        
     </div>
 
     
diff --git a/content/cn/events/2016-12-30-strata-talk-2017.html b/content/cn/events/2016-12-30-strata-talk-2017.html
index df3628d..8ae4382 100644
--- a/content/cn/events/2016-12-30-strata-talk-2017.html
+++ b/content/cn/events/2016-12-30-strata-talk-2017.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -368,14 +368,6 @@ Catch our talk <strong>“Incremental Processing on Hadoop At Uber”</strong></
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_news.html" class="btn btn-default navbar-btn cursorNorm" role="button">news</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/cn/events/2019-01-18-asf-incubation.html b/content/cn/events/2019-01-18-asf-incubation.html
index a55e713..6e0c2de 100644
--- a/content/cn/events/2019-01-18-asf-incubation.html
+++ b/content/cn/events/2019-01-18-asf-incubation.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -367,14 +367,6 @@ $('#toc').on('click', 'a', function() {
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_news.html" class="btn btn-default navbar-btn cursorNorm" role="button">news</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/cn/gcs_hoodie.html b/content/cn/gcs_hoodie.html
index 228c0f4..333eec8 100644
--- a/content/cn/gcs_hoodie.html
+++ b/content/cn/gcs_hoodie.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/index.html b/content/cn/index.html
index cdcd619..82be15e 100644
--- a/content/cn/index.html
+++ b/content/cn/index.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -382,14 +382,6 @@ $('#toc').on('click', 'a', function() {
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_getting_started.html" class="btn btn-default navbar-btn cursorNorm" role="button">getting_started</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/cn/migration_guide.html b/content/cn/migration_guide.html
index ab5cf58..c5621ad 100644
--- a/content/cn/migration_guide.html
+++ b/content/cn/migration_guide.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -372,16 +372,18 @@ Take this approach if your dataset is an append only type of dataset and you do
 This tool essentially starts a Spark Job to read the existing parquet dataset and converts it into a HUDI managed dataset by re-writing all the data.</p>
 
 <h4 id="option-2">Option 2</h4>
-<p>For huge datasets, this could be as simple as : for partition in [list of partitions in source dataset] {
-        val inputDF = spark.read.format(“any_input_format”).load(“partition_path”)
-        inputDF.write.format(“org.apache.hudi”).option()….save(“basePath”)
-        }</p>
+<p>For huge datasets, this could be as simple as :</p>
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">partition</span> <span class="n">in</span> <span class="o">[</span><span class="n">list</span> <span class="n">of</span> <span class="n">partitions</span> <span class="n">in</span> <span class="n">source</span> <span class="n">dataset</span><span class="o">]</span> <span class="o">{</span>
+        <span class="n">val</span> <span class="n">inputDF</span> <span class="o">=</span> <span class="n">spark</span><span class="o">.</span><span class="na">read</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"any_input_format"</span><span class="o">).</span><span class="na">load</span><span class="o">(</span><span class="s">"partition_path"</span><span class="o">)</span>
+        <span class="n">inputDF</span><span class="o">.</span><span class="na">write</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span><span class="na">option</span><span class="o">()....</span><span class="na">save</span><span class="o">(</span><span class="s">"basePath"</span><span class="o">)</span>
+<span class="o">}</span>
+</code></pre></div></div>
 
 <h4 id="option-3">Option 3</h4>
 <p>Write your own custom logic of how to load an existing dataset into a Hudi managed one. Please read about the RDD API
  <a href="quickstart.html">here</a>.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Using the HDFSParquetImporter Tool. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
+<pre><code class="language-Java">Using the HDFSParquetImporter Tool. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
 fired by via `cd hudi-cli &amp;&amp; ./hudi-cli.sh`.
 
 hudi-&gt;hdfsparquetimport
@@ -398,7 +400,7 @@ hudi-&gt;hdfsparquetimport
         --format parquet
         --sparkMemory 6g
         --retry 2
-</code></pre></div></div>
+</code></pre>
 
 
     <div class="tags">
diff --git a/content/cn/news.html b/content/cn/news.html
index 4ec3ca1..0557b24 100644
--- a/content/cn/news.html
+++ b/content/cn/news.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -258,39 +258,12 @@
         
             
             
-                        
-    <h2><a class="post-link" href="/cn/events/2016-12-30-strata-talk-2017.html">Connect with us at Strata San Jose March 2017</a></h2>
-        <span class="post-meta">Dec 30, 2016 /
-            
-
-                <a href="tag_news.html">news</a>
-
-                </span>
-        <p> We will be presenting Hudi &amp; general concepts around how incremental processing works at Uber.
-Catch our talk “Incremental Processing on Hadoop At Uber”
-
- </p>
-
-            
         
             
             
         
             
             
-                        
-    <h2><a class="post-link" href="/cn/events/2019-01-18-asf-incubation.html">Hudi entered Apache Incubator</a></h2>
-        <span class="post-meta">Jan 18, 2019 /
-            
-
-                <a href="tag_news.html">news</a>
-
-                </span>
-        <p> In the coming weeks, we will be moving in our new home on the Apache Incubator.
-
- </p>
-
-            
         
             
             
diff --git a/content/cn/news_archive.html b/content/cn/news_archive.html
index df5fd1e..6ca01bc 100644
--- a/content/cn/news_archive.html
+++ b/content/cn/news_archive.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/performance.html b/content/cn/performance.html
index d859089..97c3e6b 100644
--- a/content/cn/performance.html
+++ b/content/cn/performance.html
@@ -5,7 +5,7 @@
 <meta name="viewport" content="width=device-width, initial-scale=1">
 <meta name="description" content="">
 <meta name="keywords" content="hudi, index, storage, compaction, cleaning, implementation">
-<title>Performance | Hudi</title>
+<title>性能 | Hudi</title>
 <link rel="stylesheet" href="/css/syntax.css">
 
 
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -164,7 +164,7 @@
 
 
 
-  <a class="email" title="Submit feedback" href="#" onclick="javascript:window.location='mailto:dev@hudi.apache.org?subject=Hudi Documentation feedback&body=I have some feedback about the Performance page: ' + window.location.href;"><i class="fa fa-envelope-o"></i> Feedback</a>
+  <a class="email" title="Submit feedback" href="#" onclick="javascript:window.location='mailto:dev@hudi.apache.org?subject=Hudi Documentation feedback&body=I have some feedback about the 性能 page: ' + window.location.href;"><i class="fa fa-envelope-o"></i> Feedback</a>
 
 <li>
 
@@ -187,7 +187,7 @@
                                 searchInput: document.getElementById('search-input'),
                                 resultsContainer: document.getElementById('results-container'),
                                 dataSource: '/search.json',
-                                searchResultTemplate: '<li><a href="{url}" title="Performance">{title}</a></li>',
+                                searchResultTemplate: '<li><a href="{url}" title="性能">{title}</a></li>',
                     noResultsText: 'No results found.',
                             limit: 10,
                             fuzzy: true,
@@ -324,7 +324,7 @@
     <!-- Content Column -->
     <div class="col-md-9">
         <div class="post-header">
-   <h1 class="post-title-main">Performance</h1>
+   <h1 class="post-title-main">性能</h1>
 </div>
 
 
@@ -338,45 +338,42 @@
 
     
 
-  <p>In this section, we go over some real world performance numbers for Hudi upserts, incremental pull and compare them against
-the conventional alternatives for achieving these tasks.</p>
+  <p>在本节中,我们将介绍一些有关Hudi插入更新、增量提取的实际性能数据,并将其与实现这些任务的其它传统工具进行比较。</p>
 
-<h2 id="upserts">Upserts</h2>
+<h2 id="插入更新">插入更新</h2>
 
-<p>Following shows the speed up obtained for NoSQL database ingestion, from incrementally upserting on a Hudi dataset on the copy-on-write storage,
-on 5 tables ranging from small to huge (as opposed to bulk loading the tables)</p>
+<p>下面显示了从NoSQL数据库摄取获得的速度提升,这些速度提升数据是通过在写入时复制存储上的Hudi数据集上插入更新而获得的,
+数据集包括5个从小到大的表(相对于批量加载表)。</p>
 
 <figure>
     <img class="docimage" src="/images/hudi_upsert_perf1.png" alt="hudi_upsert_perf1.png" style="max-width: 1000px" />
 </figure>
 
-<p>Given Hudi can build the dataset incrementally, it opens doors for also scheduling ingesting more frequently thus reducing latency, with
-significant savings on the overall compute cost.</p>
+<p>由于Hudi可以通过增量构建数据集,它也为更频繁地调度摄取提供了可能性,从而减少了延迟,并显著节省了总体计算成本。</p>
 
 <figure>
     <img class="docimage" src="/images/hudi_upsert_perf2.png" alt="hudi_upsert_perf2.png" style="max-width: 1000px" />
 </figure>
 
-<p>Hudi upserts have been stress tested upto 4TB in a single commit across the t1 table. 
-See <a href="https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide">here</a> for some tuning tips.</p>
+<p>Hudi插入更新在t1表的一次提交中就进行了高达4TB的压力测试。
+有关一些调优技巧,请参见<a href="https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide">这里</a>。</p>
 
-<h2 id="indexing">Indexing</h2>
+<h2 id="索引">索引</h2>
 
-<p>In order to efficiently upsert data, Hudi needs to classify records in a write batch into inserts &amp; updates (tagged with the file group 
-it belongs to). In order to speed this operation, Hudi employs a pluggable index mechanism that stores a mapping between recordKey and 
-the file group id it belongs to. By default, Hudi uses a built in index that uses file ranges and bloom filters to accomplish this, with
-upto 10x speed up over a spark join to do the same.</p>
+<p>为了有效地插入更新数据,Hudi需要将要写入的批量数据中的记录分类为插入和更新(并标记它所属的文件组)。
+为了加快此操作的速度,Hudi采用了可插拔索引机制,该机制存储了recordKey和它所属的文件组ID之间的映射。
+默认情况下,Hudi使用内置索引,该索引使用文件范围和布隆过滤器来完成此任务,相比于Spark Join,其速度最高可提高10倍。</p>
 
-<p>Hudi provides best indexing performance when you model the recordKey to be monotonically increasing (e.g timestamp prefix), leading to range pruning filtering
-out a lot of files for comparison. Even for UUID based keys, there are <a href="https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/">known techniques</a> to achieve this.
-For e.g , with 100M timestamp prefixed keys (5% updates, 95% inserts) on a event table with 80B keys/3 partitions/11416 files/10TB data, Hudi index achieves a 
-<strong>~7X (2880 secs vs 440 secs) speed up</strong> over vanilla spark join. Even for a challenging workload like an ‘100% update’ database ingestion workload spanning 
-3.25B UUID keys/30 partitions/6180 files using 300 cores, Hudi indexing offers a <strong>80-100% speedup</strong>.</p>
+<p>当您将recordKey建模为单调递增时(例如时间戳前缀),Hudi提供了最佳的索引性能,从而进行范围过滤来避免与许多文件进行比较。
+即使对于基于UUID的键,也有<a href="https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/">已知技术</a>来达到同样目的。
+例如,在具有80B键、3个分区、11416个文件、10TB数据的事件表上使用100M个时间戳前缀的键(5%的更新,95%的插入)时,
+相比于原始Spark Join,Hudi索引速度的提升<strong>约为7倍(从2880秒降至440秒)</strong>。
+即使对于具有挑战性的工作负载,如使用300个核对3.25B UUID键、30个分区、6180个文件的“100%更新”的数据库摄取工作负载,Hudi索引也可以提供<strong>80-100%的加速</strong>。</p>
 
-<h2 id="read-optimized-queries">Read Optimized Queries</h2>
+<h2 id="读优化查询">读优化查询</h2>
 
-<p>The major design goal for read optimized view is to achieve the latency reduction &amp; efficiency gains in previous section,
-with no impact on queries. Following charts compare the Hudi vs non-Hudi datasets across Hive/Presto/Spark queries and demonstrate this.</p>
+<p>读优化视图的主要设计目标是在不影响查询的情况下实现上一节中提到的延迟减少和效率提高。
+下图比较了对Hudi和非Hudi数据集的Hive、Presto、Spark查询,并对此进行说明。</p>
 
 <p><strong>Hive</strong></p>
 
diff --git a/content/cn/powered_by.html b/content/cn/powered_by.html
index 4f5c6c0..d6c80a9 100644
--- a/content/cn/powered_by.html
+++ b/content/cn/powered_by.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/privacy.html b/content/cn/privacy.html
index 897801c..8c43aad 100644
--- a/content/cn/privacy.html
+++ b/content/cn/privacy.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/querying_data.html b/content/cn/querying_data.html
index ce77e1d..c44e68e 100644
--- a/content/cn/querying_data.html
+++ b/content/cn/querying_data.html
@@ -3,9 +3,9 @@
     <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1">
-<meta name="description" content="In this page, we go over how to enable SQL queries on Hudi built tables.">
+<meta name="description" content="在这一页里,我们介绍了如何在Hudi构建的表上启用SQL查询。">
 <meta name="keywords" content="hudi, hive, spark, sql, presto">
-<title>Querying Hudi Datasets | Hudi</title>
+<title>查询 Hudi 数据集 | Hudi</title>
 <link rel="stylesheet" href="/css/syntax.css">
 
 
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -164,7 +164,7 @@
 
 
 
-  <a class="email" title="Submit feedback" href="#" onclick="javascript:window.location='mailto:dev@hudi.apache.org?subject=Hudi Documentation feedback&body=I have some feedback about the Querying Hudi Datasets page: ' + window.location.href;"><i class="fa fa-envelope-o"></i> Feedback</a>
+  <a class="email" title="Submit feedback" href="#" onclick="javascript:window.location='mailto:dev@hudi.apache.org?subject=Hudi Documentation feedback&body=I have some feedback about the 查询 Hudi 数据集 page: ' + window.location.href;"><i class="fa fa-envelope-o"></i> Feedback</a>
 
 <li>
 
@@ -187,7 +187,7 @@
                                 searchInput: document.getElementById('search-input'),
                                 resultsContainer: document.getElementById('results-container'),
                                 dataSource: '/search.json',
-                                searchResultTemplate: '<li><a href="{url}" title="Querying Hudi Datasets">{title}</a></li>',
+                                searchResultTemplate: '<li><a href="{url}" title="查询 Hudi 数据集">{title}</a></li>',
                     noResultsText: 'No results found.',
                             limit: 10,
                             fuzzy: true,
@@ -324,7 +324,7 @@
     <!-- Content Column -->
     <div class="col-md-9">
         <div class="post-header">
-   <h1 class="post-title-main">Querying Hudi Datasets</h1>
+   <h1 class="post-title-main">查询 Hudi 数据集</h1>
 </div>
 
 
@@ -332,7 +332,7 @@
 <div class="post-content">
 
    
-    <div class="summary">In this page, we go over how to enable SQL queries on Hudi built tables.</div>
+    <div class="summary">在这一页里,我们介绍了如何在Hudi构建的表上启用SQL查询。</div>
    
 
     
@@ -340,219 +340,219 @@
 
     
 
-  <p>Conceptually, Hudi stores data physically once on DFS, while providing 3 logical views on top, as explained <a href="concepts.html#views">before</a>. 
-Once the dataset is synced to the Hive metastore, it provides external Hive tables backed by Hudi’s custom inputformats. Once the proper hudi
-bundle has been provided, the dataset can be queried by popular query engines like Hive, Spark and Presto.</p>
+  <p>从概念上讲,Hudi物理存储一次数据到DFS上,同时在其上提供三个逻辑视图,如<a href="concepts.html#views">之前</a>所述。
+数据集同步到Hive Metastore后,它将提供由Hudi的自定义输入格式支持的Hive外部表。一旦提供了适当的Hudi捆绑包,
+就可以通过Hive、Spark和Presto之类的常用查询引擎来查询数据集。</p>
 
-<p>Specifically, there are two Hive tables named off <a href="configurations.html#TABLE_NAME_OPT_KEY">table name</a> passed during write. 
-For e.g, if <code class="highlighter-rouge">table name = hudi_tbl</code>, then we get</p>
+<p>具体来说,会有两个以写入时传入的<a href="configurations.html#TABLE_NAME_OPT_KEY">table name</a>命名的Hive表。
+例如,如果<code class="highlighter-rouge">table name = hudi_tbl</code>,我们将得到以下两个表(查询示例见列表之后)</p>
 
 <ul>
-  <li><code class="highlighter-rouge">hudi_tbl</code> realizes the read optimized view of the dataset backed by <code class="highlighter-rouge">HoodieParquetInputFormat</code>, exposing purely columnar data.</li>
-  <li><code class="highlighter-rouge">hudi_tbl_rt</code> realizes the real time view of the dataset  backed by <code class="highlighter-rouge">HoodieParquetRealtimeInputFormat</code>, exposing merged view of base and log data.</li>
+  <li><code class="highlighter-rouge">hudi_tbl</code> 实现了由 <code class="highlighter-rouge">HoodieParquetInputFormat</code> 支持的数据集的读优化视图,从而提供了纯列式数据。</li>
+  <li><code class="highlighter-rouge">hudi_tbl_rt</code> 实现了由 <code class="highlighter-rouge">HoodieParquetRealtimeInputFormat</code> 支持的数据集的实时视图,从而提供了基础数据和日志数据的合并视图。</li>
 </ul>
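+
+<p>下面是一个简单的示意(假设数据集已按上述表名同步到Hive Metastore,并已按后文Spark部分的说明配置好spark-shell,例如设置<code class="highlighter-rouge">spark.sql.hive.convertMetastoreParquet=false</code>),用Spark SQL分别查询这两个表:</p>
+
+<pre><code class="language-Scala">// 读优化视图:仅读取列式基础文件中的数据
+spark.sql("select count(*) from hudi_tbl").show()
+
+// 实时视图:返回基础文件与日志文件合并后的最新数据
+spark.sql("select count(*) from hudi_tbl_rt").show()
+</code></pre>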
 
-<p>As discussed in the concepts section, the one key primitive needed for <a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">incrementally processing</a>,
-is <code class="highlighter-rouge">incremental pulls</code> (to obtain a change stream/log from a dataset). Hudi datasets can be pulled incrementally, which means you can get ALL and ONLY the updated &amp; new rows 
-since a specified instant time. This, together with upserts, are particularly useful for building data pipelines where 1 or more source Hudi tables are incrementally pulled (streams/facts),
-joined with other tables (datasets/dimensions), to <a href="writing_data.html">write out deltas</a> to a target Hudi dataset. Incremental view is realized by querying one of the tables above, 
-with special configurations that indicates to query planning that only incremental data needs to be fetched out of the dataset.</p>
+<p>如概念部分所述,<a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">增量处理</a>所需要的
+一个关键原语是<code class="highlighter-rouge">增量拉取</code>(从数据集中获取更改流/日志)。Hudi数据集可以被增量拉取,这意味着您可以只获取自指定即时时间以来更新和新增的所有行。
+增量拉取与插入更新结合使用,对于构建数据管道尤其有用:以增量方式拉取一个或多个源Hudi表(流/事实),
+与其他表(数据集/维度)联接,再将增量<a href="writing_data.html">写出</a>到目标Hudi数据集。增量视图通过查询上述表之一来实现,并辅以特殊配置,
+该特殊配置指示查询计划仅需从数据集中获取增量数据。</p>
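+
+<p>下面是按上述模式构建管道的一个简单草图(仅作示意:其中<code class="highlighter-rouge">beginInstantTime</code>、<code class="highlighter-rouge">srcBasePath</code>、<code class="highlighter-rouge">dimDF</code>、<code class="highlighter-rouge">targetTableName</code>、<code class="highlighter-rouge">targetBasePath</code>等名称均为假设;完整的读写选项请参见后文Spark部分以及配置页):</p>
+
+<pre><code class="language-Scala">import org.apache.hudi.DataSourceReadOptions
+import org.apache.hudi.DataSourceWriteOptions
+import org.apache.hudi.config.HoodieWriteConfig
+import org.apache.spark.sql.SaveMode
+
+// 1. 从源Hudi表增量拉取自 beginInstantTime 以来的更改(流/事实)
+val factDF = spark.read.format("org.apache.hudi").
+  option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY, DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL).
+  option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY, beginInstantTime).
+  load(srcBasePath)
+
+// 2. 与维度表联接(dimDF 与联接列 dim_key 均为假设)
+val deltaDF = factDF.join(dimDF, "dim_key")
+
+// 3. 将联接结果以插入更新方式写出到目标Hudi数据集
+deltaDF.write.format("org.apache.hudi").
+  option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "_row_key").
+  option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "partition").
+  option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "timestamp").
+  option(HoodieWriteConfig.TABLE_NAME, targetTableName).
+  mode(SaveMode.Append).
+  save(targetBasePath)
+</code></pre>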
 
-<p>In sections, below we will discuss in detail how to access all the 3 views on each query engine.</p>
+<p>接下来,我们将详细讨论在每个查询引擎上如何访问所有三个视图。</p>
 
 <h2 id="hive">Hive</h2>
 
-<p>In order for Hive to recognize Hudi datasets and query correctly, the HiveServer2 needs to be provided with the <code class="highlighter-rouge">hudi-hadoop-mr-bundle-x.y.z-SNAPSHOT.jar</code> 
-in its <a href="https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_mc_hive_udf.html#concept_nc3_mms_lr">aux jars path</a>. This will ensure the input format 
-classes with its dependencies are available for query planning &amp; execution.</p>
+<p>为了使Hive能够识别Hudi数据集并正确查询,
+HiveServer2需要在其<a href="https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_mc_hive_udf.html#concept_nc3_mms_lr">辅助jars路径</a>中提供<code class="highlighter-rouge">hudi-hadoop-mr-bundle-x.y.z-SNAPSHOT.jar</code>。 
+这将确保输入格式类及其依赖项可用于查询计划和执行。</p>
 
-<h3 id="hive-ro-view">Read Optimized table</h3>
-<p>In addition to setup above, for beeline cli access, the <code class="highlighter-rouge">hive.input.format</code> variable needs to be set to the  fully qualified path name of the 
-inputformat <code class="highlighter-rouge">org.apache.hudi.hadoop.HoodieParquetInputFormat</code>. For Tez, additionally the <code class="highlighter-rouge">hive.tez.input.format</code> needs to be set 
-to <code class="highlighter-rouge">org.apache.hadoop.hive.ql.io.HiveInputFormat</code></p>
+<h3 id="hive-ro-view">读优化表</h3>
+<p>除了上述设置之外,对于beeline cli访问,还需要将<code class="highlighter-rouge">hive.input.format</code>变量设置为<code class="highlighter-rouge">org.apache.hudi.hadoop.HoodieParquetInputFormat</code>输入格式的完全限定路径名。
+对于Tez,还需要将<code class="highlighter-rouge">hive.tez.input.format</code>设置为<code class="highlighter-rouge">org.apache.hadoop.hive.ql.io.HiveInputFormat</code>。</p>
 
-<h3 id="hive-rt-view">Real time table</h3>
-<p>In addition to installing the hive bundle jar on the HiveServer2, it needs to be put on the hadoop/hive installation across the cluster, so that
-queries can pick up the custom RecordReader as well.</p>
+<h3 id="hive-rt-view">实时表</h3>
+<p>除了在HiveServer2上安装Hive捆绑jars之外,还需要将其放在整个集群的hadoop/hive安装中,这样查询也可以使用自定义RecordReader。</p>
 
-<h3 id="hive-incr-pull">Incremental Pulling</h3>
+<h3 id="hive-incr-pull">增量拉取</h3>
 
-<p><code class="highlighter-rouge">HiveIncrementalPuller</code> allows incrementally extracting changes from large fact/dimension tables via HiveQL, combining the benefits of Hive (reliably process complex SQL queries) and 
-incremental primitives (speed up query by pulling tables incrementally instead of scanning fully). The tool uses Hive JDBC to run the hive query and saves its results in a temp table.
-that can later be upserted. Upsert utility (<code class="highlighter-rouge">HoodieDeltaStreamer</code>) has all the state it needs from the directory structure to know what should be the commit time on the target table.
-e.g: <code class="highlighter-rouge">/app/incremental-hql/intermediate/{source_table_name}_temp/{last_commit_included}</code>.The Delta Hive table registered will be of the form <code class="highlighter-rouge">{tmpdb}.{source_table}_{last_commit_included}</code>.</p>
+<p><code class="highlighter-rouge">HiveIncrementalPuller</code>允许通过HiveQL从大型事实/维表中增量提取更改,
+结合了Hive(可靠地处理复杂的SQL查询)和增量原语的好处(通过增量拉取而不是完全扫描来加快查询速度)。
+该工具使用Hive JDBC运行hive查询并将其结果保存在临时表中,这个表可以被插入更新。
+Upsert实用程序(<code class="highlighter-rouge">HoodieDeltaStreamer</code>)可以从目录结构中获取它所需的全部状态,从而确定目标表上的提交时间。
+例如:<code class="highlighter-rouge">/app/incremental-hql/intermediate/{source_table_name}_temp/{last_commit_included}</code>。
+已注册的Delta Hive表的格式为<code class="highlighter-rouge">{tmpdb}.{source_table}_{last_commit_included}</code>。</p>
 
-<p>The following are the configuration options for HiveIncrementalPuller</p>
+<p>以下是HiveIncrementalPuller的配置选项</p>
 
 <table>
   <tbody>
     <tr>
-      <td><strong>Config</strong></td>
-      <td><strong>Description</strong></td>
-      <td><strong>Default</strong></td>
+      <td><strong>配置</strong></td>
+      <td><strong>描述</strong></td>
+      <td><strong>默认值</strong></td>
     </tr>
     <tr>
       <td>hiveUrl</td>
-      <td>Hive Server 2 URL to connect to</td>
+      <td>要连接的Hive Server 2的URL</td>
       <td> </td>
     </tr>
     <tr>
       <td>hiveUser</td>
-      <td>Hive Server 2 Username</td>
+      <td>Hive Server 2 用户名</td>
       <td> </td>
     </tr>
     <tr>
       <td>hivePass</td>
-      <td>Hive Server 2 Password</td>
+      <td>Hive Server 2 密码</td>
       <td> </td>
     </tr>
     <tr>
       <td>queue</td>
-      <td>YARN Queue name</td>
+      <td>YARN 队列名称</td>
       <td> </td>
     </tr>
     <tr>
       <td>tmp</td>
-      <td>Directory where the temporary delta data is stored in DFS. The directory structure will follow conventions. Please see the below section.</td>
+      <td>DFS中存储临时增量数据的目录。目录结构将遵循约定。请参阅以下部分。</td>
       <td> </td>
     </tr>
     <tr>
       <td>extractSQLFile</td>
-      <td>The SQL to execute on the source table to extract the data. The data extracted will be all the rows that changed since a particular point in time.</td>
+      <td>在源表上要执行的提取数据的SQL。提取的数据将是自特定时间点以来已更改的所有行。</td>
       <td> </td>
     </tr>
     <tr>
       <td>sourceTable</td>
-      <td>Source Table Name. Needed to set hive environment properties.</td>
+      <td>源表名称。设置Hive环境属性时需要。</td>
       <td> </td>
     </tr>
     <tr>
       <td>targetTable</td>
-      <td>Target Table Name. Needed for the intermediate storage directory structure.</td>
+      <td>目标表名称。中间存储目录结构需要。</td>
       <td> </td>
     </tr>
     <tr>
       <td>sourceDataPath</td>
-      <td>Source DFS Base Path. This is where the Hudi metadata will be read.</td>
+      <td>源DFS基本路径。这是读取Hudi元数据的地方。</td>
       <td> </td>
     </tr>
     <tr>
       <td>targetDataPath</td>
-      <td>Target DFS Base path. This is needed to compute the fromCommitTime. This is not needed if fromCommitTime is specified explicitly.</td>
+      <td>目标DFS基本路径。 这是计算fromCommitTime所必需的。 如果显式指定了fromCommitTime,则不需要设置这个参数。</td>
       <td> </td>
     </tr>
     <tr>
       <td>tmpdb</td>
-      <td>The database to which the intermediate temp delta table will be created</td>
+      <td>用来创建中间临时增量表的数据库</td>
       <td>hoodie_temp</td>
     </tr>
     <tr>
       <td>fromCommitTime</td>
-      <td>This is the most important parameter. This is the point in time from which the changed records are pulled from.</td>
+      <td>这是最重要的参数。即从该时间点开始提取更改的记录。</td>
       <td> </td>
     </tr>
     <tr>
       <td>maxCommits</td>
-      <td>Number of commits to include in the pull. Setting this to -1 will include all the commits from fromCommitTime. Setting this to a value &gt; 0, will include records that ONLY changed in the specified number of commits after fromCommitTime. This may be needed if you need to catch up say 2 commits at a time.</td>
+      <td>要包含在拉取中的提交数。将此设置为-1将包括从fromCommitTime开始的所有提交。将此设置为大于0的值,则仅包括fromCommitTime之后指定数量提交中更改的记录。如果您需要每次只追赶固定数量(例如2个)的提交,则可能需要这样做。</td>
       <td>3</td>
     </tr>
     <tr>
       <td>help</td>
-      <td>Utility Help</td>
+      <td>实用程序帮助</td>
       <td> </td>
     </tr>
   </tbody>
 </table>
 
-<p>Setting fromCommitTime=0 and maxCommits=-1 will pull in the entire source dataset and can be used to initiate backfills. If the target dataset is a Hudi dataset,
-then the utility can determine if the target dataset has no commits or is behind more than 24 hour (this is configurable),
-it will automatically use the backfill configuration, since applying the last 24 hours incrementally could take more time than doing a backfill. The current limitation of the tool
-is the lack of support for self-joining the same table in mixed mode (normal and incremental modes).</p>
+<p>设置fromCommitTime=0和maxCommits=-1将提取整个源数据集,可用于启动Backfill。
+如果目标数据集是Hudi数据集,该实用程序会检查目标数据集是否没有任何提交或落后超过24小时(可配置),
+若是则自动采用Backfill配置,因为增量应用最近24小时的更改可能比直接Backfill花费更多时间。
+该工具当前的局限是不支持在混合模式(正常模式和增量模式)下自联接同一张表。</p>
 
-<p><strong>NOTE on Hive queries that are executed using Fetch task:</strong>
-Since Fetch tasks invoke InputFormat.listStatus() per partition, Hoodie metadata can be listed in
-every such listStatus() call. In order to avoid this, it might be useful to disable fetch tasks
-using the hive session property for incremental queries: <code class="highlighter-rouge">set hive.fetch.task.conversion=none;</code> This
-would ensure Map Reduce execution is chosen for a Hive query, which combines partitions (comma
-separated) and calls InputFormat.listStatus() only once with all those partitions.</p>
+<p><strong>关于使用Fetch任务执行的Hive查询的说明:</strong>
+由于Fetch任务为每个分区调用InputFormat.listStatus(),每个listStatus()调用都会列出Hoodie元数据。
+为了避免这种情况,可以使用Hive session属性为增量查询禁用Fetch任务:
+<code class="highlighter-rouge">set hive.fetch.task.conversion = none;</code>。这将确保Hive查询使用Map Reduce执行,
+合并分区(用逗号分隔),并且对所有这些分区仅调用一次InputFormat.listStatus()。</p>
 
 <h2 id="spark">Spark</h2>
 
-<p>Spark provides much easier deployment &amp; management of Hudi jars and bundles into jobs/notebooks. At a high level, there are two ways to access Hudi datasets in Spark.</p>
+<p>Spark可将Hudi jars和捆绑包轻松部署和管理到作业/笔记本中。简而言之,通过Spark有两种方法可以访问Hudi数据集。</p>
 
 <ul>
-  <li><strong>Hudi DataSource</strong> : Supports Read Optimized, Incremental Pulls similar to how standard datasources (e.g: <code class="highlighter-rouge">spark.read.parquet</code>) work.</li>
-  <li><strong>Read as Hive tables</strong> : Supports all three views, including the real time view, relying on the custom Hudi input formats again like Hive.</li>
+  <li><strong>Hudi DataSource</strong>:支持读取优化和增量拉取,类似于标准数据源(例如:<code class="highlighter-rouge">spark.read.parquet</code>)的工作方式。</li>
+  <li><strong>以Hive表读取</strong>:支持所有三个视图,包括实时视图,依赖于自定义的Hudi输入格式(再次类似Hive)。</li>
 </ul>
 
-<p>In general, your spark job needs a dependency to <code class="highlighter-rouge">hudi-spark</code> or <code class="highlighter-rouge">hudi-spark-bundle-x.y.z.jar</code> needs to be on the class path of driver &amp; executors (hint: use <code class="highlighter-rouge">--jars</code> argument)</p>
+<p>通常,您的spark作业需要依赖<code class="highlighter-rouge">hudi-spark</code>或<code class="highlighter-rouge">hudi-spark-bundle-x.y.z.jar</code>,
+它们必须位于驱动程序和执行程序的类路径上(提示:使用<code class="highlighter-rouge">--jars</code>参数)。</p>
 
-<h3 id="spark-ro-view">Read Optimized table</h3>
+<h3 id="spark-ro-view">读优化表</h3>
 
-<p>To read RO table as a Hive table using SparkSQL, simply push a path filter into sparkContext as follows. 
-This method retains Spark built-in optimizations for reading Parquet files like vectorized reading on Hudi tables.</p>
+<p>要使用SparkSQL将RO表读取为Hive表,只需按如下所示将路径过滤器推入sparkContext。
+对于Hudi表,该方法保留了Spark内置的读取Parquet文件的优化功能,例如进行矢量化读取。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spark.sparkContext.hadoopConfiguration.setClass("mapreduce.input.pathFilter.class", classOf[org.apache.hudi.hadoop.HoodieROTablePathFilter], classOf[org.apache.hadoop.fs.PathFilter]);
-</code></pre></div></div>
+<pre><code class="language-Scala">spark.sparkContext.hadoopConfiguration.setClass("mapreduce.input.pathFilter.class", classOf[org.apache.hudi.hadoop.HoodieROTablePathFilter], classOf[org.apache.hadoop.fs.PathFilter]);
+</code></pre>
 
-<p>If you prefer to glob paths on DFS via the datasource, you can simply do something like below to get a Spark dataframe to work with.</p>
+<p>如果您希望通过数据源加载DFS上符合通配符(glob)模式的路径,则只需执行以下类似操作即可得到Spark数据帧。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Dataset&lt;Row&gt; hoodieROViewDF = spark.read().format("org.apache.hudi")
+<pre><code class="language-Scala">Dataset&lt;Row&gt; hoodieROViewDF = spark.read().format("org.apache.hudi")
 // pass any path glob, can include hudi &amp; non-hudi datasets
 .load("/glob/path/pattern");
-</code></pre></div></div>
+</code></pre>
 
-<h3 id="spark-rt-view">Real time table</h3>
-<p>Currently, real time table can only be queried as a Hive table in Spark. In order to do this, set <code class="highlighter-rouge">spark.sql.hive.convertMetastoreParquet=false</code>, forcing Spark to fallback 
-to using the Hive Serde to read the data (planning/executions is still Spark).</p>
+<h3 id="spark-rt-view">实时表</h3>
+<p>当前,实时表只能在Spark中作为Hive表进行查询。为了做到这一点,设置<code class="highlighter-rouge">spark.sql.hive.convertMetastoreParquet=false</code>,
+迫使Spark回退到使用Hive Serde读取数据(计划/执行仍然是Spark)。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ spark-shell --jars hudi-spark-bundle-x.y.z-SNAPSHOT.jar --driver-class-path /etc/hive/conf  --packages com.databricks:spark-avro_2.11:4.0.0 --conf spark.sql.hive.convertMetastoreParquet=false --num-executors 10 --driver-memory 7g --executor-memory 2g  --master yarn-client
+<pre><code class="language-Scala">$ spark-shell --jars hudi-spark-bundle-x.y.z-SNAPSHOT.jar --driver-class-path /etc/hive/conf  --packages com.databricks:spark-avro_2.11:4.0.0 --conf spark.sql.hive.convertMetastoreParquet=false --num-executors 10 --driver-memory 7g --executor-memory 2g  --master yarn-client
 
 scala&gt; sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'").show()
-</code></pre></div></div>
+</code></pre>
 
-<h3 id="spark-incr-pull">Incremental Pulling</h3>
-<p>The <code class="highlighter-rouge">hudi-spark</code> module offers the DataSource API, a more elegant way to pull data from Hudi dataset and process it via Spark.
-A sample incremental pull, that will obtain all records written since <code class="highlighter-rouge">beginInstantTime</code>, looks like below.</p>
+<h3 id="spark-incr-pull">增量拉取</h3>
+<p><code class="highlighter-rouge">hudi-spark</code>模块提供了DataSource API,这是一种从Hudi数据集中提取数据并通过Spark处理数据的更优雅的方法。
+如下所示是一个示例增量拉取,它将获取自<code class="highlighter-rouge">beginInstantTime</code>以来写入的所有记录。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Dataset&lt;Row&gt; hoodieIncViewDF = spark.read()
+<pre><code class="language-Java"> Dataset&lt;Row&gt; hoodieIncViewDF = spark.read()
      .format("org.apache.hudi")
      .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY(),
              DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL())
      .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(),
             &lt;beginInstantTime&gt;)
      .load(tablePath); // For incremental view, pass in the root/base path of dataset
-</code></pre></div></div>
+</code></pre>
 
-<p>Please refer to <a href="configurations.html#spark-datasource">configurations</a> section, to view all datasource options.</p>
+<p>请参阅<a href="configurations.html#spark-datasource">配置</a>部分,以查看所有数据源选项。</p>
 
-<p>Additionally, <code class="highlighter-rouge">HoodieReadClient</code> offers the following functionality using Hudi’s implicit indexing.</p>
+<p>另外,<code class="highlighter-rouge">HoodieReadClient</code>通过Hudi的隐式索引提供了以下功能。</p>
 
 <table>
   <tbody>
     <tr>
       <td><strong>API</strong></td>
-      <td><strong>Description</strong></td>
+      <td><strong>描述</strong></td>
     </tr>
     <tr>
       <td>read(keys)</td>
-      <td>Read out the data corresponding to the keys as a DataFrame, using Hudi’s own index for faster lookup</td>
+      <td>使用Hudi自身的索引进行快速查找,将与键对应的数据作为DataFrame读出</td>
     </tr>
     <tr>
       <td>filterExists()</td>
-      <td>Filter out already existing records from the provided RDD[HoodieRecord]. Useful for de-duplication</td>
+      <td>从提供的RDD[HoodieRecord]中过滤出已经存在的记录。对删除重复数据有用</td>
     </tr>
     <tr>
       <td>checkExists(keys)</td>
-      <td>Check if the provided keys exist in a Hudi dataset</td>
+      <td>检查提供的键是否存在于Hudi数据集中</td>
     </tr>
   </tbody>
 </table>
 
 <h2 id="presto">Presto</h2>
 
-<p>Presto is a popular query engine, providing interactive query performance. Hudi RO tables can be queries seamlessly in Presto. 
-This requires the <code class="highlighter-rouge">hudi-presto-bundle</code> jar to be placed into <code class="highlighter-rouge">&lt;presto_install&gt;/plugin/hive-hadoop2/</code>, across the installation.</p>
+<p>Presto是一种常用的查询引擎,可提供交互式查询性能。Hudi RO表可以在Presto中无缝查询。
+这需要将<code class="highlighter-rouge">hudi-presto-bundle</code> jar放入整个Presto安装的<code class="highlighter-rouge">&lt;presto_install&gt;/plugin/hive-hadoop2/</code>目录中。</p>
 
 
     <div class="tags">
diff --git a/content/cn/quickstart.html b/content/cn/quickstart.html
index 0f3b332..1e3d703 100644
--- a/content/cn/quickstart.html
+++ b/content/cn/quickstart.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -343,28 +343,16 @@
 <p>本指南通过使用spark-shell简要介绍了Hudi功能。使用Spark数据源,我们将通过代码段展示如何插入和更新的Hudi默认存储类型数据集:
 <a href="https://hudi.apache.org/concepts.html#copy-on-write-storage">写时复制</a>。每次写操作之后,我们还将展示如何读取快照和增量读取数据。</p>
 
-<h2 id="编译hudi-spark整包">编译Hudi spark整包</h2>
-<p>Hudi要求在*nix系统上安装Java 8。Git检出<a href="https://github.com/apache/incubator-hudi">代码</a>,并通过命令行构建maven项目:</p>
-
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 检出和编译
-git clone https://github.com/apache/incubator-hudi.git &amp;&amp; cd incubator-hudi
-mvn clean install -DskipTests -DskipITs
-
-# 为后续使用导入hudi-spark-bundle位置
-mkdir -p /tmp/hudi &amp;&amp; cp packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar  /tmp/hudi/hudi-spark-bundle.jar
-export HUDI_SPARK_BUNDLE_PATH=/tmp/hudi/hudi-spark-bundle.jar
-</code></pre></div></div>
-
 <h2 id="设置spark-shell">设置spark-shell</h2>
 <p>Hudi适用于Spark-2.x版本。您可以按照<a href="https://spark.apache.org/downloads.html">此处</a>的说明设置spark。
 在提取的目录中,使用spark-shell运行Hudi:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/spark-shell --jars $HUDI_SPARK_BUNDLE_PATH --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
-</code></pre></div></div>
+<pre><code class="language-Scala">bin/spark-shell --packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
+</code></pre>
 
 <p>设置表名、基本路径和数据生成器来为本指南生成记录。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.apache.hudi.QuickstartUtils._
+<pre><code class="language-Java">import org.apache.hudi.QuickstartUtils._
 import scala.collection.JavaConversions._
 import org.apache.spark.sql.SaveMode._
 import org.apache.hudi.DataSourceReadOptions._
@@ -374,7 +362,7 @@ import org.apache.hudi.config.HoodieWriteConfig._
 val tableName = "hudi_cow_table"
 val basePath = "file:///tmp/hudi_cow_table"
 val dataGen = new DataGenerator
-</code></pre></div></div>
+</code></pre>
 
 <p><a href="https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java">数据生成器</a>
 可以基于<a href="https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L57">行程样本模式</a>
@@ -383,7 +371,7 @@ val dataGen = new DataGenerator
 <h2 id="inserts">插入数据</h2>
 <p>生成一些新的行程样本,将其加载到DataFrame中,然后将DataFrame写入Hudi数据集中,如下所示。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val inserts = convertToStringList(dataGen.generateInserts(10))
+<pre><code class="language-Java">val inserts = convertToStringList(dataGen.generateInserts(10))
 val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
 df.write.format("org.apache.hudi").
     options(getQuickstartWriteConfigs).
@@ -393,7 +381,7 @@ df.write.format("org.apache.hudi").
     option(TABLE_NAME, tableName).
     mode(Overwrite).
     save(basePath);
-</code></pre></div></div>
+</code></pre>
 
 <p><code class="highlighter-rouge">mode(Overwrite)</code>覆盖并重新创建数据集(如果已经存在)。
 您可以检查在<code class="highlighter-rouge">/tmp/hudi_cow_table/&lt;region&gt;/&lt;country&gt;/&lt;city&gt;/</code>下生成的数据。我们提供了一个记录键
@@ -408,14 +396,14 @@ df.write.format("org.apache.hudi").
 
 <p>将数据文件加载到数据帧中。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val roViewDF = spark.
+<pre><code class="language-Java">val roViewDF = spark.
     read.
     format("org.apache.hudi").
     load(basePath + "/*/*/*/*")
 roViewDF.registerTempTable("hudi_ro_table")
 spark.sql("select fare, begin_lon, begin_lat, ts from  hudi_ro_table where fare &gt; 20.0").show()
 spark.sql("select _hoodie_commit_time, _hoodie_record_key, _hoodie_partition_path, rider, driver, fare from  hudi_ro_table").show()
-</code></pre></div></div>
+</code></pre>
 
 <p>该查询提供已提取数据的读取优化视图。由于我们的分区路径(<code class="highlighter-rouge">region/country/city</code>)是嵌套的3个级别
 从基本路径开始,我们使用了<code class="highlighter-rouge">load(basePath + "/*/*/*/*")</code>。
@@ -425,7 +413,7 @@ spark.sql("select _hoodie_commit_time, _hoodie_record_key, _hoodie_partition_pat
 
 <p>这类似于插入新数据。使用数据生成器生成对现有行程的更新,加载到数据帧并将数据帧写入hudi数据集。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val updates = convertToStringList(dataGen.generateUpdates(10))
+<pre><code class="language-Java">val updates = convertToStringList(dataGen.generateUpdates(10))
 val df = spark.read.json(spark.sparkContext.parallelize(updates, 2));
 df.write.format("org.apache.hudi").
     options(getQuickstartWriteConfigs).
@@ -435,7 +423,7 @@ df.write.format("org.apache.hudi").
     option(TABLE_NAME, tableName).
     mode(Append).
     save(basePath);
-</code></pre></div></div>
+</code></pre>
 
 <p>注意,保存模式现在为<code class="highlighter-rouge">追加</code>。通常,除非您是第一次尝试创建数据集,否则请始终使用追加模式。
 <a href="#query">查询</a>现在再次查询数据将显示更新的行程。每个写操作都会生成一个新的由时间戳表示的<a href="http://hudi.incubator.apache.org/concepts.html">commit</a>
@@ -447,7 +435,7 @@ df.write.format("org.apache.hudi").
 这可以通过使用Hudi的增量视图并提供所需更改的开始时间来实现。
 如果我们需要给定提交之后的所有更改(这是常见的情况),则无需指定结束时间。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime").map(k =&gt; k.getString(0)).take(50)
+<pre><code class="language-Java">val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime").map(k =&gt; k.getString(0)).take(50)
 val beginTime = commits(commits.length - 2) // commit time we are interested in
 
 // 增量查询数据
@@ -459,7 +447,7 @@ val incViewDF = spark.
     load(basePath);
 incViewDF.registerTempTable("hudi_incr_table")
 spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare &gt; 20.0").show()
-</code></pre></div></div>
+</code></pre>
 
 <p>这将提供在开始时间提交之后发生的所有更改,其中包含票价大于20.0的过滤器。关于此功能的独特之处在于,它现在使您可以在批量数据上创作流式管道。</p>
 
@@ -467,7 +455,7 @@ spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hu
 
 <p>让我们看一下如何查询特定时间的数据。可以通过将结束时间指向特定的提交时间,将开始时间指向”000”(表示最早的提交时间)来表示特定时间。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val beginTime = "000" // Represents all commits &gt; this time.
+<pre><code class="language-Java">val beginTime = "000" // Represents all commits &gt; this time.
 val endTime = commits(commits.length - 2) // commit time we are interested in
 
 // 增量查询数据
@@ -478,10 +466,14 @@ val incViewDF = spark.read.format("org.apache.hudi").
     load(basePath);
 incViewDF.registerTempTable("hudi_incr_table")
 spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare &gt; 20.0").show()
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="从这开始下一步">从这开始下一步?</h2>
 
+<p>您也可以通过<a href="https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi">自己构建hudi</a>来快速入门,
+并在spark-shell命令中使用<code class="highlighter-rouge">--jars &lt;path to hudi_code&gt;/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar</code>,
+而不是<code class="highlighter-rouge">--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating</code>。</p>
+
 <p>这里我们使用Spark演示了Hudi的功能。但是,Hudi可以支持多种存储类型/视图,并且可以从Hive,Spark,Presto等查询引擎中查询Hudi数据集。
 我们制作了一个基于Docker设置、所有依赖系统都在本地运行的<a href="https://www.youtube.com/watch?v=VhNgUsxdrD0">演示视频</a>,
 我们建议您复制相同的设置然后按照<a href="docker_demo.html">这里</a>的步骤自己运行这个演示。
@@ -490,14 +482,6 @@ spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hu
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_quickstart.html" class="btn btn-default navbar-btn cursorNorm" role="button">quickstart</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/cn/s3_hoodie.html b/content/cn/s3_hoodie.html
index cd9fe1e..b8aa890 100644
--- a/content/cn/s3_hoodie.html
+++ b/content/cn/s3_hoodie.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -357,48 +357,48 @@
 
 <p>Alternatively, add the required configs in your core-site.xml from where Hudi can fetch them. Replace the <code class="highlighter-rouge">fs.defaultFS</code> with your S3 bucket name and Hudi should be able to read/write from the bucket.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  &lt;property&gt;
-      &lt;name&gt;fs.defaultFS&lt;/name&gt;
-      &lt;value&gt;s3://ysharma&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-      &lt;name&gt;fs.s3.impl&lt;/name&gt;
-      &lt;value&gt;org.apache.hadoop.fs.s3native.NativeS3FileSystem&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-      &lt;name&gt;fs.s3.awsAccessKeyId&lt;/name&gt;
-      &lt;value&gt;AWS_KEY&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-       &lt;name&gt;fs.s3.awsSecretAccessKey&lt;/name&gt;
-       &lt;value&gt;AWS_SECRET&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-       &lt;name&gt;fs.s3n.awsAccessKeyId&lt;/name&gt;
-       &lt;value&gt;AWS_KEY&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-       &lt;name&gt;fs.s3n.awsSecretAccessKey&lt;/name&gt;
-       &lt;value&gt;AWS_SECRET&lt;/value&gt;
-  &lt;/property&gt;
+<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.defaultFS<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>s3://ysharma<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.s3.impl<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>org.apache.hadoop.fs.s3native.NativeS3FileSystem<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.s3.awsAccessKeyId<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>AWS_KEY<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3.awsSecretAccessKey<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_SECRET<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3n.awsAccessKeyId<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_KEY<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3n.awsSecretAccessKey<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_SECRET<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
 </code></pre></div></div>
 
 <p>Utilities such as hudi-cli or deltastreamer tool, can pick up s3 creds via environmental variable prefixed with <code class="highlighter-rouge">HOODIE_ENV_</code>. For e.g below is a bash snippet to setup
 such variables and then have cli be able to work on datasets stored in s3</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key=$accessKey
+<pre><code class="language-Java">export HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key=$accessKey
 export HOODIE_ENV_fs_DOT_s3a_DOT_secret_DOT_key=$secretKey
 export HOODIE_ENV_fs_DOT_s3_DOT_awsAccessKeyId=$accessKey
 export HOODIE_ENV_fs_DOT_s3_DOT_awsSecretAccessKey=$secretKey
 export HOODIE_ENV_fs_DOT_s3n_DOT_awsAccessKeyId=$accessKey
 export HOODIE_ENV_fs_DOT_s3n_DOT_awsSecretAccessKey=$secretKey
 export HOODIE_ENV_fs_DOT_s3n_DOT_impl=org.apache.hadoop.fs.s3a.S3AFileSystem
-</code></pre></div></div>
+</code></pre>
 
 <h3 id="aws-libs">AWS Libs</h3>
 
diff --git a/content/cn/use_cases.html b/content/cn/use_cases.html
index 5a7e49d..f0c8472 100644
--- a/content/cn/use_cases.html
+++ b/content/cn/use_cases.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/cn/writing_data.html b/content/cn/writing_data.html
index b0f70bd..af76ee7 100644
--- a/content/cn/writing_data.html
+++ b/content/cn/writing_data.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -376,7 +376,7 @@
 
 <p>命令行选项更详细地描述了这些功能:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` --help
+<pre><code class="language-Java">[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` --help
 Usage: &lt;main class&gt; [options]
   Options:
     --commit-on-errors
@@ -445,7 +445,7 @@ Usage: &lt;main class&gt; [options]
       schema) before writing. Default : Not set. E:g -
       org.apache.hudi.utilities.transform.SqlQueryBasedTransformer (which
       allows a SQL query template to be passed as a transformation function)
-</code></pre></div></div>
+</code></pre>
 
 <p>该工具采用层次结构组成的属性文件,并具有可插拔的接口,用于提取数据、生成密钥和提供模式。
 从Kafka和DFS摄取数据的示例配置在这里:<code class="highlighter-rouge">hudi-utilities/src/test/resources/delta-streamer-config</code>。</p>
@@ -454,19 +454,19 @@ Usage: &lt;main class&gt; [options]
 (<a href="https://docs.confluent.io/current/ksql/docs/tutorials/generate-custom-test-data.html">impressions.avro</a>,
 由schema-registry代码库提供)</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[confluent-5.0.0]$ bin/ksql-datagen schema=../impressions.avro format=avro topic=impressions key=impressionid
-</code></pre></div></div>
+<pre><code class="language-Java">[confluent-5.0.0]$ bin/ksql-datagen schema=../impressions.avro format=avro topic=impressions key=impressionid
+</code></pre>
 
 <p>然后用如下命令摄取这些数据。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` \
+<pre><code class="language-Java">[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` \
   --props file://${PWD}/hudi-utilities/src/test/resources/delta-streamer-config/kafka-source.properties \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field impresssiontime \
   --target-base-path file:///tmp/hudi-deltastreamer-op --target-table uber.impressions \
   --op BULK_INSERT
-</code></pre></div></div>
+</code></pre>
 
 <p>在某些情况下,您可能需要预先将现有数据集迁移到Hudi。 请参考<a href="migration_guide.html">迁移指南</a>。</p>
 
@@ -476,7 +476,7 @@ Usage: &lt;main class&gt; [options]
 以下是在指定需要使用的字段名称的之后,如何插入更新数据帧的方法,这些字段包括
 <code class="highlighter-rouge">recordKey =&gt; _row_key</code>、<code class="highlighter-rouge">partitionPath =&gt; partition</code>和<code class="highlighter-rouge">precombineKey =&gt; timestamp</code></p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inputDF.write()
+<pre><code class="language-Java">inputDF.write()
        .format("org.apache.hudi")
        .options(clientOpts) // 可以传入任何Hudi客户端参数
        .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
@@ -485,7 +485,7 @@ Usage: &lt;main class&gt; [options]
        .option(HoodieWriteConfig.TABLE_NAME, tableName)
        .mode(SaveMode.Append)
        .save(basePath);
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="与hive同步">与Hive同步</h2>
 
@@ -493,7 +493,7 @@ Usage: &lt;main class&gt; [options]
 如果需要从命令行或在独立的JVM中运行它,Hudi提供了一个<code class="highlighter-rouge">HiveSyncTool</code>,
 在构建了hudi-hive模块之后,可以按以下方式调用它。</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd hudi-hive
+<pre><code class="language-Java">cd hudi-hive
 ./run_sync_tool.sh
  [hudi-hive]$ ./run_sync_tool.sh --help
 Usage: &lt;main class&gt; [options]
@@ -512,7 +512,7 @@ Usage: &lt;main class&gt; [options]
        name of the target table in Hive
   * --user
        Hive username
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="删除数据">删除数据</h2>
 
@@ -526,13 +526,13 @@ Usage: &lt;main class&gt; [options]
  Hudi附带了一个内置的<code class="highlighter-rouge">org.apache.hudi.EmptyHoodieRecordPayload</code>类,它就是实现了这一功能。</li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> deleteDF // 仅包含要删除的记录的数据帧
+<pre><code class="language-Java"> deleteDF // 仅包含要删除的记录的数据帧
    .write().format("org.apache.hudi")
    .option(...) // 根据设置需要添加HUDI参数,例如记录键、分区路径和其他参数
    // 指定record_key,partition_key,precombine_fieldkey和常规参数
    .option(DataSourceWriteOptions.PAYLOAD_CLASS_OPT_KEY, "org.apache.hudi.EmptyHoodieRecordPayload")
  
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="存储管理">存储管理</h2>
 
diff --git a/content/community.html b/content/community.html
index 09feab7..ba083c4 100644
--- a/content/community.html
+++ b/content/community.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/comparison.html b/content/comparison.html
index 879683c..cf8ab96 100644
--- a/content/comparison.html
+++ b/content/comparison.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/concepts.html b/content/concepts.html
index 54bba63..2dd9053 100644
--- a/content/concepts.html
+++ b/content/concepts.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/configurations.html b/content/configurations.html
index f387d92..d704d6f 100644
--- a/content/configurations.html
+++ b/content/configurations.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -397,7 +397,7 @@ The actual datasource level configs are listed below.</p>
 
 <p>Additionally, you can pass down any of the WriteClient level configs directly using <code class="highlighter-rouge">options()</code> or <code class="highlighter-rouge">option(k,v)</code> methods.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inputDF.write()
+<pre><code class="language-Java">inputDF.write()
 .format("org.apache.hudi")
 .options(clientOpts) // any of the Hudi client opts can be passed in as well
 .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
@@ -406,7 +406,7 @@ The actual datasource level configs are listed below.</p>
 .option(HoodieWriteConfig.TABLE_NAME, tableName)
 .mode(SaveMode.Append)
 .save(basePath);
-</code></pre></div></div>
+</code></pre>
 
 <p>Options useful for writing datasets via <code class="highlighter-rouge">write.format.option(...)</code></p>
 
@@ -520,7 +520,7 @@ necessarily correspond to an instant on the timeline. New data written with an
 <p>Jobs programming directly against the RDD level apis can build a <code class="highlighter-rouge">HoodieWriteConfig</code> object and pass it in to the <code class="highlighter-rouge">HoodieWriteClient</code> constructor. 
 HoodieWriteConfig can be built using a builder pattern as below.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
+<pre><code class="language-Java">HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
         .withPath(basePath)
         .forTable(tableName)
         .withSchema(schemaStr)
@@ -529,7 +529,7 @@ HoodieWriteConfig can be built using a builder pattern as below.</p>
         .withIndexConfig(HoodieIndexConfig.newBuilder().withXXX(...).build())
         ...
         .build();
-</code></pre></div></div>
+</code></pre>
 
 <p>Following subsections go over different aspects of write configs, explaining most important configs with their property names, default values.</p>
 
diff --git a/content/contributing.html b/content/contributing.html
index b9fe1f8..f817d0b 100644
--- a/content/contributing.html
+++ b/content/contributing.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/css/lavish-bootstrap.css b/content/css/lavish-bootstrap.css
index a050c9a..6a0f52f 100644
--- a/content/css/lavish-bootstrap.css
+++ b/content/css/lavish-bootstrap.css
@@ -600,7 +600,7 @@ code {
   padding: 2px 4px;
   font-size: 90%;
   color: #444;
-  background-color: #f0f0f0;
+  background-color: #04b3f90d;
   white-space: nowrap;
   border-radius: 4px;
 }
@@ -613,8 +613,8 @@ pre {
   line-height: 1.428571429;
   word-break: break-all;
   word-wrap: break-word;
-  color: #77777a;
-  background-color: #f5f5f5;
+  color: #000000;
+  background-color: #04b3f90d;
   border: 1px solid #cccccc;
   border-radius: 4px;
 }
@@ -3730,6 +3730,7 @@ textarea.input-group-sm > .input-group-btn > .btn {
   }
   .navbar-right {
     float: right !important;
+    background-color: white;
   }
 }
 .navbar-form {
diff --git a/content/docker_demo.html b/content/docker_demo.html
index b4f4ec5..b6bb27c 100644
--- a/content/docker_demo.html
+++ b/content/docker_demo.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -354,7 +354,7 @@ data infrastructure is brought up in a local docker cluster within your computer
   <li>/etc/hosts : The demo references many services running in container by the hostname. Add the following settings to /etc/hosts</li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>   127.0.0.1 adhoc-1
+<pre><code class="language-Java">   127.0.0.1 adhoc-1
    127.0.0.1 adhoc-2
    127.0.0.1 namenode
    127.0.0.1 datanode1
@@ -363,7 +363,7 @@ data infrastructure is brought up in a local docker cluster within your computer
    127.0.0.1 kafkabroker
    127.0.0.1 sparkmaster
    127.0.0.1 zookeeper
-</code></pre></div></div>
+</code></pre>
 
 <p>Also, this has not been tested on some environments like Docker on Windows.</p>
 
@@ -372,16 +372,16 @@ data infrastructure is brought up in a local docker cluster within your computer
 <h4 id="build-hudi">Build Hudi</h4>
 
 <p>The first step is to build hudi</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd &lt;HUDI_WORKSPACE&gt;
+<pre><code class="language-Java">cd &lt;HUDI_WORKSPACE&gt;
 mvn package -DskipTests
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="bringing-up-demo-cluster">Bringing up Demo Cluster</h4>
 
 <p>The next step is to run the docker compose script and setup configs for bringing up the cluster.
 This should pull the docker images from docker hub and setup docker cluster.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd docker
+<pre><code class="language-Java">cd docker
 ./setup_demo.sh
 ....
 ....
@@ -410,8 +410,8 @@ Creating spark-worker-1            ... done
 Copying spark default config and setting up configs
 Copying spark default config and setting up configs
 Copying spark default config and setting up configs
-varadarb-C02SG7Q3G8WP:docker varadarb$ docker ps
-</code></pre></div></div>
+$ docker ps
+</code></pre>
 
 <p>At this point, the docker cluster will be up and running. The demo cluster brings up the following services</p>
 
@@ -435,12 +435,10 @@ The batches are windowed intentionally so that the second batch contains updates
 
 <h4 id="step-1--publish-the-first-batch-to-kafka">Step 1 : Publish the first batch to Kafka</h4>
 
-<p>Upload the first batch to Kafka topic ‘stock ticks’</p>
+<p>Upload the first batch to Kafka topic ‘stock ticks’ <code class="highlighter-rouge">cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P</code></p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P
-
-To check if the new topic shows up, use
-kafkacat -b kafkabroker -L -J | jq .
+<p>To check if the new topic shows up, use</p>
+<pre><code class="language-Java">kafkacat -b kafkabroker -L -J | jq .
 {
   "originating_broker": {
     "id": 1001,
@@ -478,7 +476,7 @@ kafkacat -b kafkabroker -L -J | jq .
   ]
 }
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-2-incrementally-ingest-data-from-kafka-topic">Step 2: Incrementally ingest data from Kafka topic</h4>
 
@@ -487,29 +485,21 @@ pull changes and apply to Hudi dataset using upsert/insert primitives. Here, we
 json data from kafka topic and ingest to both COW and MOR tables we initialized in the previous step. This tool
 automatically initializes the datasets in the file-system if they do not exist yet.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 
 # Run the following spark-submit command to execute the delta-streamer and ingest to stock_ticks_cow dataset in HDFS
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type COPY_ON_WRITE --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_cow --target-table stock_ticks_cow --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
-....
-....
-2018-09-24 22:20:00 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
-2018-09-24 22:20:00 INFO  SparkContext:54 - Successfully stopped SparkContext
-
 
 
 # Run the following spark-submit command to execute the delta-streamer and ingest to stock_ticks_mor dataset in HDFS
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type MERGE_ON_READ --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_mor --target-table stock_ticks_mor --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider --disable-compaction
-....
-2018-09-24 22:22:01 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
-2018-09-24 22:22:01 INFO  SparkContext:54 - Successfully stopped SparkContext
-....
+
 
 # As part of the setup (Look at setup_demo.sh), the configs needed for DeltaStreamer is uploaded to HDFS. The configs
 # contain mostly Kafa connectivity settings, the avro-schema to be used for ingesting along with key and partitioning fields.
 
 exit
-</code></pre></div></div>
+</code></pre>
 
 <p>You can use HDFS web-browser to look at the datasets
 <code class="highlighter-rouge">http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_cow</code>.</p>
@@ -525,7 +515,7 @@ file under .hoodie which signals a successful commit.</p>
 <p>At this step, the datasets are available in HDFS. We need to sync with Hive to create new Hive tables and add partitions
 inorder to run Hive queries against those datasets.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 
 # THis command takes in HIveServer URL and COW Hudi Dataset location in HDFS and sync the HDFS state to Hive
 /var/hoodie/ws/hudi-hive/run_sync_tool.sh  --jdbc-url jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by dt --base-path /user/hive/warehouse/stock_ticks_cow --database default --table stock_ticks_cow
@@ -541,7 +531,7 @@ inorder to run Hive queries against those datasets.</p>
 2018-09-24 22:23:09,559 INFO  [main] hive.HiveSyncTool (HiveSyncTool.java:syncHoodieTable(112)) - Sync complete for stock_ticks_mor_rt
 ....
 exit
-</code></pre></div></div>
+</code></pre>
 <p>After executing the above command, you will notice</p>
 
 <ol>
@@ -556,7 +546,7 @@ provides the ReadOptimized view for the Hudi dataset and the later provides the
 (for both COW and MOR dataset)and realtime views (for MOR dataset)give the same value “10:29 a.m” as Hudi create a
 parquet file for the first batch of data.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 # List Tables
 0: jdbc:hive2://hiveserver:10000&gt; show tables;
@@ -652,13 +642,13 @@ WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the futu
 
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-4-b-run-spark-sql-queries">Step 4 (b): Run Spark-SQL Queries</h4>
 <p>Hudi support Spark as query processor just like Hive. Here are the same hive queries
 running in spark-sql</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --master local[2] --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 ...
 
@@ -749,13 +739,13 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 |20180924222155     |GOOG  |2018-08-31 10:29:00|3391  |1230.1899|1230.085|
 +-------------------+------+-------------------+------+---------+--------+
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-4-c-run-presto-queries">Step 4 (c): Run Presto Queries</h4>
 
 <p>Here are the Presto queries for similar Hive and Spark queries. Currently, Hudi does not support Presto queries on realtime views.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
+<pre><code class="language-Java">docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
 presto&gt; show catalogs;
   Catalog
 -----------
@@ -839,14 +829,14 @@ Splits: 17 total, 17 done (100.00%)
 0:02 [197 rows, 613B] [92 rows/s, 286B/s]
 
 presto:default&gt; exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-5-upload-second-batch-to-kafka-and-run-deltastreamer-to-ingest">Step 5: Upload second batch to Kafka and run DeltaStreamer to ingest</h4>
 
 <p>Upload the second batch of data and ingest this batch using delta-streamer. As this batch does not bring in any new
 partitions, there is no need to run hive-sync.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat docker/demo/data/batch_2.json | kafkacat -b kafkabroker -t stock_ticks -P
+<pre><code class="language-Java">cat docker/demo/data/batch_2.json | kafkacat -b kafkabroker -t stock_ticks -P
 
 # Within Docker container, run the ingestion command
 docker exec -it adhoc-2 /bin/bash
@@ -859,7 +849,7 @@ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
 spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type MERGE_ON_READ --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts  --target-base-path /user/hive/warehouse/stock_ticks_mor --target-table stock_ticks_mor --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider --disable-compaction
 
 exit
-</code></pre></div></div>
+</code></pre>
 
 <p>With the Copy-On-Write table, the second ingestion by DeltaStreamer resulted in a new version of the Parquet file being created.
 See <code class="highlighter-rouge">http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_cow/2018/08/31</code></p>
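 
 <p>If you want to double-check this from the command line instead of the NameNode UI, a minimal sketch is below. This is not part of the original demo script; it assumes the adhoc containers have the HDFS client configured (as the rest of this demo relies on), and the exact file names and commit timestamps will differ on your run.</p>
 
 <pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 # list the COW partition to see the newly written parquet file version
 hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/2018/08/31
 exit
 </code></pre>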
@@ -877,7 +867,7 @@ This is the time, when ReadOptimized and Realtime views will provide different r
 return “10:29 am” as it will only read from the Parquet file. The Realtime View will do an on-the-fly merge and return the
 latest committed data, which is “10:59 a.m”.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
 # Copy On Write Table:
@@ -941,13 +931,13 @@ WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the futu
 
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-6b-run-spark-sql-queries">Step 6(b): Run Spark SQL Queries</h4>
 
 <p>Running the same queries in Spark-SQL:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
 # Copy On Write Table:
@@ -1008,13 +998,13 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-6c-run-presto-queries">Step 6(c): Run Presto Queries</h4>
 
 <p>Running the same queries on Presto for ReadOptimized views.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
+<pre><code class="language-Java">docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
 presto&gt; use hive.default;
 USE
 
@@ -1069,7 +1059,7 @@ Splits: 17 total, 17 done (100.00%)
 0:01 [197 rows, 613B] [154 rows/s, 480B/s]
 
 presto:default&gt; exit
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-7--incremental-query-for-copy-on-write-table">Step 7 : Incremental Query for COPY-ON-WRITE Table</h4>
 
@@ -1077,7 +1067,7 @@ presto:default&gt; exit
 
 <p>Let's take the same projection query example:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
 0: jdbc:hive2://hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG';
@@ -1087,7 +1077,7 @@ beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache
 | 20180924064621       | GOOG    | 2018-08-31 09:59:00  | 6330    | 1230.5     | 1230.02   |
 | 20180924065039       | GOOG    | 2018-08-31 10:59:00  | 9021    | 1227.1993  | 1227.215  |
 +----------------------+---------+----------------------+---------+------------+-----------+--+
-</code></pre></div></div>
+</code></pre>
 
 <p>As you can notice from the above queries, there are 2 commits - 20180924064621 and 20180924065039 in timeline order.
 When you follow the steps, you will get different timestamps for the commits. Substitute them
@@ -1100,19 +1090,19 @@ the commit time of the first batch (20180924064621) and run incremental query</p
 <p>Hudi incremental mode provides efficient scanning for incremental queries by filtering out files that do not have any
 candidate rows using hudi-managed metadata.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 0: jdbc:hive2://hiveserver:10000&gt; set hoodie.stock_ticks_cow.consume.mode=INCREMENTAL;
 No rows affected (0.009 seconds)
 0: jdbc:hive2://hiveserver:10000&gt;  set hoodie.stock_ticks_cow.consume.max.commits=3;
 No rows affected (0.009 seconds)
 0: jdbc:hive2://hiveserver:10000&gt; set hoodie.stock_ticks_cow.consume.start.timestamp=20180924064621;
-</code></pre></div></div>
+</code></pre>
 
 <p>With the above setting, file-ids that do not have any updates from the commit 20180924065039 are filtered out without scanning.
 Here is the incremental query:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0: jdbc:hive2://hiveserver:10000&gt;
+<pre><code class="language-Java">0: jdbc:hive2://hiveserver:10000&gt;
 0: jdbc:hive2://hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG' and `_hoodie_commit_time` &gt; '20180924064621';
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 | _hoodie_commit_time  | symbol  |          ts          | volume  |    open    |   close   |
@@ -1121,10 +1111,10 @@ Here is the incremental query :</p>
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 1 row selected (0.83 seconds)
 0: jdbc:hive2://hiveserver:10000&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="incremental-query-with-spark-sql">Incremental Query with Spark SQL:</h5>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 Welcome to
       ____              __
@@ -1157,14 +1147,14 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 | 20180924065039       | GOOG    | 2018-08-31 10:59:00  | 9021    | 1227.1993  | 1227.215  |
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-8-schedule-and-run-compaction-for-merge-on-read-dataset">Step 8: Schedule and Run Compaction for Merge-On-Read dataset</h4>
 
 <p>Let's schedule and run a compaction to create a new version of the columnar file so that read-optimized readers will see fresher data.
 Again, you can use the Hudi CLI to manually schedule and run the compaction.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 root@adhoc-1:/opt#   /var/hoodie/ws/hudi-cli/hudi-cli.sh
 ============================================
 *                                          *
@@ -1247,7 +1237,7 @@ hoodie:stock_ticks-&gt;compactions show all
     |==================================================================|
     | 20180924070031         | COMPLETED| 1                            |
 
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="step-9-run-hive-queries-including-incremental-queries">Step 9: Run Hive Queries including incremental queries</h4>
 
@@ -1256,7 +1246,7 @@ Lets also run the incremental query for MOR table.
 From looking at the below query output, it will be clear that the first commit time for the MOR table is 20180924064636
 and the second commit time is 20180924070031.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-2 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-2 /bin/bash
 beeline -u jdbc:hive2://hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false
 
 # Read Optimized View
@@ -1312,11 +1302,11 @@ No rows affected (0.013 seconds)
 +----------------------+---------+----------------------+---------+------------+-----------+--+
 exit
 exit
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="step-10-read-optimized-and-realtime-views-for-mor-with-spark-sql-after-compaction">Step 10: Read Optimized and Realtime Views for MOR with Spark-SQL after compaction</h5>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it adhoc-1 /bin/bash
+<pre><code class="language-Java">docker exec -it adhoc-1 /bin/bash
 bash-4.4# $SPARK_INSTALL/bin/spark-shell --jars $HUDI_SPARK_BUNDLE --driver-class-path $HADOOP_CONF_DIR --conf spark.sql.hive.convertMetastoreParquet=false --deploy-mode client  --driver-memory 1G --master local[2] --executor-memory 3G --num-executors 1  --packages com.databricks:spark-avro_2.11:4.0.0
 
 # Read Optimized View
@@ -1351,11 +1341,11 @@ scala&gt; spark.sql("select `_hoodie_commit_time`, symbol, ts, volume, open, clo
 | 20180924064636       | GOOG    | 2018-08-31 09:59:00  | 6330    | 1230.5     | 1230.02   |
 | 20180924070031       | GOOG    | 2018-08-31 10:59:00  | 9021    | 1227.1993  | 1227.215  |
 +----------------------+---------+----------------------+---------+------------+-----------+--+
-</code></pre></div></div>
+</code></pre>
 
 <h5 id="step-11--presto-queries-over-read-optimized-view-on-mor-dataset-after-compaction">Step 11:  Presto queries over Read Optimized View on MOR dataset after compaction</h5>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
+<pre><code class="language-Java">docker exec -it presto-worker-1 presto --server presto-coordinator-1:8090
 presto&gt; use hive.default;
 USE
 
@@ -1383,28 +1373,28 @@ Splits: 17 total, 17 done (100.00%)
 
 presto:default&gt;
 
-</code></pre></div></div>
+</code></pre>
 
 <p>This brings the demo to an end.</p>
 
 <h2 id="testing-hudi-in-local-docker-environment">Testing Hudi in Local Docker environment</h2>
 
 <p>You can bring up a Hadoop Docker environment containing Hadoop, Hive and Spark services with support for Hudi.</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mvn pre-integration-test -DskipTests
-</code></pre></div></div>
+<pre><code class="language-Java">$ mvn pre-integration-test -DskipTests
+</code></pre>
 <p>The above command builds Docker images for all the services with the
 current Hudi source installed at /var/hoodie/ws and also brings up the services using a compose file. We
 currently use Hadoop (v2.8.4), Hive (v2.3.3) and Spark (v2.3.1) in the Docker images.</p>
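 
 <p>Once the command finishes, you can sanity-check that the demo containers are up; for example, by listing the running containers. The container names mentioned below are the ones used elsewhere in this demo, and the exact list may vary with your Hudi version.</p>
 
 <pre><code class="language-Java"># list the names of the running demo containers
 docker ps --format "{{.Names}}"
 # expect entries such as adhoc-1, adhoc-2, hiveserver, namenode, presto-coordinator-1, ...
 </code></pre>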
 
 <p>To bring down the containers:</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd hudi-integ-test
+<pre><code class="language-Java">$ cd hudi-integ-test
 $ mvn docker-compose:down
-</code></pre></div></div>
+</code></pre>
 
 <p>If you want to bring up the Docker containers, use:</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd hudi-integ-test
+<pre><code class="language-Java">$ cd hudi-integ-test
 $  mvn docker-compose:up -DdetachedMode=true
-</code></pre></div></div>
+</code></pre>
 
 <p>Hudi is a library that operates in a broader data analytics/ingestion environment
 involving Hadoop, Hive and Spark. Interoperability with all these systems is a key objective for us. We are
@@ -1430,7 +1420,7 @@ run the script
 
 <p>Here are the commands:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd docker
+<pre><code class="language-Java">cd docker
 ./build_local_docker_images.sh
 .....
 
@@ -1465,19 +1455,11 @@ run the script
 [INFO] Finished at: 2018-09-10T17:47:37-07:00
 [INFO] Final Memory: 236M/1848M
 [INFO] ------------------------------------------------------------------------
-</code></pre></div></div>
+</code></pre>
 
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        
-        
-        
-        
     </div>
 
     
diff --git a/content/events/2016-12-30-strata-talk-2017.html b/content/events/2016-12-30-strata-talk-2017.html
index c0a9c04..9c397ec 100644
--- a/content/events/2016-12-30-strata-talk-2017.html
+++ b/content/events/2016-12-30-strata-talk-2017.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -369,14 +369,6 @@ Catch our talk <strong>“Incremental Processing on Hadoop At Uber”</strong></
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_news.html" class="btn btn-default navbar-btn cursorNorm" role="button">news</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/events/2019-01-18-asf-incubation.html b/content/events/2019-01-18-asf-incubation.html
index d32b62e..fc895ec 100644
--- a/content/events/2019-01-18-asf-incubation.html
+++ b/content/events/2019-01-18-asf-incubation.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -368,14 +368,6 @@ $('#toc').on('click', 'a', function() {
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_news.html" class="btn btn-default navbar-btn cursorNorm" role="button">news</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/feed.xml b/content/feed.xml
index 8c74790..68b0748 100644
--- a/content/feed.xml
+++ b/content/feed.xml
@@ -3,10 +3,10 @@
     <channel>
         <title></title>
         <description>Apache Hudi (pronounced “Hoodie”) provides upserts and incremental processing capaibilities on Big Data</description>
-        <link>http://localhost:4000/</link>
-        <atom:link href="http://localhost:4000/feed.xml" rel="self" type="application/rss+xml"/>
-        <pubDate>Thu, 24 Oct 2019 22:27:46 -0700</pubDate>
-        <lastBuildDate>Thu, 24 Oct 2019 22:27:46 -0700</lastBuildDate>
+        <link>http://0.0.0.0:4000/</link>
+        <atom:link href="http://0.0.0.0:4000/feed.xml" rel="self" type="application/rss+xml"/>
+        <pubDate>Thu, 14 Nov 2019 14:24:54 +0000</pubDate>
+        <lastBuildDate>Thu, 14 Nov 2019 14:24:54 +0000</lastBuildDate>
         <generator>Jekyll v3.7.2</generator>
         
     </channel>
diff --git a/content/gcs_hoodie.html b/content/gcs_hoodie.html
index 5330cf1..2bb685b 100644
--- a/content/gcs_hoodie.html
+++ b/content/gcs_hoodie.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/index.html b/content/index.html
index 59f5957..5809814 100644
--- a/content/index.html
+++ b/content/index.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -383,14 +383,6 @@ $('#toc').on('click', 'a', function() {
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_getting_started.html" class="btn btn-default navbar-btn cursorNorm" role="button">getting_started</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/js/mydoc_scroll.html b/content/js/mydoc_scroll.html
index c677591..2b68d87 100644
--- a/content/js/mydoc_scroll.html
+++ b/content/js/mydoc_scroll.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -597,12 +597,6 @@ $('#small-box-links').localScroll({
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        
-        
     </div>
 
     
diff --git a/content/migration_guide.html b/content/migration_guide.html
index e55bd7a4..1c71b69 100644
--- a/content/migration_guide.html
+++ b/content/migration_guide.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -373,17 +373,19 @@ Take this approach if your dataset is an append only type of dataset and you do
 This tool essentially starts a Spark job to read the existing parquet dataset and converts it into a Hudi-managed dataset by re-writing all the data.</p>
 
 <h4 id="option-2">Option 2</h4>
-<p>For huge datasets, this could be as simple as : for partition in [list of partitions in source dataset] {
-        val inputDF = spark.read.format(“any_input_format”).load(“partition_path”)
-        inputDF.write.format(“org.apache.hudi”).option()….save(“basePath”)
-        }</p>
+<p>For huge datasets, this could be as simple as:</p>
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">partition</span> <span class="n">in</span> <span class="o">[</span><span class="n">list</span> <span class="n">of</span> <span class="n">partitions</span> <span class="n">in</span> <span class="n">source</span> <span class="n">dataset</span><span class="o">]</span> <span class="o">{</span>
+        <span class="n">val</span> <span class="n">inputDF</span> <span class="o">=</span> <span class="n">spark</span><span class="o">.</span><span class="na">read</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"any_input_format"</span><span class="o">).</span><span class="na">load</span><span class="o">(</span><span class="s">"partition_path"</span><span class="o">)</span>
+        <span class="n">inputDF</span><span class="o">.</span><span class="na">write</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span><span class="na">option</span><span class="o">()....</span><span class="na">save</span><span class="o">(</span><span class="s">"basePath"</span><span class="o">)</span>
+<span class="o">}</span>
+</code></pre></div></div>
 
 <h4 id="option-3">Option 3</h4>
 <p>Write your own custom logic of how to load an existing dataset into a Hudi managed one. Please read about the RDD API
- <a href="quickstart.html">here</a>.</p>
+ <a href="quickstart.html">here</a>. Alternatively, you can use the HDFSParquetImporter tool. Once Hudi has been built via <code class="highlighter-rouge">mvn clean install -DskipTests</code>, the shell can be
+fired up via <code class="highlighter-rouge">cd hudi-cli &amp;&amp; ./hudi-cli.sh</code>.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Using the HDFSParquetImporter Tool. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
-fired by via `cd hudi-cli &amp;&amp; ./hudi-cli.sh`.
+<pre><code class="language-Java">
 
 hudi-&gt;hdfsparquetimport
         --upsert false
@@ -399,7 +401,7 @@ hudi-&gt;hdfsparquetimport
         --format parquet
         --sparkMemory 6g
         --retry 2
-</code></pre></div></div>
+</code></pre>
 
 
     <div class="tags">
diff --git a/content/news.html b/content/news.html
index b1862d7..79b4dcc 100644
--- a/content/news.html
+++ b/content/news.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -262,39 +262,12 @@
         
             
             
-                        
-    <h2><a class="post-link" href="/events/2016-12-30-strata-talk-2017.html">Connect with us at Strata San Jose March 2017</a></h2>
-        <span class="post-meta">Dec 30, 2016 /
-            
-
-                <a href="tag_news.html">news</a>
-
-                </span>
-        <p> We will be presenting Hudi &amp; general concepts around how incremental processing works at Uber.
-Catch our talk “Incremental Processing on Hadoop At Uber”
-
- </p>
-
-            
         
             
             
         
             
             
-                        
-    <h2><a class="post-link" href="/events/2019-01-18-asf-incubation.html">Hudi entered Apache Incubator</a></h2>
-        <span class="post-meta">Jan 18, 2019 /
-            
-
-                <a href="tag_news.html">news</a>
-
-                </span>
-        <p> In the coming weeks, we will be moving in our new home on the Apache Incubator.
-
- </p>
-
-            
         
             
             
diff --git a/content/news_archive.html b/content/news_archive.html
index 5d8a286..8be18b2 100644
--- a/content/news_archive.html
+++ b/content/news_archive.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -347,7 +347,6 @@
     <section id="archive">
         <h3>This year's posts</h3>
         
-        &lt;/ul&gt;
     </section>
 
 
diff --git a/content/performance.html b/content/performance.html
index 0cfb590..4386c78 100644
--- a/content/performance.html
+++ b/content/performance.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/powered_by.html b/content/powered_by.html
index 7448322..6b6063c 100644
--- a/content/powered_by.html
+++ b/content/powered_by.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/privacy.html b/content/privacy.html
index 473cdd6..b4f5e48 100644
--- a/content/privacy.html
+++ b/content/privacy.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/querying_data.html b/content/querying_data.html
index 00fc4a0..429e59a 100644
--- a/content/querying_data.html
+++ b/content/querying_data.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -493,37 +493,37 @@ separated) and calls InputFormat.listStatus() only once with all those partition
 <p>To read the RO table as a Hive table using SparkSQL, simply push a path filter into the sparkContext as follows. 
 This method retains Spark's built-in optimizations for reading Parquet files, like vectorized reading, on Hudi tables.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spark.sparkContext.hadoopConfiguration.setClass("mapreduce.input.pathFilter.class", classOf[org.apache.hudi.hadoop.HoodieROTablePathFilter], classOf[org.apache.hadoop.fs.PathFilter]);
-</code></pre></div></div>
+<pre><code class="language-Scala">spark.sparkContext.hadoopConfiguration.setClass("mapreduce.input.pathFilter.class", classOf[org.apache.hudi.hadoop.HoodieROTablePathFilter], classOf[org.apache.hadoop.fs.PathFilter]);
+</code></pre>
 
 <p>If you prefer to glob paths on DFS via the datasource, you can simply do something like below to get a Spark dataframe to work with.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Dataset&lt;Row&gt; hoodieROViewDF = spark.read().format("org.apache.hudi")
+<pre><code class="language-Java">Dataset&lt;Row&gt; hoodieROViewDF = spark.read().format("org.apache.hudi")
 // pass any path glob, can include hudi &amp; non-hudi datasets
 .load("/glob/path/pattern");
-</code></pre></div></div>
+</code></pre>
 
 <h3 id="spark-rt-view">Real time table</h3>
 <p>Currently, the real time table can only be queried as a Hive table in Spark. In order to do this, set <code class="highlighter-rouge">spark.sql.hive.convertMetastoreParquet=false</code>, forcing Spark to fall back 
 to using the Hive Serde to read the data (planning/execution is still handled by Spark).</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ spark-shell --jars hudi-spark-bundle-x.y.z-SNAPSHOT.jar --driver-class-path /etc/hive/conf  --packages com.databricks:spark-avro_2.11:4.0.0 --conf spark.sql.hive.convertMetastoreParquet=false --num-executors 10 --driver-memory 7g --executor-memory 2g  --master yarn-client
+<pre><code class="language-Java">$ spark-shell --jars hudi-spark-bundle-x.y.z-SNAPSHOT.jar --driver-class-path /etc/hive/conf  --packages com.databricks:spark-avro_2.11:4.0.0 --conf spark.sql.hive.convertMetastoreParquet=false --num-executors 10 --driver-memory 7g --executor-memory 2g  --master yarn-client
 
 scala&gt; sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'").show()
-</code></pre></div></div>
+</code></pre>
 
 <h3 id="spark-incr-pull">Incremental Pulling</h3>
 <p>The <code class="highlighter-rouge">hudi-spark</code> module offers the DataSource API, a more elegant way to pull data from a Hudi dataset and process it via Spark.
 A sample incremental pull, which obtains all records written since <code class="highlighter-rouge">beginInstantTime</code>, looks like below.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Dataset&lt;Row&gt; hoodieIncViewDF = spark.read()
+<pre><code class="language-Java"> Dataset&lt;Row&gt; hoodieIncViewDF = spark.read()
      .format("org.apache.hudi")
      .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY(),
              DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL())
      .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(),
             &lt;beginInstantTime&gt;)
      .load(tablePath); // For incremental view, pass in the root/base path of dataset
-</code></pre></div></div>
+</code></pre>
 
 <p>Please refer to the <a href="configurations.html#spark-datasource">configurations</a> section to view all datasource options.</p>
 
diff --git a/content/quickstart.html b/content/quickstart.html
index 599bb04..c3a252e 100644
--- a/content/quickstart.html
+++ b/content/quickstart.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -346,32 +346,18 @@ code snippets that allows you to insert and update a Hudi dataset of default sto
 <a href="https://hudi.apache.org/concepts.html#copy-on-write-storage">Copy on Write</a>. 
 After each write operation we will also show how to read the data, both as a snapshot and incrementally.</p>
 
-<h2 id="build-hudi-spark-bundle-jar">Build Hudi spark bundle jar</h2>
-
-<p>Hudi requires Java 8 to be installed on a *nix system. Check out <a href="https://github.com/apache/incubator-hudi">code</a> and 
-normally build the maven project, from command line:</p>
-
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># checkout and build
-git clone https://github.com/apache/incubator-hudi.git &amp;&amp; cd incubator-hudi
-mvn clean install -DskipTests -DskipITs
-
-# Export the location of hudi-spark-bundle for later 
-mkdir -p /tmp/hudi &amp;&amp; cp packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar  /tmp/hudi/hudi-spark-bundle.jar 
-export HUDI_SPARK_BUNDLE_PATH=/tmp/hudi/hudi-spark-bundle.jar
-</code></pre></div></div>
-
 <h2 id="setup-spark-shell">Setup spark-shell</h2>
 <p>Hudi works with Spark-2.x versions. You can follow the instructions <a href="https://spark.apache.org/downloads.html">here</a> for 
 setting up Spark.</p>
 
 <p>From the extracted directory run spark-shell with Hudi as:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/spark-shell --jars $HUDI_SPARK_BUNDLE_PATH --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
-</code></pre></div></div>
+<pre><code class="language-Scala">bin/spark-shell --packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
+</code></pre>
 
 <p>Set up the table name, base path and a data generator to generate records for this guide.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.apache.hudi.QuickstartUtils._
+<pre><code class="language-Scala">import org.apache.hudi.QuickstartUtils._
 import scala.collection.JavaConversions._
 import org.apache.spark.sql.SaveMode._
 import org.apache.hudi.DataSourceReadOptions._
@@ -381,7 +367,7 @@ import org.apache.hudi.config.HoodieWriteConfig._
 val tableName = "hudi_cow_table"
 val basePath = "file:///tmp/hudi_cow_table"
 val dataGen = new DataGenerator
-</code></pre></div></div>
+</code></pre>
 
 <p>The <a href="https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java">DataGenerator</a> 
 can generate sample inserts and updates based on the sample trip schema 
@@ -390,7 +376,7 @@ can generate sample inserts and updates based on the the sample trip schema
 <h2 id="inserts">Insert data</h2>
 <p>Generate some new trips, load them into a DataFrame and write the DataFrame into the Hudi dataset as below.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val inserts = convertToStringList(dataGen.generateInserts(10))
+<pre><code class="language-Scala">val inserts = convertToStringList(dataGen.generateInserts(10))
 val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
 df.write.format("org.apache.hudi").
     options(getQuickstartWriteConfigs).
@@ -400,7 +386,7 @@ df.write.format("org.apache.hudi").
     option(TABLE_NAME, tableName).
     mode(Overwrite).
     save(basePath);
-</code></pre></div></div>
+</code></pre>
 
 <p><code class="highlighter-rouge">mode(Overwrite)</code> overwrites and recreates the dataset if it already exists.
 You can check the data generated under <code class="highlighter-rouge">/tmp/hudi_cow_table/&lt;region&gt;/&lt;country&gt;/&lt;city&gt;/</code>. We provided a record key 
@@ -414,14 +400,14 @@ Here we are using the default write operation : <code class="highlighter-rouge">
 
 <h2 id="query">Query data</h2>
 <p>Load the data files into a DataFrame.</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val roViewDF = spark.
+<pre><code class="language-Scala">val roViewDF = spark.
     read.
     format("org.apache.hudi").
     load(basePath + "/*/*/*/*")
 roViewDF.registerTempTable("hudi_ro_table")
 spark.sql("select fare, begin_lon, begin_lat, ts from  hudi_ro_table where fare &gt; 20.0").show()
 spark.sql("select _hoodie_commit_time, _hoodie_record_key, _hoodie_partition_path, rider, driver, fare from  hudi_ro_table").show()
-</code></pre></div></div>
+</code></pre>
 <p>This query provides a read optimized view of the ingested data. Since our partition path (<code class="highlighter-rouge">region/country/city</code>) is 3 levels nested 
 from the base path, we've used <code class="highlighter-rouge">load(basePath + "/*/*/*/*")</code>. 
 Refer to <a href="https://hudi.apache.org/concepts.html#storage-types--views">Storage Types and Views</a> for more info on all storage types and views supported.</p>
@@ -430,7 +416,7 @@ Refer to <a href="https://hudi.apache.org/concepts.html#storage-types--views">St
 <p>This is similar to inserting new data. Generate updates to existing trips using the data generator, load them into a DataFrame 
 and write the DataFrame into the Hudi dataset.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val updates = convertToStringList(dataGen.generateUpdates(10))
+<pre><code class="language-Scala">val updates = convertToStringList(dataGen.generateUpdates(10))
 val df = spark.read.json(spark.sparkContext.parallelize(updates, 2));
 df.write.format("org.apache.hudi").
     options(getQuickstartWriteConfigs).
@@ -440,7 +426,7 @@ df.write.format("org.apache.hudi").
     option(TABLE_NAME, tableName).
     mode(Append).
     save(basePath);
-</code></pre></div></div>
+</code></pre>
 
 <p>Notice that the save mode is now <code class="highlighter-rouge">Append</code>. In general, always use append mode unless you are trying to create the dataset for the first time.
 <a href="#query">Querying</a> the data again will now show updated trips. Each write operation generates a new <a href="http://hudi.incubator.apache.org/concepts.html">commit</a> 
@@ -452,7 +438,7 @@ denoted by the timestamp. Look for changes in <code class="highlighter-rouge">_h
 This can be achieved using Hudi’s incremental view and providing a begin time from which changes need to be streamed. 
 We do not need to specify endTime if we want all changes after the given commit (as is the common case).</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime").map(k =&gt; k.getString(0)).take(50)
+<pre><code class="language-Scala">val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime").map(k =&gt; k.getString(0)).take(50)
 val beginTime = commits(commits.length - 2) // commit time we are interested in
 
 // incrementally query data
@@ -464,7 +450,7 @@ val incViewDF = spark.
     load(basePath);
 incViewDF.registerTempTable("hudi_incr_table")
 spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare &gt; 20.0").show()
-</code></pre></div></div>
+</code></pre>
 <p>This will give all changes that happened after the beginTime commit with the filter of fare &gt; 20.0. The unique thing about this
 feature is that it now lets you author streaming pipelines on batch data.</p>
 
@@ -472,7 +458,7 @@ feature is that it now lets you author streaming pipelines on batch data.</p>
 <p>Let's look at how to query data as of a specific time. The specific time can be represented by pointing endTime to a 
 specific commit time and beginTime to “000” (denoting the earliest possible commit time).</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val beginTime = "000" // Represents all commits &gt; this time.
+<pre><code class="language-Scala">val beginTime = "000" // Represents all commits &gt; this time.
 val endTime = commits(commits.length - 2) // commit time we are interested in
 
 //incrementally query data
@@ -483,10 +469,14 @@ val incViewDF = spark.read.format("org.apache.hudi").
     load(basePath);
 incViewDF.registerTempTable("hudi_incr_table")
 spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare &gt; 20.0").show()
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="where-to-go-from-here">Where to go from here?</h2>
-<p>Here, we used Spark to show case the capabilities of Hudi. However, Hudi can support multiple storage types/views and 
+<p>You can also do the quickstart by <a href="https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi">building hudi yourself</a>, 
+and using <code class="highlighter-rouge">--jars &lt;path to hudi_code&gt;/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar</code> in the spark-shell command above
+instead of <code class="highlighter-rouge">--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating</code></p>
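+
+<p>For example, if you built Hudi from source, the spark-shell invocation above would look roughly like the following; the exact bundle version in the jar name depends on your checkout, so treat this as a sketch rather than a copy-paste command.</p>
+
+<pre><code class="language-Scala">bin/spark-shell --jars &lt;path to hudi_code&gt;/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
+</code></pre>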
+
+<p>Also, we used Spark here to showcase the capabilities of Hudi. However, Hudi can support multiple storage types/views and 
 Hudi datasets can be queried from query engines like Hive, Spark, Presto and much more. We have put together a 
 <a href="https://www.youtube.com/watch?v=VhNgUsxdrD0">demo video</a> that showcases all of this on a docker based setup with all 
 dependent systems running locally. We recommend you replicate the same setup and run the demo yourself, by following 
@@ -496,14 +486,6 @@ to Hudi, refer to <a href="migration_guide.html">migration guide</a>.</p>
 
     <div class="tags">
         
-        <b>Tags: </b>
-        
-        
-        
-        <a href="tag_quickstart.html" class="btn btn-default navbar-btn cursorNorm" role="button">quickstart</a>
-        
-        
-        
     </div>
 
     
diff --git a/content/releases.html b/content/releases.html
index 38371ff..5bebd48 100644
--- a/content/releases.html
+++ b/content/releases.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
diff --git a/content/s3_hoodie.html b/content/s3_hoodie.html
index aae5928..d438c99 100644
--- a/content/s3_hoodie.html
+++ b/content/s3_hoodie.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -358,48 +358,48 @@
 
 <p>Alternatively, add the required configs to your core-site.xml, from where Hudi can fetch them. Replace the <code class="highlighter-rouge">fs.defaultFS</code> with your S3 bucket name and Hudi should be able to read/write from the bucket.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  &lt;property&gt;
-      &lt;name&gt;fs.defaultFS&lt;/name&gt;
-      &lt;value&gt;s3://ysharma&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-      &lt;name&gt;fs.s3.impl&lt;/name&gt;
-      &lt;value&gt;org.apache.hadoop.fs.s3native.NativeS3FileSystem&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-      &lt;name&gt;fs.s3.awsAccessKeyId&lt;/name&gt;
-      &lt;value&gt;AWS_KEY&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-       &lt;name&gt;fs.s3.awsSecretAccessKey&lt;/name&gt;
-       &lt;value&gt;AWS_SECRET&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-       &lt;name&gt;fs.s3n.awsAccessKeyId&lt;/name&gt;
-       &lt;value&gt;AWS_KEY&lt;/value&gt;
-  &lt;/property&gt;
-
-  &lt;property&gt;
-       &lt;name&gt;fs.s3n.awsSecretAccessKey&lt;/name&gt;
-       &lt;value&gt;AWS_SECRET&lt;/value&gt;
-  &lt;/property&gt;
+<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.defaultFS<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>s3://ysharma<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.s3.impl<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>org.apache.hadoop.fs.s3native.NativeS3FileSystem<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.s3.awsAccessKeyId<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>AWS_KEY<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3.awsSecretAccessKey<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_SECRET<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3n.awsAccessKeyId<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_KEY<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3n.awsSecretAccessKey<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_SECRET<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
 </code></pre></div></div>
 
 <p>Utilities such as the hudi-cli or the deltastreamer tool can pick up S3 credentials via environment variables prefixed with <code class="highlighter-rouge">HOODIE_ENV_</code>. For example, below is a bash snippet to set up
 such variables and then have the CLI work on datasets stored in S3.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key=$accessKey
+<pre><code class="language-Java">export HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key=$accessKey
 export HOODIE_ENV_fs_DOT_s3a_DOT_secret_DOT_key=$secretKey
 export HOODIE_ENV_fs_DOT_s3_DOT_awsAccessKeyId=$accessKey
 export HOODIE_ENV_fs_DOT_s3_DOT_awsSecretAccessKey=$secretKey
 export HOODIE_ENV_fs_DOT_s3n_DOT_awsAccessKeyId=$accessKey
 export HOODIE_ENV_fs_DOT_s3n_DOT_awsSecretAccessKey=$secretKey
 export HOODIE_ENV_fs_DOT_s3n_DOT_impl=org.apache.hadoop.fs.s3a.S3AFileSystem
-</code></pre></div></div>
+</code></pre>
 
 <h3 id="aws-libs">AWS Libs</h3>
 
diff --git a/content/search.json b/content/search.json
index 606cfe6..3143044 100644
--- a/content/search.json
+++ b/content/search.json
@@ -3,6 +3,7 @@
 
 {
 "title": "Connect with us at Strata San Jose March 2017",
+"tags": "",
 "keywords": "",
 "url": "cnevents2016-12-30-strata-talk-2017.html",
 "summary": ""
@@ -13,6 +14,7 @@
 
 {
 "title": "Connect with us at Strata San Jose March 2017",
+"tags": "",
 "keywords": "",
 "url": "events2016-12-30-strata-talk-2017.html",
 "summary": ""
@@ -23,6 +25,7 @@
 
 {
 "title": "Hudi entered Apache Incubator",
+"tags": "",
 "keywords": "",
 "url": "cnevents2019-01-18-asf-incubation.html",
 "summary": ""
@@ -33,6 +36,7 @@
 
 {
 "title": "Hudi entered Apache Incubator",
+"tags": "",
 "keywords": "",
 "url": "events2019-01-18-asf-incubation.html",
 "summary": ""
@@ -47,6 +51,7 @@
 
 {
 "title": "Administering Hudi Pipelines",
+"tags": "",
 "keywords": "hudi, administration, operation, devops",
 "url": "cnadmin_guide.html",
 "summary": "本节概述了可用于操作Hudi数据集生态系统的工具"
@@ -57,6 +62,7 @@
 
 {
 "title": "Administering Hudi Pipelines",
+"tags": "",
 "keywords": "hudi, administration, operation, devops",
 "url": "admin_guide.html",
 "summary": "This section offers an overview of tools available to operate an ecosystem of Hudi datasets"
@@ -67,6 +73,7 @@
 
 {
 "title": "Community",
+"tags": "",
 "keywords": "hudi, use cases, big data, apache",
 "url": "cncommunity.html",
 "summary": ""
@@ -77,6 +84,7 @@
 
 {
 "title": "Community",
+"tags": "",
 "keywords": "hudi, use cases, big data, apache",
 "url": "community.html",
 "summary": ""
@@ -87,6 +95,7 @@
 
 {
 "title": "Comparison",
+"tags": "",
 "keywords": "apache, hudi, kafka, kudu, hive, hbase, stream processing",
 "url": "cncomparison.html",
 "summary": ""
@@ -97,6 +106,7 @@
 
 {
 "title": "Comparison",
+"tags": "",
 "keywords": "apache, hudi, kafka, kudu, hive, hbase, stream processing",
 "url": "comparison.html",
 "summary": ""
@@ -107,9 +117,10 @@
 
 {
 "title": "Concepts",
+"tags": "",
 "keywords": "hudi, design, storage, views, timeline",
 "url": "cnconcepts.html",
-"summary": "Here we introduce some basic concepts & give a broad technical overview of Hudi"
+"summary": "这里我们将介绍Hudi的一些基本概念并提供关于Hudi的技术概述"
 }
 ,
 
@@ -117,6 +128,7 @@
 
 {
 "title": "Concepts",
+"tags": "",
 "keywords": "hudi, design, storage, views, timeline",
 "url": "concepts.html",
 "summary": "Here we introduce some basic concepts & give a broad technical overview of Hudi"
@@ -126,10 +138,11 @@
 
 
 {
-"title": "Configurations",
+"title": "配置",
+"tags": "",
 "keywords": "garbage collection, hudi, jvm, configs, tuning",
 "url": "cnconfigurations.html",
-"summary": "Here we list all possible configurations and what they mean"
+"summary": "在这里,我们列出了所有可能的配置及其含义。"
 }
 ,
 
@@ -137,6 +150,7 @@
 
 {
 "title": "Configurations",
+"tags": "",
 "keywords": "garbage collection, hudi, jvm, configs, tuning",
 "url": "configurations.html",
 "summary": "Here we list all possible configurations and what they mean"
@@ -147,6 +161,7 @@
 
 {
 "title": "Developer Setup",
+"tags": "",
 "keywords": "hudi, ide, developer, setup",
 "url": "cncontributing.html",
 "summary": ""
@@ -157,6 +172,7 @@
 
 {
 "title": "Developer Setup",
+"tags": "",
 "keywords": "hudi, ide, developer, setup",
 "url": "contributing.html",
 "summary": ""
@@ -167,6 +183,7 @@
 
 {
 "title": "Docker Demo",
+"tags": "",
 "keywords": "hudi, docker, demo",
 "url": "cndocker_demo.html",
 "summary": ""
@@ -177,6 +194,7 @@
 
 {
 "title": "Docker Demo",
+"tags": "",
 "keywords": "hudi, docker, demo",
 "url": "docker_demo.html",
 "summary": ""
@@ -189,6 +207,7 @@
 
 {
 "title": "GCS Filesystem",
+"tags": "",
 "keywords": "hudi, hive, google cloud, storage, spark, presto",
 "url": "cngcs_hoodie.html",
 "summary": "In this page, we go over how to configure hudi with Google Cloud Storage."
@@ -199,6 +218,7 @@
 
 {
 "title": "GCS Filesystem",
+"tags": "",
 "keywords": "hudi, hive, google cloud, storage, spark, presto",
 "url": "gcs_hoodie.html",
 "summary": "In this page, we go over how to configure hudi with Google Cloud Storage."
@@ -209,6 +229,7 @@
 
 {
 "title": "什么是Hudi?",
+"tags": "",
 "keywords": "big data, stream processing, cloud, hdfs, storage, upserts, change capture",
 "url": "cnindex.html",
 "summary": "Hudi为大数据带来流处理,在提供新数据的同时,比传统的批处理效率高出一个数量级。"
@@ -219,6 +240,7 @@
 
 {
 "title": "What is Hudi?",
+"tags": "",
 "keywords": "big data, stream processing, cloud, hdfs, storage, upserts, change capture",
 "url": "index.html",
 "summary": "Hudi brings stream processing to big data, providing fresh data while being an order of magnitude efficient over traditional batch processing."
@@ -229,6 +251,7 @@
 
 {
 "title": "Migration Guide",
+"tags": "",
 "keywords": "hudi, migration, use case",
 "url": "cnmigration_guide.html",
 "summary": "In this page, we will discuss some available tools for migrating your existing dataset into a Hudi dataset"
@@ -239,6 +262,7 @@
 
 {
 "title": "Migration Guide",
+"tags": "",
 "keywords": "hudi, migration, use case",
 "url": "migration_guide.html",
 "summary": "In this page, we will discuss some available tools for migrating your existing dataset into a Hudi dataset"
@@ -249,6 +273,7 @@
 
 {
 "title": "Scroll layout",
+"tags": "",
 "keywords": "json, scrolling, scrollto, jquery plugin",
 "url": "jsmydoc_scroll.html",
 "summary": "This page demonstrates how you the integration of a script called ScrollTo, which is used here to link definitions of a JSON code sample to a list of definitions for that particular term. The scenario here is that the JSON blocks are really long, with extensive nesting and subnesting, which makes it difficult for tables below the JSON to adequately explain the term in a usable way."
@@ -259,6 +284,7 @@
 
 {
 "title": "News",
+"tags": "",
 "keywords": "apache, hudi, news, blog, updates, release notes, announcements",
 "url": "cnnews.html",
 "summary": ""
@@ -269,6 +295,7 @@
 
 {
 "title": "News",
+"tags": "",
 "keywords": "apache, hudi, news, blog, updates, release notes, announcements",
 "url": "news.html",
 "summary": ""
@@ -279,6 +306,7 @@
 
 {
 "title": "News",
+"tags": "",
 "keywords": "news, blog, updates, release notes, announcements",
 "url": "cnnews_archive.html",
 "summary": ""
@@ -289,6 +317,7 @@
 
 {
 "title": "News",
+"tags": "",
 "keywords": "news, blog, updates, release notes, announcements",
 "url": "news_archive.html",
 "summary": ""
@@ -298,7 +327,8 @@
 
 
 {
-"title": "Performance",
+"title": "性能",
+"tags": "",
 "keywords": "hudi, index, storage, compaction, cleaning, implementation",
 "url": "cnperformance.html",
 "summary": ""
@@ -309,6 +339,7 @@
 
 {
 "title": "Performance",
+"tags": "",
 "keywords": "hudi, index, storage, compaction, cleaning, implementation",
 "url": "performance.html",
 "summary": ""
@@ -319,6 +350,7 @@
 
 {
 "title": "Talks &amp; Powered By",
+"tags": "",
 "keywords": "hudi, talks, presentation",
 "url": "cnpowered_by.html",
 "summary": ""
@@ -329,6 +361,7 @@
 
 {
 "title": "Talks &amp; Powered By",
+"tags": "",
 "keywords": "hudi, talks, presentation",
 "url": "powered_by.html",
 "summary": ""
@@ -339,6 +372,7 @@
 
 {
 "title": "Privacy Policy",
+"tags": "",
 "keywords": "hudi, privacy",
 "url": "cnprivacy.html",
 "summary": ""
@@ -349,6 +383,7 @@
 
 {
 "title": "Privacy Policy",
+"tags": "",
 "keywords": "hudi, privacy",
 "url": "privacy.html",
 "summary": ""
@@ -358,10 +393,11 @@
 
 
 {
-"title": "Querying Hudi Datasets",
+"title": "查询 Hudi 数据集",
+"tags": "",
 "keywords": "hudi, hive, spark, sql, presto",
 "url": "cnquerying_data.html",
-"summary": "In this page, we go over how to enable SQL queries on Hudi built tables."
+"summary": "在这一页里,我们介绍了如何在Hudi构建的表上启用SQL查询。"
 }
 ,
 
@@ -369,6 +405,7 @@
 
 {
 "title": "Querying Hudi Datasets",
+"tags": "",
 "keywords": "hudi, hive, spark, sql, presto",
 "url": "querying_data.html",
 "summary": "In this page, we go over how to enable SQL queries on Hudi built tables."
@@ -379,6 +416,7 @@
 
 {
 "title": "Quickstart",
+"tags": "",
 "keywords": "hudi, quickstart",
 "url": "cnquickstart.html",
 "summary": ""
@@ -389,6 +427,7 @@
 
 {
 "title": "Quickstart",
+"tags": "",
 "keywords": "hudi, quickstart",
 "url": "quickstart.html",
 "summary": ""
@@ -399,6 +438,7 @@
 
 {
 "title": "Releases",
+"tags": "",
 "keywords": "apache, hudi, release, data lake, upsert,",
 "url": "releases.html",
 "summary": "Apache Hudi (incubating) Releases Page"
@@ -409,6 +449,7 @@
 
 {
 "title": "S3 Filesystem",
+"tags": "",
 "keywords": "hudi, hive, aws, s3, spark, presto",
 "url": "cns3_hoodie.html",
 "summary": "In this page, we go over how to configure Hudi with S3 filesystem."
@@ -419,6 +460,7 @@
 
 {
 "title": "S3 Filesystem",
+"tags": "",
 "keywords": "hudi, hive, aws, s3, spark, presto",
 "url": "s3_hoodie.html",
 "summary": "In this page, we go over how to configure Hudi with S3 filesystem."
@@ -433,6 +475,7 @@
 
 {
 "title": "Use Cases",
+"tags": "",
 "keywords": "hudi, data ingestion, etl, real time, use cases",
 "url": "cnuse_cases.html",
 "summary": "以下是一些使用Hudi的示例,说明了加快处理速度和提高效率的好处"
@@ -443,6 +486,7 @@
 
 {
 "title": "Use Cases",
+"tags": "",
 "keywords": "hudi, data ingestion, etl, real time, use cases",
 "url": "use_cases.html",
 "summary": "Following are some sample use-cases for Hudi, which illustrate the benefits in terms of faster processing & increased efficiency"
@@ -453,6 +497,7 @@
 
 {
 "title": "写入 Hudi 数据集",
+"tags": "",
 "keywords": "hudi, incremental, batch, stream, processing, Hive, ETL, Spark SQL",
 "url": "cnwriting_data.html",
 "summary": "这一页里,我们将讨论一些可用的工具,这些工具可用于增量摄取和存储数据。"
@@ -463,6 +508,7 @@
 
 {
 "title": "Writing Hudi Datasets",
+"tags": "",
 "keywords": "hudi, incremental, batch, stream, processing, Hive, ETL, Spark SQL",
 "url": "writing_data.html",
 "summary": "In this page, we will discuss some available tools for incrementally ingesting & storing data."
diff --git a/content/sitemap.xml b/content/sitemap.xml
index 003ca1b..3a30e50 100644
--- a/content/sitemap.xml
+++ b/content/sitemap.xml
@@ -6,25 +6,25 @@
   
   
   <url>
-    <loc>http://localhost:4000/cn/events/2016-12-30-strata-talk-2017.html</loc>
+    <loc>http://0.0.0.0:4000/cn/events/2016-12-30-strata-talk-2017.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/events/2016-12-30-strata-talk-2017.html</loc>
+    <loc>http://0.0.0.0:4000/events/2016-12-30-strata-talk-2017.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/events/2019-01-18-asf-incubation.html</loc>
+    <loc>http://0.0.0.0:4000/cn/events/2019-01-18-asf-incubation.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/events/2019-01-18-asf-incubation.html</loc>
+    <loc>http://0.0.0.0:4000/events/2019-01-18-asf-incubation.html</loc>
   </url>
   
   
@@ -34,85 +34,85 @@
   
   
   <url>
-    <loc>http://localhost:4000/cn/admin_guide.html</loc>
+    <loc>http://0.0.0.0:4000/cn/admin_guide.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/admin_guide.html</loc>
+    <loc>http://0.0.0.0:4000/admin_guide.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/community.html</loc>
+    <loc>http://0.0.0.0:4000/cn/community.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/community.html</loc>
+    <loc>http://0.0.0.0:4000/community.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/comparison.html</loc>
+    <loc>http://0.0.0.0:4000/cn/comparison.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/comparison.html</loc>
+    <loc>http://0.0.0.0:4000/comparison.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/concepts.html</loc>
+    <loc>http://0.0.0.0:4000/cn/concepts.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/concepts.html</loc>
+    <loc>http://0.0.0.0:4000/concepts.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/configurations.html</loc>
+    <loc>http://0.0.0.0:4000/cn/configurations.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/configurations.html</loc>
+    <loc>http://0.0.0.0:4000/configurations.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/contributing.html</loc>
+    <loc>http://0.0.0.0:4000/cn/contributing.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/contributing.html</loc>
+    <loc>http://0.0.0.0:4000/contributing.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/docker_demo.html</loc>
+    <loc>http://0.0.0.0:4000/cn/docker_demo.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/docker_demo.html</loc>
+    <loc>http://0.0.0.0:4000/docker_demo.html</loc>
   </url>
   
   
@@ -120,145 +120,145 @@
   
   
   <url>
-    <loc>http://localhost:4000/cn/gcs_hoodie.html</loc>
+    <loc>http://0.0.0.0:4000/cn/gcs_hoodie.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/gcs_hoodie.html</loc>
+    <loc>http://0.0.0.0:4000/gcs_hoodie.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/index.html</loc>
+    <loc>http://0.0.0.0:4000/cn/index.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/index.html</loc>
+    <loc>http://0.0.0.0:4000/index.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/migration_guide.html</loc>
+    <loc>http://0.0.0.0:4000/cn/migration_guide.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/migration_guide.html</loc>
+    <loc>http://0.0.0.0:4000/migration_guide.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/js/mydoc_scroll.html</loc>
+    <loc>http://0.0.0.0:4000/js/mydoc_scroll.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/news.html</loc>
+    <loc>http://0.0.0.0:4000/cn/news.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/news.html</loc>
+    <loc>http://0.0.0.0:4000/news.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/news_archive.html</loc>
+    <loc>http://0.0.0.0:4000/cn/news_archive.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/news_archive.html</loc>
+    <loc>http://0.0.0.0:4000/news_archive.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/performance.html</loc>
+    <loc>http://0.0.0.0:4000/cn/performance.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/performance.html</loc>
+    <loc>http://0.0.0.0:4000/performance.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/powered_by.html</loc>
+    <loc>http://0.0.0.0:4000/cn/powered_by.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/powered_by.html</loc>
+    <loc>http://0.0.0.0:4000/powered_by.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/privacy.html</loc>
+    <loc>http://0.0.0.0:4000/cn/privacy.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/privacy.html</loc>
+    <loc>http://0.0.0.0:4000/privacy.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/querying_data.html</loc>
+    <loc>http://0.0.0.0:4000/cn/querying_data.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/querying_data.html</loc>
+    <loc>http://0.0.0.0:4000/querying_data.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/quickstart.html</loc>
+    <loc>http://0.0.0.0:4000/cn/quickstart.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/quickstart.html</loc>
+    <loc>http://0.0.0.0:4000/quickstart.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/releases.html</loc>
+    <loc>http://0.0.0.0:4000/releases.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/s3_hoodie.html</loc>
+    <loc>http://0.0.0.0:4000/cn/s3_hoodie.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/s3_hoodie.html</loc>
+    <loc>http://0.0.0.0:4000/s3_hoodie.html</loc>
   </url>
   
   
@@ -268,25 +268,25 @@
   
   
   <url>
-    <loc>http://localhost:4000/cn/use_cases.html</loc>
+    <loc>http://0.0.0.0:4000/cn/use_cases.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/use_cases.html</loc>
+    <loc>http://0.0.0.0:4000/use_cases.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/cn/writing_data.html</loc>
+    <loc>http://0.0.0.0:4000/cn/writing_data.html</loc>
   </url>
   
   
   
   <url>
-    <loc>http://localhost:4000/writing_data.html</loc>
+    <loc>http://0.0.0.0:4000/writing_data.html</loc>
   </url>
   
   
diff --git a/content/use_cases.html b/content/use_cases.html
index 3b57f4a..e6ea8b8 100644
--- a/content/use_cases.html
+++ b/content/use_cases.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
diff --git a/content/writing_data.html b/content/writing_data.html
index 1c9ce4d..e25b98c 100644
--- a/content/writing_data.html
+++ b/content/writing_data.html
@@ -46,7 +46,7 @@
 <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
 <![endif]-->
 
-<link rel="alternate" type="application/rss+xml" title="" href="http://localhost:4000feed.xml">
+<link rel="alternate" type="application/rss+xml" title="" href="http://0.0.0.0:4000feed.xml">
 
     <script>
         $(document).ready(function() {
@@ -220,7 +220,7 @@
 
 
 <ul id="mysidebar" class="nav">
-    <li class="sidebarTitle">Latest Version 0.5.0-incubating</li>
+    <li class="sidebarTitle">Version (0.5.0-incubating)</li>
     
     
     
@@ -375,7 +375,7 @@ can be chosen/changed across each commit/deltacommit issued against the dataset.
 
 <p>Command line options describe capabilities in more detail</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` --help
+<pre><code class="language-Java">[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` --help
 Usage: &lt;main class&gt; [options]
   Options:
     --commit-on-errors
@@ -444,26 +444,26 @@ Usage: &lt;main class&gt; [options]
       schema) before writing. Default : Not set. E:g -
       org.apache.hudi.utilities.transform.SqlQueryBasedTransformer (which
       allows a SQL query template to be passed as a transformation function)
-</code></pre></div></div>
+</code></pre>
 
 <p>The tool takes a hierarchically composed property file and has pluggable interfaces for extracting data, key generation and providing schema. Sample configs for ingesting from kafka and dfs are
 provided under <code class="highlighter-rouge">hudi-utilities/src/test/resources/delta-streamer-config</code>.</p>
 
 <p>For example, once you have Confluent Kafka and the Schema Registry up &amp; running, produce some test data using (<a href="https://docs.confluent.io/current/ksql/docs/tutorials/generate-custom-test-data.html">impressions.avro</a>, provided by the schema-registry repo)</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[confluent-5.0.0]$ bin/ksql-datagen schema=../impressions.avro format=avro topic=impressions key=impressionid
-</code></pre></div></div>
+<pre><code class="language-Java">[confluent-5.0.0]$ bin/ksql-datagen schema=../impressions.avro format=avro topic=impressions key=impressionid
+</code></pre>
 
 <p>and then ingest it as follows.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` \
+<pre><code class="language-Java">[hoodie]$ spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-*.jar` \
   --props file://${PWD}/hudi-utilities/src/test/resources/delta-streamer-config/kafka-source.properties \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field impresssiontime \
   --target-base-path file:///tmp/hudi-deltastreamer-op --target-table uber.impressions \
   --op BULK_INSERT
-</code></pre></div></div>
+</code></pre>
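For orientation, below is a rough sketch of what the kafka-source properties file passed via --props might contain. The key names and field names are assumptions based on common Hudi delta-streamer settings and the impressions example, not copied from the shipped sample; the configs under hudi-utilities/src/test/resources/delta-streamer-config remain the authoritative reference.

    # Illustrative sketch only -- key and field names are assumed, check the bundled samples
    include=base.properties
    # record key / partition path fields for the target dataset (placeholder field names)
    hoodie.datasource.write.recordkey.field=impressionid
    hoodie.datasource.write.partitionpath.field=userid
    # where DeltaStreamer fetches the Avro schema, and the Kafka topic to tail
    hoodie.deltastreamer.schemaprovider.registry.url=http://localhost:8081/subjects/impressions-value/versions/latest
    hoodie.deltastreamer.source.kafka.topic=impressions
    # standard Kafka consumer settings
    bootstrap.servers=localhost:9092
    auto.offset.reset=earliest
    schema.registry.url=http://localhost:8081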
 
 <p>In some cases, you may want to migrate your existing dataset into Hudi beforehand. Please refer to <a href="migration_guide.html">migration guide</a>.</p>
 
@@ -473,7 +473,7 @@ provided under <code class="highlighter-rouge">hudi-utilities/src/test/resources
 Following is how we can upsert a dataframe, while specifying the field names that need to be used
 for <code class="highlighter-rouge">recordKey =&gt; _row_key</code>, <code class="highlighter-rouge">partitionPath =&gt; partition</code> and <code class="highlighter-rouge">precombineKey =&gt; timestamp</code></p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inputDF.write()
+<pre><code class="language-Java">inputDF.write()
        .format("org.apache.hudi")
        .options(clientOpts) // any of the Hudi client opts can be passed in as well
        .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
@@ -482,7 +482,7 @@ for <code class="highlighter-rouge">recordKey =&gt; _row_key</code>, <code class
        .option(HoodieWriteConfig.TABLE_NAME, tableName)
        .mode(SaveMode.Append)
        .save(basePath);
-</code></pre></div></div>
+</code></pre>
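As a complement to the snippet above, the sketch below also selects the write operation explicitly, for instance bulk_insert for an initial load and upsert for ongoing writes. The OPERATION_OPT_KEY and BULK_INSERT_OPERATION_OPT_VAL names are assumptions based on the DataSourceWriteOptions class and may differ between releases.

    // Hypothetical variant of the upsert above: choose the write operation per commit.
    // BULK_INSERT for the initial load, then switch to UPSERT for later commits.
    inputDF.write()
           .format("org.apache.hudi")
           .options(clientOpts) // any of the Hudi client opts can be passed in as well
           .option(DataSourceWriteOptions.OPERATION_OPT_KEY(), DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL())
           .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), "_row_key")
           .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), "partition")
           .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY(), "timestamp")
           .option(HoodieWriteConfig.TABLE_NAME, tableName)
           .mode(SaveMode.Append)
           .save(basePath);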
 
 <h2 id="syncing-to-hive">Syncing to Hive</h2>
 
@@ -490,7 +490,7 @@ for <code class="highlighter-rouge">recordKey =&gt; _row_key</code>, <code class
 In case it's preferable to run this from the command line or in an independent JVM, Hudi provides a <code class="highlighter-rouge">HiveSyncTool</code>, which can be invoked as below, 
 once you have built the hudi-hive module.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd hudi-hive
+<pre><code class="language-Java">cd hudi-hive
 ./run_sync_tool.sh
  [hudi-hive]$ ./run_sync_tool.sh --help
 Usage: &lt;main class&gt; [options]
@@ -511,7 +511,7 @@ Usage: &lt;main class&gt; [options]
        name of the target table in Hive
   * --user
        Hive username
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="deletes">Deletes</h2>
 
@@ -524,13 +524,13 @@ Usage: &lt;main class&gt; [options]
  via either DataSource or DeltaStreamer which always returns Optional.Empty as the combined value. Hudi ships with a built-in <code class="highlighter-rouge">org.apache.hudi.EmptyHoodieRecordPayload</code> class that does exactly this.</li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> deleteDF // dataframe containing just records to be deleted
+<pre><code class="language-Java"> deleteDF // dataframe containing just records to be deleted
    .write().format("org.apache.hudi")
    .option(...) // Add HUDI options like record-key, partition-path and others as needed for your setup
    // specify record_key, partition_key, precombine_fieldkey &amp; usual params
    .option(DataSourceWriteOptions.PAYLOAD_CLASS_OPT_KEY, "org.apache.hudi.EmptyHoodieRecordPayload")
  
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="storage-management">Storage Management</h2>