Posted to commits@hudi.apache.org by vi...@apache.org on 2019/09/06 22:15:35 UTC

[incubator-hudi] branch asf-site updated: [HUDI-237] Translate quickstart page

This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new a07fc11  [HUDI-237] Translate quickstart page
a07fc11 is described below

commit a07fc111912c184d8673ee5e6a2dd10f266bad05
Author: leesf <49...@qq.com>
AuthorDate: Thu Sep 5 15:37:14 2019 +0800

    [HUDI-237] Translate quickstart page
---
 docs/quickstart.cn.md | 78 ++++++++++++++++++++++++---------------------------
 1 file changed, 36 insertions(+), 42 deletions(-)

diff --git a/docs/quickstart.cn.md b/docs/quickstart.cn.md
index db59664..75bcec0 100644
--- a/docs/quickstart.cn.md
+++ b/docs/quickstart.cn.md
@@ -7,50 +7,47 @@ toc: false
 permalink: quickstart.html
 ---
 <br/>
-To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) 
-that showcases this on a docker based setup with all dependent systems running locally. We recommend you replicate the same setup 
-and run the demo yourself, by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, 
-refer to [migration guide](migration_guide.html).
+To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a Docker-based setup with all dependent systems running locally.
+We recommend you replicate the same setup and run the demo yourself by following the steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, refer to the [migration guide](migration_guide.html).
 
-If you have Hive, Hadoop, Spark installed already & prefer to do it on your own setup, read on.
+If you already have Hive, Hadoop and Spark installed and prefer to do it on your own setup, read on.
 
-## Download Hudi
+## Download Hudi
 
-Check out [code](https://github.com/apache/incubator-hudi) or download [latest release](https://github.com/apache/incubator-hudi/archive/hudi-0.4.5.zip) 
-and normally build the maven project, from command line
+Check out the [code](https://github.com/apache/incubator-hudi) or download the [latest release](https://github.com/apache/incubator-hudi/archive/hudi-0.4.5.zip)
+and build the Maven project normally from the command line:
 
 ```
 $ mvn clean install -DskipTests -DskipITs
 ```
 
-To work with older version of Hive (pre Hive-1.2.1), use
+To work with an older version of Hive (pre Hive-1.2.1), use
 ```
 $ mvn clean install -DskipTests -DskipITs -Dhive11
 ```
 
-For IDE, you can pull in the code into IntelliJ as a normal maven project. 
-You might want to add your spark jars folder to project dependencies under 'Module Setttings', to be able to run from IDE.
+> For an IDE, you can import the code into IntelliJ as a normal Maven project.
+> You might want to add your Spark jars folder to the project dependencies under 'Module Settings' to be able to run from the IDE.
 
 
-### Version Compatibility
+### Version Compatibility
 
-Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions. 
-Further, we have verified that Hudi works with the following combination of Hadoop/Hive/Spark.
+Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions. Further, we have verified that Hudi works with the following combinations of Hadoop/Hive/Spark.
 
-| Hadoop | Hive  | Spark | Instructions to Build Hudi |
+| Hadoop | Hive  | Spark | Instructions to Build Hudi |
 | ---- | ----- | ---- | ---- |
 | 2.6.0-cdh5.7.2 | 1.1.0-cdh5.7.2 | spark-2.[1-3].x | Use “mvn clean install -DskipTests -Dhadoop.version=2.6.0-cdh5.7.2 -Dhive.version=1.1.0-cdh5.7.2” |
 | Apache hadoop-2.8.4 | Apache hive-2.3.3 | spark-2.[1-3].x | Use "mvn clean install -DskipTests" |
 | Apache hadoop-2.7.3 | Apache hive-1.2.1 | spark-2.[1-3].x | Use "mvn clean install -DskipTests" |
 
-If your environment has other versions of hadoop/hive/spark, please try out Hudi 
-and let us know if there are any issues. 
+> If your environment has other versions of Hadoop/Hive/Spark, please try out Hudi and let us know if there are any issues.
 
-## Generate Sample Dataset
+## Generate Sample Dataset
 
-### Environment Variables
+### Environment Variables
 
-Please set the following environment variables according to your setup. We have given an example setup with CDH version
+Please set the following environment variables according to your setup. We have given an example setup with a CDH version:
 
 ```
 cd incubator-hudi 
@@ -65,10 +62,9 @@ export SPARK_CONF_DIR=$SPARK_HOME/conf
 export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH
 ```
 
-### Run HoodieJavaApp
+### Run HoodieJavaApp
 
-Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously inserted 100 records) onto your DFS/local filesystem. Use the wrapper script
-to run from command-line
+Run the __hudi-spark/src/test/java/HoodieJavaApp.java__ class to place two commits (commit 1 => 100 inserts, commit 2 => 100 updates to the previously inserted 100 records) onto your DFS/local filesystem. Use the wrapper script to run it from the command line:
 
 ```
 cd hudi-spark
@@ -88,14 +84,13 @@ Usage: <main class> [options]
        Default: COPY_ON_WRITE
 ```
 
-The class lets you choose table names, output paths and one of the storage types. In your own applications, be sure to include the `hudi-spark` module as dependency
-and follow a similar pattern to write/read datasets via the datasource. 
+The class lets you choose table names, output paths and one of the storage types. In your own applications, be sure to include the `hudi-spark` module as a dependency and follow a similar pattern to write/read datasets via the datasource, as in the sketch below.
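
As an illustration of that pattern, here is a minimal sketch (not part of this commit) of writing and then reading a dataset through the Hudi datasource from Spark. The datasource format name, option keys, table name, paths and columns are assumptions for illustration: newer builds register `org.apache.hudi`, while 0.4.x-era builds use `com.uber.hoodie`, so check `DataSourceWriteOptions` in the version you built.

```
import org.apache.spark.sql.{SaveMode, SparkSession}

object HudiDatasourceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hudi-quickstart-sketch")
      .master("local[2]")
      .getOrCreate()

    // Hypothetical input frame: a record key, a partition path and an ordering (pre-combine) field.
    val df = spark.createDataFrame(Seq(
      ("id-1", "2019/09/05", 1L, "first row"),
      ("id-2", "2019/09/05", 1L, "second row")
    )).toDF("_row_key", "partition", "ts", "payload")

    val basePath = "file:///tmp/hoodie_test"

    // Write the frame as a Hudi dataset; option keys are assumed from DataSourceWriteOptions.
    df.write.format("org.apache.hudi")            // 0.4.x-era builds: "com.uber.hoodie"
      .option("hoodie.table.name", "hoodie_test")
      .option("hoodie.datasource.write.recordkey.field", "_row_key")
      .option("hoodie.datasource.write.partitionpath.field", "partition")
      .option("hoodie.datasource.write.precombine.field", "ts")
      .mode(SaveMode.Overwrite)                   // use Append for the follow-up update commit
      .save(basePath)

    // Read it back; the glob depth matches the yyyy/MM/dd partition path written above.
    val readDf = spark.read.format("org.apache.hudi").load(basePath + "/*/*/*/*")
    readDf.show(false)

    spark.stop()
  }
}
```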
 
-## Query a Hudi dataset
+## Query a Hudi dataset
 
-Next, we will register the sample dataset into Hive metastore and try to query using [Hive](#hive), [Spark](#spark) & [Presto](#presto)
+Next, we will register the sample dataset into the Hive metastore and try to query it using [Hive](#hive), [Spark](#spark) & [Presto](#presto).
 
-### Start Hive Server locally
+### Start Hive Server locally
 
 ```
 hdfs namenode # start name node
@@ -110,10 +105,10 @@ bin/hiveserver2 \
 
 ```
 
-### Run Hive Sync Tool
-Hive Sync Tool will update/create the necessary metadata(schema and partitions) in hive metastore. This allows for schema evolution and incremental addition of new partitions written to.
-It uses an incremental approach by storing the last commit time synced in the TBLPROPERTIES and only syncing the commits from the last sync commit time stored.
-Both [Spark Datasource](writing_data.html#datasource-writer) & [DeltaStreamer](writing_data.html#deltastreamer) have capability to do this, after each write.
+### Run Hive Sync Tool
+The Hive Sync Tool will update/create the necessary metadata (schema and partitions) in the Hive metastore. This allows for schema evolution and incremental addition of newly written partitions.
+It uses an incremental approach by storing the last synced commit time in TBLPROPERTIES and only syncing the commits made after that stored commit time.
+Both the [Spark Datasource](writing_data.html#datasource-writer) & [DeltaStreamer](writing_data.html#deltastreamer) can do this after each write; see the sketch below.
 
 ```
 cd hudi-hive
@@ -127,13 +122,12 @@ cd hudi-hive
   --partitioned-by field1,field2
 
 ```
-For some reason, if you want to do this by hand. Please 
-follow [this](https://cwiki.apache.org/confluence/display/HUDI/Registering+sample+dataset+to+Hive+via+beeline).
 
+> If, for some reason, you want to do this by hand, please follow [this guide](https://cwiki.apache.org/confluence/display/HUDI/Registering+sample+dataset+to+Hive+via+beeline).
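
The same sync can also be triggered as part of the datasource write itself. Continuing the hypothetical sketch from the previous section (reusing `df` and `basePath`), here is a hedged example of the hive_sync options; the option keys are assumed from `DataSourceWriteOptions`, and the JDBC URL, database, table, user and password are placeholders for your Hive setup.

```
// Hedged sketch: enable Hive sync as part of a datasource write.
df.write.format("org.apache.hudi")                               // 0.4.x-era builds: "com.uber.hoodie"
  .option("hoodie.table.name", "hoodie_test")
  .option("hoodie.datasource.write.recordkey.field", "_row_key")
  .option("hoodie.datasource.write.partitionpath.field", "partition")
  .option("hoodie.datasource.write.precombine.field", "ts")
  // Hive sync: create/update the table in the metastore once the write commits.
  .option("hoodie.datasource.hive_sync.enable", "true")
  .option("hoodie.datasource.hive_sync.database", "default")
  .option("hoodie.datasource.hive_sync.table", "hoodie_test")
  .option("hoodie.datasource.hive_sync.partition_fields", "partition")
  .option("hoodie.datasource.hive_sync.jdbcurl", "jdbc:hive2://localhost:10000")
  .option("hoodie.datasource.hive_sync.username", "hive")
  .option("hoodie.datasource.hive_sync.password", "hive")
  .mode(SaveMode.Append)
  .save(basePath)
```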
 
 ### HiveQL {#hive}
 
-Let's first perform a query on the latest committed snapshot of the table
+Let's first perform a query on the latest committed snapshot of the table:
 
 ```
 hive> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
@@ -149,7 +143,7 @@ hive>
 
 ### SparkSQL {#spark}
 
-Spark is super easy, once you get Hive working as above. Just spin up a Spark Shell as below
+Spark is super easy, once you get Hive working as above. Just spin up a Spark shell as below:
 
 ```
 $ cd $SPARK_INSTALL
@@ -164,19 +158,19 @@ scala> sqlContext.sql("select count(*) from hoodie_test").show(10000)
 
 ### Presto {#presto}
 
-Checkout the 'master' branch on OSS Presto, build it, and place your installation somewhere.
+Check out the 'master' branch of OSS Presto, build it, and place your installation somewhere.
 
-* Copy the hudi/packaging/hudi-presto-bundle/target/hudi-presto-bundle-*.jar into $PRESTO_INSTALL/plugin/hive-hadoop2/
-* Startup your server and you should be able to query the same Hive table via Presto
+* Copy hudi/packaging/hudi-presto-bundle/target/hudi-presto-bundle-*.jar into $PRESTO_INSTALL/plugin/hive-hadoop2/
+* Start up your server and you should be able to query the same Hive table via Presto
 
 ```
 show columns from hive.default.hoodie_test;
 select count(*) from hive.default.hoodie_test
 ```
 
-### Incremental HiveQL
+### Incremental HiveQL
 
-Let's now perform a query, to obtain the __ONLY__ changed rows since a commit in the past.
+Let's now perform a query to obtain __ONLY__ the rows changed since a commit in the past:
 
 ```
 hive> set hoodie.hoodie_test.consume.mode=INCREMENTAL;
@@ -200,4 +194,4 @@ hive>
 hive>
 ```
 
-This is only supported for Read-optimized view for now."
+> Note: this is only supported for the read-optimized view for now.