Posted to commits@linkis.apache.org by pe...@apache.org on 2022/06/07 03:58:36 UTC

[incubator-linkis] branch release-0.11.0 updated: Linkis supports Hive on Tez deployment documents (#541)

This is an automated email from the ASF dual-hosted git repository.

peacewong pushed a commit to branch release-0.11.0
in repository https://gitbox.apache.org/repos/asf/incubator-linkis.git


The following commit(s) were added to refs/heads/release-0.11.0 by this push:
     new 19572d888 Linkis supports Hive on Tez deployment documents (#541)
19572d888 is described below

commit 19572d888d88d403ab78b6a4fe72c4ffe066b0d5
Author: Dlimeng <77...@qq.com>
AuthorDate: Tue Jun 7 11:58:29 2022 +0800

    Linkis supports Hive on Tez deployment documents (#541)
    
    * Deployment document for Hive on Tez in a Linkis environment
    
    * The PDF file was corrupted; replaced it with a Markdown document: deployment document for Hive on Tez in a Linkis environment
    
    * Linkis环境下Hive on Tez.md to directory docs/zh_CN/ch5
    
    * Linkis环境下Hive on Tez.md to directory docs/zh_CN/ch5
---
 ...\216\257\345\242\203\344\270\213Hive on Tez.md" |  82 +++++++++++++++++++++
 docs/zh_CN/images/tez/hive_tez.png                 | Bin 0 -> 67981 bytes
 docs/zh_CN/images/tez/hive_tez2.png                | Bin 0 -> 104018 bytes
 docs/zh_CN/images/tez/hive_tez3.png                | Bin 0 -> 16611 bytes
 docs/zh_CN/images/tez/hive_tez4.png                | Bin 0 -> 60338 bytes
 5 files changed, 82 insertions(+)

diff --git "a/docs/zh_CN/ch5/Linkis\347\216\257\345\242\203\344\270\213Hive on Tez.md" "b/docs/zh_CN/ch5/Linkis\347\216\257\345\242\203\344\270\213Hive on Tez.md"
new file mode 100644
index 000000000..06684618f
--- /dev/null
+++ "b/docs/zh_CN/ch5/Linkis\347\216\257\345\242\203\344\270\213Hive on Tez.md"	
@@ -0,0 +1,82 @@
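+## 1. Introduction
+
+### 1.1 Background
+
+Linkis is middleware for computation governance. Hive is often run with the Tez engine, so this document describes how to get Hive on Tez working through Linkis.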
+
+### 1.2 Architecture
+
+![Image](../images/tez/hive_tez.png)
+
+Task execution flow of the Linkis Hive engine
+
+HiveEngine uses org.apache.hadoop.hive.ql.Driver to run HQL, which goes through Hive's execution flow:
+
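+* The compiler parses the query and generates a Calcite logical plan from the AST.
+* The logical plan is optimized.
+* The optimized logical plan is converted into a physical plan.
+* The execution plan is vectorized.
+* Concrete tasks are generated (MapReduce, Spark, or Tez), and the Driver submits them to YARN.
+* When execution finishes, the results are returned to the user.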
+
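+From the execution-logic perspective, running Hive on Tez through Linkis requires no source code changes. Tez does not work out of the box only because the Hive engine is missing the Tez dependencies, so the tasks cannot run.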
+
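+## 2. Deployment
+
+This walkthrough uses a CDH-5.16.2 environment as an example and assumes that Tez is already configured on the big data cluster.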
+
+1. Add the Tez dependencies to the lib folder of linkis-ujes-hive-enginemanager: copy every jar with the tez-* prefix from /opt/cloudera/parcels/CDH/lib/tez into linkis-ujes-hive-enginemanager/lib.
+
+![Image](../images/tez/hive_tez2.png)
+
+2. Configure the Tez dependencies for Hive in CDH.
+
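+![Image](../images/tez/hive_tez3.png)
+
+3. Because CDH stores its configuration variables in the metadata database, copy the /etc/hive/conf configuration into a standalone directory.
+
+In the conf folder of linkis-ujes-hive-enginemanager, set hive.config.dir in both linkis-engine.properties and linkis.properties to that standalone Hive configuration directory.
+
+In the standalone Hive configuration, append the following properties to hive-site.xml (this is the key step):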
+
+```plain
+ <property>
+    <name>tez.lib.uris</name>
+    <value>hdfs:///apps/tez/tez-0.8.5.tar.gz</value>
+  </property>
+   <property>
+    <name>hive.tez.container.size</name>
+    <value>10240</value>
+  </property>
+```
+Contents of linkis-ujes-hive-enginemanager/conf/linkis.properties:
+
+```plain
+wds.linkis.server.restful.scan.packages=com.webank.wedatasphere.linkis.entrance.restful
+wds.linkis.engine.application.name=hiveEngine
+wds.linkis.server.component.exclude.packages=com.webank.wedatasphere.linkis.engine.,com.webank.wedatasphere.linkis.udf.
+wds.linkis.server.version=v1
+#sudo script
+wds.linkis.enginemanager.sudo.script=/home/hdfs/linkis/linkis-ujes-hive-enginemanager/bin/rootScript.sh
+#hadoop config
+hadoop.config.dir=/etc/hadoop/conf
+#hive config
+hive.config.dir=/home/hdfs/etc/hive
+```
+Contents of linkis-ujes-hive-enginemanager/conf/linkis-engine.properties:
+
+```plain
+wds.linkis.server.restful.scan.packages=com.webank.wedatasphere.linkis.engine.restful
+wds.linkis.engine.application.name=hiveEngine
+wds.linkis.server.component.exclude.packages=com.webank.wedatasphere.linkis.enginemanager.,com.webank.wedatasphere.linkis.udf.
+wds.linkis.server.component.exclude.classes=com.webank.wedatasphere.linkis.resourcemanager.service.annotation.RMAnnotationParser
+wds.linkis.server.version=v1
+#hadoop config
+hadoop.config.dir=/etc/hadoop/conf
+#hive config
+hive.config.dir=/home/hdfs/etc/hive
+```
+Contents of the /home/hdfs/etc/hive folder:
+
+![Image](../images/tez/hive_tez4.png)
+
diff --git a/docs/zh_CN/images/tez/hive_tez.png b/docs/zh_CN/images/tez/hive_tez.png
new file mode 100644
index 000000000..db8bd05fb
Binary files /dev/null and b/docs/zh_CN/images/tez/hive_tez.png differ
diff --git a/docs/zh_CN/images/tez/hive_tez2.png b/docs/zh_CN/images/tez/hive_tez2.png
new file mode 100644
index 000000000..f207bfe89
Binary files /dev/null and b/docs/zh_CN/images/tez/hive_tez2.png differ
diff --git a/docs/zh_CN/images/tez/hive_tez3.png b/docs/zh_CN/images/tez/hive_tez3.png
new file mode 100644
index 000000000..976c81234
Binary files /dev/null and b/docs/zh_CN/images/tez/hive_tez3.png differ
diff --git a/docs/zh_CN/images/tez/hive_tez4.png b/docs/zh_CN/images/tez/hive_tez4.png
new file mode 100644
index 000000000..1198ba6dd
Binary files /dev/null and b/docs/zh_CN/images/tez/hive_tez4.png differ

