Posted to commits@doris.apache.org by ji...@apache.org on 2022/05/12 08:46:30 UTC
[incubator-doris-spark-connector] branch master updated: add quick start steps (#31)
This is an automated email from the ASF dual-hosted git repository.
jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris-spark-connector.git
The following commit(s) were added to refs/heads/master by this push:
new a29495b add quick start steps (#31)
a29495b is described below
commit a29495b1a709b22b82b3bc75edb198481af4f20d
Author: LOVEGISER <wa...@163.com>
AuthorDate: Thu May 12 16:46:26 2022 +0800
add quick start steps (#31)
add quick start steps
---
README.md | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 94 insertions(+)
diff --git a/README.md b/README.md
index 9becaf0..cc08808 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,100 @@ More information about compilation and usage, please visit [Spark Doris Connecto
[Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+### QuickStart
+
+1. Download and compile the Spark Doris Connector from https://github.com/apache/incubator-doris-spark-connector. We suggest compiling it with the official Doris build image.
+
+```bash
+$ docker pull apache/incubator-doris:build-env-ldb-toolchain-latest
+```
+
+2. The compiled jar is named like: spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar
+
+3. Download Spark from https://spark.apache.org/downloads.html. If you are in China, the Tencent mirror is a good alternative: https://mirrors.cloud.tencent.com/apache/spark/spark-3.1.2/
+
+```bash
+# download the Spark distribution
+wget https://mirrors.cloud.tencent.com/apache/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz
+# extract the archive
+tar -xzvf spark-3.1.2-bin-hadoop3.2.tgz
+```
+
+4. Configure the Spark environment.
+
+```shell
+vim /etc/profile
+export SPARK_HOME=/your_path/spark-3.1.2-bin-hadoop3.2
+export PATH=$PATH:$SPARK_HOME/bin
+source /etc/profile
+```
+
+5. Copy spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar to the Spark jars directory.
+
+```shell
+cp /your_path/spark-doris-connector/target/spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar $SPARK_HOME/jars
+```
+
+6. Create the Doris database and table.
+
+ ```sql
+ create database mongo_doris;
+ use mongo_doris;
+ CREATE TABLE data_sync_test_simple
+ (
+ _id VARCHAR(32) DEFAULT '',
+ id VARCHAR(32) DEFAULT '',
+ user_name VARCHAR(32) DEFAULT '',
+ member_list VARCHAR(32) DEFAULT ''
+ )
+ DUPLICATE KEY(_id)
+ DISTRIBUTED BY HASH(_id) BUCKETS 10
+ PROPERTIES("replication_num" = "1");
+ INSERT INTO data_sync_test_simple VALUES ('1','1','alex','123');
+ ```
+
+7. Enter this code in spark-shell.
+
+```scala
+import org.apache.doris.spark._
+val dorisSparkRDD = sc.dorisRDD(
+ tableIdentifier = Some("mongo_doris.data_sync_test_simple"),
+ cfg = Some(Map(
+ "doris.fenodes" -> "127.0.0.1:8030",
+ "doris.request.auth.user" -> "root",
+ "doris.request.auth.password" -> ""
+ ))
+)
+dorisSparkRDD.collect()
+```
+
+- mongo_doris: Doris database name.
+- data_sync_test_simple: Doris table name (the table created in step 6).
+- doris.fenodes: Doris FE address, in the form IP:http_port.
+- doris.request.auth.user: Doris user name.
+- doris.request.auth.password: Doris password.
+
+8. If Spark runs in cluster mode, upload the jar to HDFS and add the HDFS URL of the doris-spark-connector jar to spark.yarn.jars.
+
+```bash
+spark.yarn.jars=hdfs:///spark-jars/doris-spark-connector-3.1.2-2.12-1.0.0.jar
+```
+
+Link: https://github.com/apache/incubator-doris/discussions/9486
+
+9. For PySpark, enter this code in the pyspark shell.
+
+```python
+# The parentheses let the chained calls span several lines.
+dorisSparkDF = (
+    spark.read.format("doris")
+    .option("doris.table.identifier", "mongo_doris.data_sync_test_simple")
+    .option("doris.fenodes", "127.0.0.1:8030")
+    .option("user", "root")
+    .option("password", "")
+    .load()
+)
+# show the first 5 rows
+dorisSparkDF.show(5)
+```
+
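The reader options used in the PySpark step can be assembled and sanity-checked in plain Python before a Spark session is involved. The sketch below is only an illustration: `build_doris_options` is a hypothetical helper that is not part of the connector, and the option keys are the ones shown in the QuickStart steps. The database, table, and FE address are the sample values from this README.

```python
def build_doris_options(database, table, fe_host, fe_http_port, user="root", password=""):
    """Hypothetical helper (not part of the connector): build the reader options.

    The keys mirror the ones used in the pyspark example of this QuickStart.
    """
    # "db.table" is split on the dot, so neither part may be empty or contain one.
    for name in (database, table):
        if not name or "." in name:
            raise ValueError(f"invalid database/table name: {name!r}")
    # doris.fenodes expects IP:http_port, so check the port is a valid TCP port.
    port = int(fe_http_port)
    if not 0 < port < 65536:
        raise ValueError(f"invalid FE http port: {fe_http_port!r}")
    return {
        "doris.table.identifier": f"{database}.{table}",
        "doris.fenodes": f"{fe_host}:{port}",
        "user": user,
        "password": password,
    }


# Sample values taken from the steps above.
options = build_doris_options("mongo_doris", "data_sync_test_simple", "127.0.0.1", 8030)
print(options)
```

In the pyspark shell, the resulting dictionary could then be passed in one call, for example `spark.read.format("doris").options(**options).load()`.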
## Report issues or submit pull request
If you find any bugs, feel free to file a [GitHub issue](https://github.com/apache/incubator-doris/issues) or fix it by submitting a [pull request](https://github.com/apache/incubator-doris/pulls).