Posted to commits@doris.apache.org by ji...@apache.org on 2022/05/12 08:46:30 UTC

[incubator-doris-spark-connector] branch master updated: add quick start steps (#31)

This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris-spark-connector.git


The following commit(s) were added to refs/heads/master by this push:
     new a29495b  add quick start steps (#31)
a29495b is described below

commit a29495b1a709b22b82b3bc75edb198481af4f20d
Author: LOVEGISER <wa...@163.com>
AuthorDate: Thu May 12 16:46:26 2022 +0800

    add quick start steps (#31)
    
    add quick start steps
---
 README.md | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/README.md b/README.md
index 9becaf0..cc08808 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,100 @@ More information about compilation and usage, please visit [Spark Doris Connecto
 
 [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
+### QuickStart
+
+1. Download and compile the Spark Doris Connector from https://github.com/apache/incubator-doris-spark-connector. We suggest compiling it inside the official Doris build image.
+
+```bash
+$ docker pull apache/incubator-doris:build-env-ldb-toolchain-latest
+```
+
+2. The build produces a jar named like spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar.
+
+3. Download Spark from https://spark.apache.org/downloads.html. In China, the Tencent mirror is a good alternative: https://mirrors.cloud.tencent.com/apache/spark/spark-3.1.2/
+
+```bash
+# Download the Spark distribution
+wget https://mirrors.cloud.tencent.com/apache/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz
+# Extract the archive
+tar -xzvf spark-3.1.2-bin-hadoop3.2.tgz
+```
+
+4. Configure the Spark environment.
+
+```shell
+vim /etc/profile   # add the two export lines below, then save
+export SPARK_HOME=/your_path/spark-3.1.2-bin-hadoop3.2
+export PATH=$PATH:$SPARK_HOME/bin
+source /etc/profile
+```
+
+5. Copy spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar to the Spark jars directory.
+
+```shell
+cp /your_path/spark-doris-connector/target/spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar  $SPARK_HOME/jars
+```
+
+6. Create the Doris database and table.
+
+   ```sql
+   create database mongo_doris;
+   use mongo_doris;
+   CREATE TABLE data_sync_test_simple
+    (
+            _id VARCHAR(32) DEFAULT '',
+            id VARCHAR(32) DEFAULT '',
+            user_name VARCHAR(32) DEFAULT '',
+            member_list VARCHAR(32) DEFAULT ''
+    )
+    DUPLICATE KEY(_id)
+    DISTRIBUTED BY HASH(_id) BUCKETS 10
+    PROPERTIES("replication_num" = "1");
+   INSERT INTO data_sync_test_simple VALUES ('1','1','alex','123');
+   ```
+
+7. Enter this code in spark-shell.
+
+```scala
+import org.apache.doris.spark._
+val dorisSparkRDD = sc.dorisRDD(
+  tableIdentifier = Some("mongo_doris.data_sync_test_simple"),
+  cfg = Some(Map(
+    "doris.fenodes" -> "127.0.0.1:8030",
+    "doris.request.auth.user" -> "root",
+    "doris.request.auth.password" -> ""
+  ))
+)
+dorisSparkRDD.collect()
+```
+
+- mongo_doris: Doris database name
+- data_sync_test_simple: Doris table name (the table created in step 6)
+- doris.fenodes: Doris FE address, in the form IP:http_port
+- doris.request.auth.user: Doris user name
+- doris.request.auth.password: Doris password
+
+8. If Spark runs in cluster mode, upload the jar to HDFS and add the doris-spark-connector jar's HDFS URL to spark.yarn.jars.
+
+```bash
+spark.yarn.jars=hdfs:///spark-jars/doris-spark-connector-3.1.2-2.12-1.0.0.jar
+```
+
+Link: https://github.com/apache/incubator-doris/discussions/9486
+
+9. In PySpark, enter this code in the pyspark shell.
+
+```python
+dorisSparkDF = spark.read.format("doris") \
+    .option("doris.table.identifier", "mongo_doris.data_sync_test_simple") \
+    .option("doris.fenodes", "127.0.0.1:8030") \
+    .option("user", "root") \
+    .option("password", "") \
+    .load()
+# Show the first 5 rows
+dorisSparkDF.show(5)
+```
+
 ## Report issues or submit pull request
 
 If you find any bugs, feel free to file a [GitHub issue](https://github.com/apache/incubator-doris/issues) or fix it by submitting a [pull request](https://github.com/apache/incubator-doris/pulls).

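For step 9 of the quick start in the diff above, the read options can be assembled and sanity-checked in plain Python before they are handed to Spark. This is only a sketch: `doris_read_options` is a hypothetical helper, not part of the connector, and the option keys are the ones the README itself uses. The database and table names are those created in step 6.

```python
def doris_read_options(database, table, fe_host, fe_http_port, user, password=""):
    """Build the option map used with spark.read.format("doris").

    Hypothetical helper for illustration; it only formats and validates
    the values, it does not contact Doris or Spark.
    """
    # The identifier is "<database>.<table>", so neither part may contain a dot.
    for name in (database, table):
        if not name or "." in name:
            raise ValueError(f"invalid database or table name: {name!r}")
    port = int(fe_http_port)
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid FE http_port: {fe_http_port!r}")
    return {
        "doris.table.identifier": f"{database}.{table}",
        "doris.fenodes": f"{fe_host}:{port}",
        "user": user,
        "password": password,
    }


options = doris_read_options(
    "mongo_doris", "data_sync_test_simple", "127.0.0.1", 8030, "root"
)
print(options)
```

In the pyspark shell, with the connector jar on the classpath, the resulting dictionary could then be passed as `spark.read.format("doris").options(**options).load()`.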
