Posted to commits@inlong.apache.org by do...@apache.org on 2021/09/01 08:54:05 UTC

[incubator-inlong-website] branch master updated: [INLONG-1508] Add English Hive example.md (#135)

This is an automated email from the ASF dual-hosted git repository.

dockerzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-inlong-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 7b0495c  [INLONG-1508] Add English Hive example.md (#135)
7b0495c is described below

commit 7b0495cb2c2896b25c73dcf71270d86ddec20669
Author: Bowen Li <27...@qq.com>
AuthorDate: Wed Sep 1 16:54:00 2021 +0800

    [INLONG-1508] Add English Hive example.md (#135)
---
 docs/en-us/example.md | 103 ++++++++++++++++++++++++++++++++++++++++++++++++++
 site_config/docs.js   |   4 ++
 2 files changed, 107 insertions(+)

diff --git a/docs/en-us/example.md b/docs/en-us/example.md
new file mode 100644
index 0000000..1ca422c
--- /dev/null
+++ b/docs/en-us/example.md
@@ -0,0 +1,103 @@
+---
+title: Hive Example - Apache InLong
+---
+
+Here we use a simple example to help you experience InLong with Docker.
+
+## Install Hive
+Hive is a required component. If you don't have Hive on your machine, we recommend using Docker to install it. Details can be found [here](https://github.com/big-data-europe/docker-hive).
+
+> Note that if you use Docker, you need to add a port mapping of `8020:8020`, since this is the port of the HDFS DefaultFS, which we will need later.
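
With the docker-hive Compose file, the mapping could be added under the namenode service (the service name is taken from the docker-hive repository; adjust it if your setup differs):

```
# In docker-hive's docker-compose.yml, under the namenode service:
#   ports:
#     - "8020:8020"
```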
+
+## Install InLong
+Before we begin, we need to install InLong. Here we provide two ways:
+1. Install InLong with Docker according to the [instructions here](https://github.com/apache/incubator-inlong/tree/master/docker/docker-compose) (recommended).
+2. Install the InLong binary according to the [instructions here](./quick_start.md).
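
For the first option, the steps are roughly as follows (the repository layout is an assumption based on the linked instructions; check them for the current location of the compose file):

```
git clone https://github.com/apache/incubator-inlong.git
cd incubator-inlong/docker/docker-compose
docker-compose up -d
```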
+
+## Create a data access
+After deployment, we first enter the "Data Access" interface and click "Create an Access" in the upper right corner to create a new data access, filling in the business information as shown in the figure below.
+
+<img src="../../img/create-business.png" align="center" alt="Create Business"/>
+
+Then we click the next button, and fill in the stream information as shown in the figure below.
+
+<img src="../../img/create-stream.png" align="center" alt="Create Stream"/>
+
+Note that the message source is "File", and we don't need to create a message source manually.
+
+Then we fill in the following information in the "data information" column below.
+
+<img src="../../img/data-information.png" align="center" alt="Data Information"/>
+
+Then we select Hive in the data flow and click "Add" to add a Hive configuration.
+
+<img src="../../img/hive-config.png" align="center" alt="Hive Config"/>
+
+Note that the target table does not need to be created in advance; InLong Manager will automatically create the table for us after the access is approved. Also, please use the connection test to ensure that InLong Manager can connect to your Hive.
+
+Then we click the "Submit for Approval" button; the access will be created and enter the approval state.
+
+## Approve the data access
+Then we enter the "Approval Management" interface and click "My Approval" to approve the data access that we just applied for.
+
+At this point, the data access has been created successfully. We can see that the corresponding table has been created in Hive, and we can see that the corresponding topic has been created successfully in the management GUI of TubeMQ.
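
For example, if you installed Hive with docker-hive, one way to list the tables is through Beeline inside the Hive server container (the container name and connection URL follow docker-hive's defaults and are assumptions; adjust them to your environment):

```
docker exec -it hive-server beeline -u jdbc:hive2://localhost:10000 -e 'SHOW TABLES;'
```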
+
+## Configure the agent
+Here we use `docker exec` to enter the container of the agent and configure it.
+```
+$ docker exec -it agent sh
+```
+
+Then we create a directory named `.inlong`, and in it create a file named `<bid>.local`, where `bid` is the business id (here `b_test`), and fill in the Dataproxy configuration as follows.
+```
+$ mkdir .inlong
+$ cd .inlong
+$ touch b_test.local
+$ echo '{"cluster_id":1,"isInterVisit":1,"size":1,"address": [{"port":46801,"host":"dataproxy"}], "switch":0}' >> b_test.local
+```
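
Optionally, we can sanity-check that this snippet is valid JSON by piping the same literal through Python's `json.tool` module (this assumes a `python3` interpreter is available where you run it):

```
echo '{"cluster_id":1,"isInterVisit":1,"size":1,"address": [{"port":46801,"host":"dataproxy"}], "switch":0}' | python3 -m json.tool
```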
+
+Then we exit the container and use `curl` to submit a job configuration to the agent.
+```
+curl --location --request POST 'http://localhost:8008/config/job' \
+--header 'Content-Type: application/json' \
+--data '{
+  "job": {
+    "dir": {
+      "path": "",
+      "pattern": "/data/collect-data/test.log"
+    },
+    "trigger": "org.apache.inlong.agent.plugin.trigger.DirectoryTrigger",
+    "id": 1,
+    "thread": {
+      "running": {
+        "core": "4"
+      }
+    },
+    "name": "fileAgentTest",
+    "source": "org.apache.inlong.agent.plugin.sources.TextFileSource",
+    "sink": "org.apache.inlong.agent.plugin.sinks.ProxySink",
+    "channel": "org.apache.inlong.agent.plugin.channel.MemoryChannel"
+  },
+  "proxy": {
+    "bid": "b_test",
+    "tid": "test_stream"
+  },
+  "op": "add"
+}'
+```
+
+At this point, the agent is configured successfully.
+Then we need to create a new file `./collect-data/test.log` and add content to it to trigger the agent to send data to the dataproxy.
+
+```
+$ touch collect-data/test.log
+$ echo 'test,24' >> collect-data/test.log
+```
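
To generate more than a single sample record, we can append lines in a loop (the `name,age` layout simply mirrors the `test,24` sample above):

```
mkdir -p collect-data
for i in $(seq 1 10); do
  echo "user_${i},${i}" >> collect-data/test.log
done
```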
+
+Then we can check the logs of the agent and the dataproxy to confirm that the data has been sent successfully.
+
+```
+$ docker logs agent
+$ docker logs dataproxy
+```
+
diff --git a/site_config/docs.js b/site_config/docs.js
index 50d033c..b092c66 100644
--- a/site_config/docs.js
+++ b/site_config/docs.js
@@ -8,6 +8,10 @@ export default {
             title: 'Quick Start',
             link: '/en-us/docs/quick_start.html',
           },
+          {
+            title: 'Hive Example',
+            link: '/en-us/docs/example.html'
+          }
         ],
       },
       {