Posted to commits@hugegraph.apache.org by zh...@apache.org on 2022/11/27 06:43:42 UTC

[incubator-hugegraph-doc] branch hugegraph-computer-doc created (now 5bb76c06)

This is an automated email from the ASF dual-hosted git repository.

zhaocong pushed a change to branch hugegraph-computer-doc
in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-doc.git


      at 5bb76c06 Add HugeGraph-Computer Doc

This branch includes the following new commits:

     new 5bb76c06 Add HugeGraph-Computer Doc

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



[incubator-hugegraph-doc] 01/01: Add HugeGraph-Computer Doc

Posted by zh...@apache.org.

zhaocong pushed a commit to branch hugegraph-computer-doc
in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-doc.git

commit 5bb76c06f02c1414f6032520b87c0c517f81ee04
Author: coderzc <zh...@apache.org>
AuthorDate: Sun Nov 27 14:43:07 2022 +0800

    Add HugeGraph-Computer Doc
---
 content/en/docs/quickstart/hugegraph-computer.md | 209 +++++++++++++++++++++++
 content/en/docs/quickstart/hugegraph-hubble.md   |   1 -
 content/en/docs/quickstart/hugegraph-spark.md    |   2 +-
 content/en/docs/quickstart/hugegraph-studio.md   |   2 +-
 content/en/docs/quickstart/hugegraph-tools.md    |   6 +-
 5 files changed, 214 insertions(+), 6 deletions(-)

diff --git a/content/en/docs/quickstart/hugegraph-computer.md b/content/en/docs/quickstart/hugegraph-computer.md
new file mode 100644
index 00000000..f6807d5e
--- /dev/null
+++ b/content/en/docs/quickstart/hugegraph-computer.md
@@ -0,0 +1,209 @@
+---
+title: "HugeGraph-Computer Quick Start"
+linkTitle: "Analysis with HugeGraph-Computer"
+weight: 7
+---
+
+## 1 HugeGraph-Computer Overview
+
+HugeGraph-Computer is a distributed graph processing system for HugeGraph. It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf) and runs on Kubernetes or YARN.
+
+### Features
+
+- Supports distributed MPP graph computing and integrates with HugeGraph as graph input/output storage.
+- Based on the BSP (Bulk Synchronous Parallel) model: an algorithm computes through multiple parallel iterations, and each iteration is a superstep.
+- Automatic memory management. The framework avoids OOM (Out of Memory) by spilling some data to disk when memory cannot hold all of it.
+- Part of the edges or messages of a super node can be kept in memory, so they are never lost.
+- You can output the results to HDFS, HugeGraph, or any other system.
+- Easy to develop a new algorithm. You only need to focus on the processing of a single vertex, just as on a single server, without worrying about message transfer or memory/storage management.
+
+## 2 Get Started
+
+### 2.1 Run PageRank algorithm locally
+
+> To run an algorithm with hugegraph-computer, you need 64-bit JRE/JDK 11 or later.
+>
+> You also need to deploy HugeGraph-Server and [Etcd](https://etcd.io/docs/v3.5/quickstart/).
+
+#### 2.1.1 Download the compiled archive
+
+Download the latest version of the HugeGraph-Computer release package:
+
+```bash
+wget https://github.com/apache/hugegraph-computer/releases/download/v${version}/hugegraph-computer-${version}.tar.gz
+tar zxvf hugegraph-computer-${version}.tar.gz
+```
+
+#### 2.1.2 Clone source code to compile and install
+
+Clone the latest version of the HugeGraph-Computer source code:
+
+```bash
+git clone https://github.com/apache/hugegraph-computer.git
+```
+
+Compile and generate the tar package:
+
+```bash
+cd hugegraph-computer
+mvn clean package -DskipTests
+```
+
+#### 2.1.3 Start master node
+
+```bash
+cd hugegraph-computer-${version}
+bin/start-computer.sh -d local -r master
+```
+
+#### 2.1.4 Start worker node
+
+```bash
+bin/start-computer.sh -d local -r worker
+```
+
+#### 2.1.5 Query algorithm results
+
+2.1.5.1 Enable `OLAP` index query for the server
+
+If the OLAP index is not enabled, enable it first. For details, see [modify-graphs-read-mode](/docs/clients/restful-api/graphs/#634-modify-graphs-read-mode-this-operation-requires-administrator-privileges).
+
+```http
+PUT http://localhost:8080/graphs/hugegraph/graph_read_mode
+
+"ALL"
+```
+
+2.1.5.2 Query the `page_rank` property value:
+
+```bash
+curl "http://localhost:8080/graphs/hugegraph/graph/vertices?page&limit=3" | gunzip
+```
+
+### 2.2 Run PageRank algorithm in Kubernetes
+
+#### 2.2.1 Install hugegraph-computer CRD
+
+```bash
+# Kubernetes version >= v1.16
+kubectl apply -f https://raw.githubusercontent.com/hugegraph/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml
+
+# Kubernetes version < v1.16
+kubectl apply -f https://raw.githubusercontent.com/hugegraph/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1beta1.yaml
+```
+
+#### 2.2.2 Show CRD
+
+```bash
+kubectl get crd
+
+NAME                                        CREATED AT
+hugegraphcomputerjobs.hugegraph.apache.org   2021-09-16T08:01:08Z
+```
+
+#### 2.2.3 Install hugegraph-computer-operator & etcd-server
+
+```bash
+kubectl apply -f https://raw.githubusercontent.com/hugegraph/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-operator.yaml
+```
+
+#### 2.2.4 Wait for the hugegraph-computer-operator & etcd-server deployment to complete
+
+```bash
+kubectl get pod -n hugegraph-computer-operator-system
+
+NAME                                                              READY   STATUS    RESTARTS   AGE
+hugegraph-computer-operator-controller-manager-58c5545949-jqvzl   1/1     Running   0          15h
+hugegraph-computer-operator-etcd-28lm67jxk5                       1/1     Running   0          15h
+```
+
+#### 2.2.5 Submit job
+
+```yaml
+cat <<EOF | kubectl apply --filename -
+apiVersion: hugegraph.apache.org/v1
+kind: HugeGraphComputerJob
+metadata:
+  namespace: hugegraph-computer-system
+  name: &jobName pagerank-sample
+spec:
+  jobId: *jobName
+  algorithmName: page_rank
+  image: hugegraph/hugegraph-computer:latest # algorithm image url
+  jarFile: /hugegraph/hugegraph-computer/algorithm/builtin-algorithm.jar # algorithm jar path
+  pullPolicy: Always
+  workerCpu: "4"
+  workerMemory: "4Gi"
+  workerInstances: 5
+  computerConf:
+    job.partitions_count: "20"
+    algorithm.params_class: org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
+    hugegraph.url: http://${hugegraph-server-host}:${hugegraph-server-port} # hugegraph server url
+    hugegraph.name: hugegraph
+EOF
+```
+
+#### 2.2.6 Show job
+
+```bash
+kubectl get hcjob/pagerank-sample -n hugegraph-computer-system
+
+NAME               JOBID              JOBSTATUS
+pagerank-sample    pagerank-sample    RUNNING
+```
+
+#### 2.2.7 Show node logs
+
+```bash
+# Show the master log
+kubectl logs -l component=pagerank-sample-master -n hugegraph-computer-system
+
+# Show the worker log
+kubectl logs -l component=pagerank-sample-worker -n hugegraph-computer-system
+
+# Show diagnostic log of a job
+# NOTE: the diagnostic log exists only when the job fails, and it is kept for only one hour.
+kubectl get event --field-selector reason=ComputerJobFailed --field-selector involvedObject.name=pagerank-sample -n hugegraph-computer-system
+```
+
+#### 2.2.8 Show success event of a job
+
+> NOTE: the event is kept for only one hour.
+
+```bash
+kubectl get event --field-selector reason=ComputerJobSucceed --field-selector involvedObject.name=pagerank-sample -n hugegraph-computer-system
+```
+
+#### 2.2.9 Query algorithm results
+
+If the results are output to `HugeGraph-Server`, query them the same way as when running locally; if they are output to `HDFS`, check the result files under the `/hugegraph-computer/results/{jobId}` directory.
+
+## 3 Built-in algorithms
+
+### 3.1 Currently supported algorithms
+
+#### Centrality Algorithm:
+
+* PageRank
+* BetweennessCentrality
+* ClosenessCentrality
+* DegreeCentrality
+
+#### Community Algorithm:
+
+* ClusteringCoefficient
+* Kcore
+* Lpa
+* TriangleCount
+* Wcc
+
+#### Path Algorithm:
+
+* RingsDetection
+* RingsDetectionWithFilter
+
+For more, see: https://github.com/apache/hugegraph-computer/tree/master/computer-algorithm/src/main/java/com/baidu/hugegraph/computer/algorithm
+
+## 4 Algorithm development guide
+
+TODO
\ No newline at end of file
diff --git a/content/en/docs/quickstart/hugegraph-hubble.md b/content/en/docs/quickstart/hugegraph-hubble.md
index d5c46231..fcf96d29 100644
--- a/content/en/docs/quickstart/hugegraph-hubble.md
+++ b/content/en/docs/quickstart/hugegraph-hubble.md
@@ -432,4 +432,3 @@ There is no visual OLAP algorithm execution on Hubble. You can call the RESTful
 <center>
   <img src="/docs/images/images-hubble/355任务详情.png" alt="image">
 </center>
-
diff --git a/content/en/docs/quickstart/hugegraph-spark.md b/content/en/docs/quickstart/hugegraph-spark.md
index 4ca1d1af..6f3f8a3f 100644
--- a/content/en/docs/quickstart/hugegraph-spark.md
+++ b/content/en/docs/quickstart/hugegraph-spark.md
@@ -2,7 +2,7 @@
 title: "HugeGraph-Spark Quick Start"
 linkTitle: "Analysis with HugeGraph-Spark"
 draft: true
-weight: 7
+weight: 8
 ---
 
 ### 1 HugeGraph-Spark Overview (Deprecated)
diff --git a/content/en/docs/quickstart/hugegraph-studio.md b/content/en/docs/quickstart/hugegraph-studio.md
index 7d1359e5..c399bad1 100644
--- a/content/en/docs/quickstart/hugegraph-studio.md
+++ b/content/en/docs/quickstart/hugegraph-studio.md
@@ -2,7 +2,7 @@
 title: "HugeGraph-Studio Quick Start"
 linkTitle: "Display with HugeGraph-Studio"
 draft: true
-weight: 5
+weight: 9
 ---
 
 ### 1 HugeGraph-Studio Overview (Deprecated)
diff --git a/content/en/docs/quickstart/hugegraph-tools.md b/content/en/docs/quickstart/hugegraph-tools.md
index 5f860d31..8c7f1f2d 100644
--- a/content/en/docs/quickstart/hugegraph-tools.md
+++ b/content/en/docs/quickstart/hugegraph-tools.md
@@ -183,9 +183,9 @@ Usage: hugegraph [options] [command] [command options]
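     - --backup-num, optional, the number of most recent backups to keep, default 3
     - --interval, optional, the backup schedule, in the same format as Linux crontab
 - dump, exports all vertices and edges of the whole graph, stored by default in the `vertex vertex-edge1 vertex-edge2...` JSON format.
-Users can also customize the storage format: in the `hugegraph-tools/src/main/java/com/baidu/hugegraph/formatter`
-directory, implement a class that extends `Formatter`, for example `CustomFormatter`, and specify it as the formatter when running, for example
-`bin/hugegraph dump -f CustomFormatter`
+  Users can also customize the storage format: in the `hugegraph-tools/src/main/java/com/baidu/hugegraph/formatter`
+  directory, implement a class that extends `Formatter`, for example `CustomFormatter`, and specify it as the formatter when running, for example
+  `bin/hugegraph dump -f CustomFormatter`
     - --formatter or -f, the formatter to use, default JsonFormatter
     - --directory or -d, the directory in which to store the schema or data, default is the current directory
     - --log or -l, the log directory, default is the current directory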