content/cn/docs/quickstart/computing/hugegraph-computer.md - hugegraph-doc - Git at Google

 ---
 title: "HugeGraph-Computer Quick Start"
 linkTitle: "使用 Computer 进行 OLAP 分析"
 weight: 2
 ---

 ## 1 HugeGraph-Computer 概述

 [`HugeGraph-Computer`](https://github.com/apache/hugegraph-computer) 是分布式图处理系统 (OLAP). 它是 [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf)的一个实现。它可以运行在 Kubernetes(K8s)/Yarn 上。(它侧重可支持百亿~千亿的图数据量下进行图计算, 会使用磁盘进行排序和加速, 这是它和 Vermeer 相对最大的区别之一)

 ### 特性

 - 支持分布式 MPP 图计算，集成 HugeGraph 作为图输入输出存储。
 - 算法基于 BSP(Bulk Synchronous Parallel) 模型，通过多次并行迭代进行计算，每一次迭代都是一次超步。
 - 自动内存管理。该框架永远不会出现 OOM（内存不足），因为如果它没有足够的内存来容纳所有数据，它会将一些数据拆分到磁盘。
 - 边的部分或超级节点的消息可以在内存中，所以你永远不会丢失它。
 - 您可以从 HDFS 或 HugeGraph 或任何其他系统加载数据。
 - 您可以将结果输出到 HDFS 或 HugeGraph，或任何其他系统。
 - 易于开发新算法。您只需要像在单个服务器中一样专注于仅顶点处理，而不必担心消息传输和内存存储管理。

 ## 2 依赖

 ### 2.1 安装 Java 11 (JDK 11)

 **必须**在 ≥ `Java 11` 的环境上启动 `Computer`，然后自行配置。

 **在往下阅读之前务必执行 `java -version` 命令查看 jdk 版本**

 ## 3 开始

 ### 3.1 在本地运行 PageRank 算法

 > 要使用 HugeGraph-Computer 运行算法，必须装有 Java 11 或更高版本。
 >
 > 还需要首先部署 HugeGraph-Server 和 [Etcd](https://etcd.io/docs/v3.5/quickstart/).

 有两种方式可以获取 HugeGraph-Computer：

 - 下载已编译的压缩包
 - 克隆源码编译打包

 #### 3.1.1 下载已编译的压缩包

 下载最新版本的 HugeGraph-Computer release 包：

 ```bash
 wget https://downloads.apache.org/hugegraph/${version}/apache-hugegraph-computer-incubating-${version}.tar.gz
 tar zxvf apache-hugegraph-computer-incubating-${version}.tar.gz -C hugegraph-computer
 ```

 #### 3.1.2 克隆源码编译打包

 克隆最新版本的 HugeGraph-Computer 源码包：

 ```bash
 $ git clone https://github.com/apache/hugegraph-computer.git
 ```

 编译生成 tar 包：

 ```bash
 cd hugegraph-computer
 mvn clean package -DskipTests
 ```

 #### 3.1.3 启动 master 节点

 > 您可以使用 `-c` 参数指定配置文件，更多 computer 配置请看：[Computer Config Options](/docs/quickstart/computing/hugegraph-computer-config#computer-config-options)

 ```bash
 cd hugegraph-computer
 bin/start-computer.sh -d local -r master
 ```

 #### 3.1.4 启动 worker 节点

 ```bash
 bin/start-computer.sh -d local -r worker
 ```

 #### 3.1.5 查询算法结果

 2.5.1 为 server 启用 `OLAP` 索引查询

 如果没有启用 OLAP 索引，则需要启用，更多参考：[modify-graphs-read-mode](/docs/clients/restful-api/graphs/#634-modify-graphs-read-mode-this-operation-requires-administrator-privileges)

 ```http
 PUT http://localhost:8080/graphs/hugegraph/graph_read_mode

 "ALL"
 ```

 3.1.5.2 查询 `page_rank` 属性值：

 ```bash
 curl "http://localhost:8080/graphs/hugegraph/graph/vertices?page&limit=3" | gunzip
 ```

 ### 3.2 在 Kubernetes 中运行 PageRank 算法

 > 要使用 HugeGraph-Computer 运行算法，您需要先部署 HugeGraph-Server

 #### 3.2.1 安装 HugeGraph-Computer CRD

 ```bash
 # Kubernetes version >= v1.16
 kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml

 # Kubernetes version < v1.16
 kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1beta1.yaml
 ```

 #### 3.2.2 显示 CRD

 ```bash
 kubectl get crd

 NAME                                        CREATED AT
 hugegraphcomputerjobs.hugegraph.apache.org   2021-09-16T08:01:08Z
 ```

 #### 3.2.3 安装 hugegraph-computer-operator&etcd-server

 ```bash
 kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-operator.yaml
 ```

 #### 3.2.4 等待 hugegraph-computer-operator&etcd-server 部署完成

 ```bash
 kubectl get pod -n hugegraph-computer-operator-system

 NAME                                                              READY   STATUS    RESTARTS   AGE
 hugegraph-computer-operator-controller-manager-58c5545949-jqvzl   1/1     Running   0          15h
 hugegraph-computer-operator-etcd-28lm67jxk5                       1/1     Running   0          15h
 ```

 #### 3.2.5 提交作业

 > 更多 computer crd spec 请看：[Computer CRD](/docs/quickstart/computing/hugegraph-computer-config#hugegraph-computer-crd)
 >
 > 更多 Computer 配置请看：[Computer Config Options](/docs/quickstart/computing/hugegraph-computer-config#computer-config-options)

 ```yaml
 cat <<EOF | kubectl apply --filename -
 apiVersion: hugegraph.apache.org/v1
 kind: HugeGraphComputerJob
 metadata:
   namespace: hugegraph-computer-operator-system
   name: &jobName pagerank-sample
 spec:
   jobId: *jobName
   algorithmName: page_rank
   image: hugegraph/hugegraph-computer:latest # algorithm image url
   jarFile: /hugegraph/hugegraph-computer/algorithm/builtin-algorithm.jar # algorithm jar path
   pullPolicy: Always
   workerCpu: "4"
   workerMemory: "4Gi"
   workerInstances: 5
   computerConf:
     job.partitions_count: "20"
     algorithm.params_class: org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
     hugegraph.url: http://${hugegraph-server-host}:${hugegraph-server-port} # hugegraph server url
     hugegraph.name: hugegraph # hugegraph graph name
 EOF
 ```

 #### 3.2.6 显示作业

 ```bash
 kubectl get hcjob/pagerank-sample -n hugegraph-computer-operator-system

 NAME               JOBID              JOBSTATUS
 pagerank-sample    pagerank-sample    RUNNING
 ```

 #### 3.2.7 显示节点日志

 ```bash
 # Show the master log
 kubectl logs -l component=pagerank-sample-master -n hugegraph-computer-operator-system

 # Show the worker log
 kubectl logs -l component=pagerank-sample-worker -n hugegraph-computer-operator-system

 # Show diagnostic log of a job
 # 注意: 诊断日志仅在作业失败时存在，并且只会保存一小时。
 kubectl get event --field-selector reason=ComputerJobFailed --field-selector involvedObject.name=pagerank-sample -n hugegraph-computer-operator-system
 ```

 #### 3.2.8 显示作业的成功事件

 > NOTE: it will only be saved for one hour

 ```bash
 kubectl get event --field-selector reason=ComputerJobSucceed --field-selector involvedObject.name=pagerank-sample -n hugegraph-computer-operator-system
 ```

 #### 3.2.9 查询算法结果

 如果输出到 `Hugegraph-Server` 则与 Locally 模式一致，如果输出到 `HDFS` ，请检查 `hugegraph-computerresults{jobId}`目录下的结果文件。

 ## 4 内置算法文档

 ### 4.1 支持的算法列表：

 #### 中心性算法：

 * PageRank
 * BetweennessCentrality
 * ClosenessCentrality
 * DegreeCentrality

 #### 社区算法：

 * ClusteringCoefficient
 * Kcore
 * Lpa
 * TriangleCount
 * Wcc

 #### 路径算法：

 * RingsDetection
 * RingsDetectionWithFilter

 更多算法请看：[Built-In algorithms](https://github.com/apache/hugegraph-computer/tree/master/computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm)

 ### 4.2 算法描述

 TODO

 ## 5 算法开发指南

 TODO

 ## 6 注意事项

 - 如果 computer-k8s 模块下面的某些类不存在，你需要运行`mvn compile`来提前生成对应的类。
	---
	title: "HugeGraph-Computer Quick Start"
	linkTitle: "使用 Computer 进行 OLAP 分析"
	weight: 2
	---

	## 1 HugeGraph-Computer 概述

	[`HugeGraph-Computer`](https://github.com/apache/hugegraph-computer) 是分布式图处理系统 (OLAP). 它是 [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf)的一个实现。它可以运行在 Kubernetes(K8s)/Yarn 上。(它侧重可支持百亿~千亿的图数据量下进行图计算, 会使用磁盘进行排序和加速, 这是它和 Vermeer 相对最大的区别之一)

	### 特性

	- 支持分布式 MPP 图计算，集成 HugeGraph 作为图输入输出存储。
	- 算法基于 BSP(Bulk Synchronous Parallel) 模型，通过多次并行迭代进行计算，每一次迭代都是一次超步。
	- 自动内存管理。该框架永远不会出现 OOM（内存不足），因为如果它没有足够的内存来容纳所有数据，它会将一些数据拆分到磁盘。
	- 边的部分或超级节点的消息可以在内存中，所以你永远不会丢失它。
	- 您可以从 HDFS 或 HugeGraph 或任何其他系统加载数据。
	- 您可以将结果输出到 HDFS 或 HugeGraph，或任何其他系统。
	- 易于开发新算法。您只需要像在单个服务器中一样专注于仅顶点处理，而不必担心消息传输和内存存储管理。

	## 2 依赖

	### 2.1 安装 Java 11 (JDK 11)

	必须在 ≥ `Java 11` 的环境上启动 `Computer`，然后自行配置。

	在往下阅读之前务必执行 `java -version` 命令查看 jdk 版本

	## 3 开始

	### 3.1 在本地运行 PageRank 算法

	> 要使用 HugeGraph-Computer 运行算法，必须装有 Java 11 或更高版本。
	>
	> 还需要首先部署 HugeGraph-Server 和 [Etcd](https://etcd.io/docs/v3.5/quickstart/).

	有两种方式可以获取 HugeGraph-Computer：

	- 下载已编译的压缩包
	- 克隆源码编译打包

	#### 3.1.1 下载已编译的压缩包

	下载最新版本的 HugeGraph-Computer release 包：

	```bash
	wget https://downloads.apache.org/hugegraph/${version}/apache-hugegraph-computer-incubating-${version}.tar.gz
	tar zxvf apache-hugegraph-computer-incubating-${version}.tar.gz -C hugegraph-computer
	```

	#### 3.1.2 克隆源码编译打包

	克隆最新版本的 HugeGraph-Computer 源码包：

	```bash
	$ git clone https://github.com/apache/hugegraph-computer.git
	```

	编译生成 tar 包：

	```bash
	cd hugegraph-computer
	mvn clean package -DskipTests
	```

	#### 3.1.3 启动 master 节点

	> 您可以使用 `-c` 参数指定配置文件，更多 computer 配置请看：[Computer Config Options](/docs/quickstart/computing/hugegraph-computer-config#computer-config-options)

	```bash
	cd hugegraph-computer
	bin/start-computer.sh -d local -r master
	```

	#### 3.1.4 启动 worker 节点

	```bash
	bin/start-computer.sh -d local -r worker
	```

	#### 3.1.5 查询算法结果

	2.5.1 为 server 启用 `OLAP` 索引查询

	如果没有启用 OLAP 索引，则需要启用，更多参考：[modify-graphs-read-mode](/docs/clients/restful-api/graphs/#634-modify-graphs-read-mode-this-operation-requires-administrator-privileges)

	```http
	PUT http://localhost:8080/graphs/hugegraph/graph_read_mode

	"ALL"
	```

	3.1.5.2 查询 `page_rank` 属性值：

	```bash
	curl "http://localhost:8080/graphs/hugegraph/graph/vertices?page&limit=3" \| gunzip
	```

	### 3.2 在 Kubernetes 中运行 PageRank 算法

	> 要使用 HugeGraph-Computer 运行算法，您需要先部署 HugeGraph-Server

	#### 3.2.1 安装 HugeGraph-Computer CRD

	```bash
	# Kubernetes version >= v1.16
	kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml

	# Kubernetes version < v1.16
	kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1beta1.yaml
	```

	#### 3.2.2 显示 CRD

	```bash
	kubectl get crd

	NAME CREATED AT
	hugegraphcomputerjobs.hugegraph.apache.org 2021-09-16T08:01:08Z
	```

	#### 3.2.3 安装 hugegraph-computer-operator&etcd-server

	```bash
	kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-operator.yaml
	```

	#### 3.2.4 等待 hugegraph-computer-operator&etcd-server 部署完成

	```bash
	kubectl get pod -n hugegraph-computer-operator-system

	NAME READY STATUS RESTARTS AGE
	hugegraph-computer-operator-controller-manager-58c5545949-jqvzl 1/1 Running 0 15h
	hugegraph-computer-operator-etcd-28lm67jxk5 1/1 Running 0 15h
	```

	#### 3.2.5 提交作业

	> 更多 computer crd spec 请看：[Computer CRD](/docs/quickstart/computing/hugegraph-computer-config#hugegraph-computer-crd)
	>
	> 更多 Computer 配置请看：[Computer Config Options](/docs/quickstart/computing/hugegraph-computer-config#computer-config-options)

	```yaml
	cat <<EOF \| kubectl apply --filename -
	apiVersion: hugegraph.apache.org/v1
	kind: HugeGraphComputerJob
	metadata:
	namespace: hugegraph-computer-operator-system
	name: &jobName pagerank-sample
	spec:
	jobId: *jobName
	algorithmName: page_rank
	image: hugegraph/hugegraph-computer:latest # algorithm image url
	jarFile: /hugegraph/hugegraph-computer/algorithm/builtin-algorithm.jar # algorithm jar path
	pullPolicy: Always
	workerCpu: "4"
	workerMemory: "4Gi"
	workerInstances: 5
	computerConf:
	job.partitions_count: "20"
	algorithm.params_class: org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
	hugegraph.url: http://${hugegraph-server-host}:${hugegraph-server-port} # hugegraph server url
	hugegraph.name: hugegraph # hugegraph graph name
	EOF
	```

	#### 3.2.6 显示作业

	```bash
	kubectl get hcjob/pagerank-sample -n hugegraph-computer-operator-system

	NAME JOBID JOBSTATUS
	pagerank-sample pagerank-sample RUNNING
	```

	#### 3.2.7 显示节点日志

	```bash
	# Show the master log
	kubectl logs -l component=pagerank-sample-master -n hugegraph-computer-operator-system

	# Show the worker log
	kubectl logs -l component=pagerank-sample-worker -n hugegraph-computer-operator-system

	# Show diagnostic log of a job
	# 注意: 诊断日志仅在作业失败时存在，并且只会保存一小时。
	kubectl get event --field-selector reason=ComputerJobFailed --field-selector involvedObject.name=pagerank-sample -n hugegraph-computer-operator-system
	```

	#### 3.2.8 显示作业的成功事件

	> NOTE: it will only be saved for one hour

	```bash
	kubectl get event --field-selector reason=ComputerJobSucceed --field-selector involvedObject.name=pagerank-sample -n hugegraph-computer-operator-system
	```

	#### 3.2.9 查询算法结果

	如果输出到 `Hugegraph-Server` 则与 Locally 模式一致，如果输出到 `HDFS` ，请检查 `hugegraph-computerresults{jobId}`目录下的结果文件。

	## 4 内置算法文档

	### 4.1 支持的算法列表：

	#### 中心性算法：

	* PageRank
	* BetweennessCentrality
	* ClosenessCentrality
	* DegreeCentrality

	#### 社区算法：

	* ClusteringCoefficient
	* Kcore
	* Lpa
	* TriangleCount
	* Wcc

	#### 路径算法：

	* RingsDetection
	* RingsDetectionWithFilter

	更多算法请看：[Built-In algorithms](https://github.com/apache/hugegraph-computer/tree/master/computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm)

	### 4.2 算法描述

	TODO

	## 5 算法开发指南

	TODO

	## 6 注意事项

	- 如果 computer-k8s 模块下面的某些类不存在，你需要运行`mvn compile`来提前生成对应的类。