Vermeer is a high-performance in-memory graph computing platform with a single-binary deployment model. It provides 20+ graph algorithms, custom algorithm extensions, and seamless integration with HugeGraph.
```mermaid
graph TB
    subgraph Client["Client Layer"]
        API[REST API Client]
        UI[Web UI Dashboard]
    end

    subgraph Master["Master Node"]
        HTTP[HTTP Server :6688]
        GRPC_M[gRPC Server :6689]
        GM[Graph Manager]
        TM[Task Manager]
        WM[Worker Manager]
        SCH[Scheduler]
    end

    subgraph Workers["Worker Nodes"]
        W1[Worker 1 :6789]
        W2[Worker 2 :6789]
        W3[Worker N :6789]
    end

    subgraph DataSources["Data Sources"]
        HG[(HugeGraph)]
        CSV[Local CSV]
        HDFS[HDFS]
    end

    API --> HTTP
    UI --> HTTP
    HTTP --> GM
    HTTP --> TM
    GRPC_M <--> W1
    GRPC_M <--> W2
    GRPC_M <--> W3
    W1 <--> HG
    W2 <--> HG
    W3 <--> HG
    W1 <--> CSV
    W1 <--> HDFS

    style Master fill:#e1f5fe
    style Workers fill:#fff3e0
    style DataSources fill:#f1f8e9
```
```
vermeer/
├── main.go                 # Single binary entry point
├── Makefile                # Build automation
├── algorithms/             # 20+ algorithm implementations
│   ├── pagerank.go
│   ├── louvain.go
│   ├── sssp.go
│   └── ...
├── apps/
│   ├── master/             # Master service
│   │   ├── services/       # HTTP handlers
│   │   ├── workers/        # Worker management
│   │   ├── schedules/      # Task scheduling strategies
│   │   └── tasks/          # Task scheduling
│   ├── compute/            # Worker-side compute logic
│   ├── graphio/            # Graph I/O (HugeGraph, CSV, HDFS)
│   │   └── hugegraph.go    # HugeGraph integration
│   ├── protos/             # gRPC definitions
│   └── common/             # Utilities, logging, metrics
├── config/                 # Configuration templates
│   ├── master.ini
│   └── worker.ini
├── tools/                  # Binary dependencies (supervisord, protoc)
└── ui/                     # Web dashboard
```
Pull the image:
```bash
docker pull hugegraph/vermeer:latest
```
Create a dedicated config directory (e.g., ~/vermeer-config/) with master.ini and worker.ini files (see Configuration section).
Run with Docker:
```bash
# Master node
docker run -v ~/vermeer-config:/go/bin/config hugegraph/vermeer --env=master

# Worker node
docker run -v ~/vermeer-config:/go/bin/config hugegraph/vermeer --env=worker
```
Security Note: Only mount directories containing Vermeer configuration files. Avoid mounting your entire home directory to minimize security risks.
Update master_peer in ~/worker.ini to 172.20.0.10:6689, and edit docker-compose.yml to mount your config directory:
```yaml
volumes:
  - ~/:/go/bin/config  # Change here to your actual config path
```

```bash
docker-compose up -d
```
```bash
# Download binary (replace version and platform)
wget https://github.com/apache/hugegraph-computer/releases/download/vX.X.X/vermeer-linux-amd64.tar.gz
tar -xzf vermeer-linux-amd64.tar.gz
cd vermeer

# Run master and worker
./vermeer --env=master &
./vermeer --env=worker &
```
The --env parameter specifies the configuration file name in the config/ folder (e.g., master.ini, worker.ini).
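As a rough illustration of that naming rule, the Go sketch below maps an `--env` value to a file under `config/`. The `configPath` helper is hypothetical and is not taken from Vermeer's source; it only mirrors the convention described here.

```go
package main

import (
	"flag"
	"fmt"
	"path/filepath"
)

// configPath is a hypothetical helper: it maps an --env value such as
// "master" to config/master.ini, following the naming rule described above.
func configPath(configDir, env string) string {
	return filepath.Join(configDir, env+".ini")
}

func main() {
	env := flag.String("env", "master", "name of the .ini file in config/ (without extension)")
	flag.Parse()
	fmt.Println(configPath("config", *env))
}
```

Under this convention, adding a file such as `config/worker2.ini` would be selected with `--env=worker2`.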
Configure parameters in vermeer.sh, then:
```bash
./vermeer.sh start master
./vermeer.sh start worker
```

curl and unzip utilities (for downloading dependencies)

Recommended: Use Makefile:

```bash
# First-time setup (downloads supervisord and protoc binaries)
make init

# Build for current platform
make

# Or build for specific platform
make build-linux-amd64
make build-linux-arm64
```
Alternative: Use build script:
```bash
# Auto-detect platform
./build.sh

# Or specify architecture
./build.sh amd64
./build.sh arm64
```
For development with hot-reload of web UI:
```bash
go build -tags=dev
```

```bash
make clean      # Remove binaries and generated assets
make clean-all  # Also remove downloaded tools (supervisord, protoc)
```
Master configuration (master.ini):

```ini
[default]
# Master HTTP listen address
http_peer = 0.0.0.0:6688

# Master gRPC listen address
grpc_peer = 0.0.0.0:6689

# Master peer address (self-reference for workers)
master_peer = 127.0.0.1:6689

# Run mode
run_mode = master

# Task scheduling strategy
task_strategy = 1

# Number of parallel tasks
task_parallel_num = 1
```
Note: HugeGraph connection details (pd_peers, server, graph) are provided in the graph load API request, not in the configuration file. See HugeGraph Integration section for details.
Worker configuration (worker.ini):

```ini
[default]
# Worker HTTP listen address
http_peer = 0.0.0.0:6788

# Worker gRPC listen address
grpc_peer = 0.0.0.0:6789

# Master gRPC address to connect
master_peer = 127.0.0.1:6689

# Run mode
run_mode = worker

# Worker group identifier
worker_group = default
```
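To make the file layout concrete, here is a minimal Go sketch that reads the `[default]` section of such a file into a map. It is an illustration only: `parseINI` is a hypothetical helper written for this README, not Vermeer's actual configuration loader, and it handles just sections, `#` comments and `key = value` lines.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseINI is a minimal sketch of an INI reader, not Vermeer's loader.
// It returns section -> key -> value and skips blank lines and # comments.
func parseINI(text string) map[string]map[string]string {
	sections := map[string]map[string]string{}
	current := ""
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "#"):
			continue
		case strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]"):
			current = line[1 : len(line)-1]
		default:
			key, value, found := strings.Cut(line, "=")
			if !found {
				continue // skip lines that are not key = value pairs
			}
			if sections[current] == nil {
				sections[current] = map[string]string{}
			}
			sections[current][strings.TrimSpace(key)] = strings.TrimSpace(value)
		}
	}
	return sections
}

// workerINI is a shortened copy of the worker settings shown above.
const workerINI = `[default]
# Master gRPC address to connect
master_peer = 127.0.0.1:6689
run_mode = worker
worker_group = default
`

func main() {
	cfg := parseINI(workerINI)
	fmt.Println("master_peer:", cfg["default"]["master_peer"])
	fmt.Println("run_mode:", cfg["default"]["run_mode"])
}
```

A quick check like this can help confirm that a worker's `master_peer` points at the master's `grpc_peer` port before starting the process.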
| Algorithm | Category | Description |
|---|---|---|
| PageRank | Centrality | Measures vertex importance via link structure |
| Personalized PageRank | Centrality | PageRank from specific source vertices |
| Betweenness Centrality | Centrality | Measures vertex importance via shortest paths |
| Closeness Centrality | Centrality | Measures average distance to all other vertices |
| Degree Centrality | Centrality | Simple in/out degree calculation |
| Louvain | Community Detection | Modularity-based community detection |
| Louvain (Weighted) | Community Detection | Weighted variant for edge-weighted graphs |
| LPA | Community Detection | Label Propagation Algorithm |
| SLPA | Community Detection | Speaker-Listener Label Propagation |
| WCC | Community Detection | Weakly Connected Components |
| SCC | Community Detection | Strongly Connected Components |
| SSSP | Path Finding | Single Source Shortest Path (Dijkstra) |
| Triangle Count | Graph Structure | Counts triangles in the graph |
| K-Core | Graph Structure | Finds k-core subgraphs |
| K-Out | Graph Structure | K-degree filtering |
| Clustering Coefficient | Graph Structure | Measures local clustering |
| Cycle Detection | Graph Structure | Detects cycles in directed graphs |
| Jaccard Similarity | Similarity | Computes neighbor-based similarity |
| Depth (BFS) | Traversal | Breadth-First Search depth assignment |
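For readers unfamiliar with how an entry such as PageRank works, the following single-machine Go sketch shows the basic power iteration. It is not Vermeer's distributed implementation; the function name `pageRank` and its signature are assumptions made for this example, while the damping factor, tolerance and iteration limit correspond to the parameters mentioned elsewhere in this document.

```go
package main

import (
	"fmt"
	"math"
)

// pageRank is a single-machine power-iteration sketch, not Vermeer's
// distributed implementation. outEdges[v] lists the targets of v's out-edges.
func pageRank(outEdges [][]int, damping, tolerance float64, maxIter int) []float64 {
	n := len(outEdges)
	rank := make([]float64, n)
	for i := range rank {
		rank[i] = 1.0 / float64(n)
	}
	for iter := 0; iter < maxIter; iter++ {
		next := make([]float64, n)
		dangling := 0.0
		for v, targets := range outEdges {
			if len(targets) == 0 {
				dangling += rank[v] // no out-edges: spread this rank evenly
				continue
			}
			share := rank[v] / float64(len(targets))
			for _, t := range targets {
				next[t] += share
			}
		}
		delta := 0.0
		for v := range next {
			next[v] = (1-damping)/float64(n) + damping*(next[v]+dangling/float64(n))
			delta += math.Abs(next[v] - rank[v])
		}
		rank = next
		if delta < tolerance {
			break // converged before reaching maxIter
		}
	}
	return rank
}

func main() {
	// Three-vertex cycle 0 -> 1 -> 2 -> 0: every vertex is equally important.
	ranks := pageRank([][]int{{1}, {2}, {0}}, 0.85, 0.0001, 20)
	fmt.Printf("%.4f %.4f %.4f\n", ranks[0], ranks[1], ranks[2])
}
```

The tolerance check is what lets the loop stop early: once the total change between two iterations drops below it, further iterations would barely move the scores.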
Vermeer exposes a REST API on port 6688 (configurable in master.ini).
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/graphs | POST | Load graph from data source |
| /api/v1/graphs/{graph_id} | GET | Get graph metadata |
| /api/v1/graphs/{graph_id} | DELETE | Unload graph from memory |
| /api/v1/compute | POST | Execute algorithm on loaded graph |
| /api/v1/tasks/{task_id} | GET | Get task status and results |
| /api/v1/workers | GET | List connected workers |
| /ui/ | GET | Web UI dashboard |
```bash
# 1. Load graph from HugeGraph
curl -X POST http://localhost:6688/api/v1/graphs \
  -H "Content-Type: application/json" \
  -d '{
    "graph_name": "my_graph",
    "load_type": "hugegraph",
    "hugegraph": {
      "pd_peers": ["127.0.0.1:8686"],
      "graph_name": "hugegraph"
    }
  }'

# 2. Run PageRank
curl -X POST http://localhost:6688/api/v1/compute \
  -H "Content-Type: application/json" \
  -d '{
    "graph_name": "my_graph",
    "algorithm": "pagerank",
    "params": {
      "max_iterations": 20,
      "damping_factor": 0.85
    },
    "output": {
      "type": "hugegraph",
      "property_name": "pagerank_value"
    }
  }'

# 3. Check task status
curl http://localhost:6688/api/v1/tasks/{task_id}
```
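The same compute request can be issued from Go. The sketch below is a minimal client under a few assumptions: `postJSON` and `pageRankRequest` are hypothetical helpers written for this example, the response is treated as opaque text because its format is not described here, and a local stub server stands in for the master so the program runs without a cluster.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// postJSON marshals payload, POSTs it to url, and returns the HTTP status
// code and the raw response body. The response format is not assumed.
func postJSON(client *http.Client, url string, payload any) (int, string, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return 0, "", err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(raw), nil
}

// pageRankRequest mirrors the compute request from the curl example above.
func pageRankRequest(graphName string) map[string]any {
	return map[string]any{
		"graph_name": graphName,
		"algorithm":  "pagerank",
		"params":     map[string]any{"max_iterations": 20, "damping_factor": 0.85},
		"output":     map[string]any{"type": "hugegraph", "property_name": "pagerank_value"},
	}
}

func main() {
	// Stand-in for a Vermeer master so the sketch runs without a cluster;
	// it simply echoes the request body back.
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(w, r.Body)
	}))
	defer stub.Close()

	// Against a real deployment the base URL would be http://localhost:6688.
	status, body, err := postJSON(stub.Client(), stub.URL+"/api/v1/compute", pageRankRequest("my_graph"))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(status, body)
}
```

Returning the raw body keeps the helper usable for any endpoint in the table; decoding into a typed response would need the actual response schema.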
Vermeer integrates with HugeGraph via:
ScanPartition)

Configuration in graph load request:

```json
{
  "load_type": "hugegraph",
  "hugegraph": {
    "pd_peers": ["127.0.0.1:8686"],
    "graph_name": "hugegraph",
    "vertex_label": "person",
    "edge_label": "knows"
  }
}
```
Load graphs from local CSV files:
```json
{
  "load_type": "csv",
  "csv": {
    "vertex_file": "/path/to/vertices.csv",
    "edge_file": "/path/to/edges.csv",
    "delimiter": ","
  }
}
```
Load from Hadoop Distributed File System:
```json
{
  "load_type": "hdfs",
  "hdfs": {
    "namenode": "hdfs://namenode:9000",
    "vertex_path": "/graph/vertices",
    "edge_path": "/graph/edges"
  }
}
```
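The three request shapes above share one pattern: a `load_type` discriminator plus a block named after the source. A possible way to model that in a Go client is sketched below. The JSON field names come from the examples above; the Go type names are invented for illustration, and treating `vertex_label` and `edge_label` as optional is an assumption based on the shorter request in the example workflow.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Field names follow the JSON examples above; the Go types themselves are
// illustrative and are not taken from Vermeer's source.
type HugeGraphSource struct {
	PDPeers     []string `json:"pd_peers"`
	GraphName   string   `json:"graph_name"`
	VertexLabel string   `json:"vertex_label,omitempty"` // assumed optional
	EdgeLabel   string   `json:"edge_label,omitempty"`   // assumed optional
}

type CSVSource struct {
	VertexFile string `json:"vertex_file"`
	EdgeFile   string `json:"edge_file"`
	Delimiter  string `json:"delimiter"`
}

type HDFSSource struct {
	Namenode   string `json:"namenode"`
	VertexPath string `json:"vertex_path"`
	EdgePath   string `json:"edge_path"`
}

// LoadRequest carries exactly one source block, selected by LoadType.
type LoadRequest struct {
	LoadType  string           `json:"load_type"`
	HugeGraph *HugeGraphSource `json:"hugegraph,omitempty"`
	CSV       *CSVSource       `json:"csv,omitempty"`
	HDFS      *HDFSSource      `json:"hdfs,omitempty"`
}

// csvLoadRequest builds the CSV variant with a comma delimiter.
func csvLoadRequest(vertexFile, edgeFile string) LoadRequest {
	return LoadRequest{
		LoadType: "csv",
		CSV:      &CSVSource{VertexFile: vertexFile, EdgeFile: edgeFile, Delimiter: ","},
	}
}

func main() {
	req := csvLoadRequest("/path/to/vertices.csv", "/path/to/edges.csv")
	out, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Pointer fields with `omitempty` keep unused source blocks out of the serialized request, so only the block matching `load_type` is sent.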
Custom algorithms implement the Algorithm interface in algorithms/algorithms.go:
NOTE: The following is a simplified conceptual interface for illustration purposes. For actual algorithm implementation, see the WorkerComputer and MasterComputer interfaces defined in apps/compute/api.go.

```go
type Algorithm interface {
	// Initialize the algorithm
	Init(params map[string]interface{}) error

	// Compute one iteration for a vertex
	Compute(vertex *Vertex, messages []Message) (halt bool, outMessages []Message)

	// Aggregate global state (optional)
	Aggregate() interface{}

	// Check termination condition
	Terminate(iteration int) bool
}
```
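NOTE: This is a simplified conceptual example. Actual algorithms must implement the WorkerComputer interface. See vermeer/algorithms/degree.go for a working example.

```go
package algorithms

import "fmt"

// DegreeCount stores each vertex's out-degree as its value.
type DegreeCount struct {
	maxIter int
}

func (dc *DegreeCount) Init(params map[string]interface{}) error {
	// Use a checked type assertion so a missing or mistyped parameter
	// returns an error instead of panicking.
	maxIter, ok := params["max_iterations"].(int)
	if !ok {
		return fmt.Errorf("max_iterations must be an int")
	}
	dc.maxIter = maxIter
	return nil
}

func (dc *DegreeCount) Compute(vertex *Vertex, messages []Message) (bool, []Message) {
	// Store degree as vertex value
	vertex.SetValue(float64(len(vertex.OutEdges)))
	// Halt after first iteration
	return true, nil
}

// Aggregate is unused here; it exists only to satisfy the Algorithm interface.
func (dc *DegreeCount) Aggregate() interface{} {
	return nil
}

func (dc *DegreeCount) Terminate(iteration int) bool {
	return iteration >= dc.maxIter
}
```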
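To show how the `halt` flag, the outgoing messages and `Terminate` fit together, here is a self-contained sketch of a superstep loop in the bulk-synchronous style. Everything in it is hypothetical: the `Vertex` and `Message` types, the `run` driver and the `InDegree` algorithm (a message-based in-degree count, used instead of the out-degree example because it exercises messaging) are written for this illustration and do not reflect Vermeer's WorkerComputer or MasterComputer code.

```go
package main

import "fmt"

// Hypothetical minimal types; the conceptual interface leaves them undefined.
type Vertex struct {
	ID       int
	Value    float64
	OutEdges []int // target vertex IDs
}

func (v *Vertex) SetValue(x float64) { v.Value = x }

type Message struct {
	To    int
	Value float64
}

// Algorithm is the subset of the conceptual interface the driver needs.
type Algorithm interface {
	Compute(vertex *Vertex, messages []Message) (halt bool, outMessages []Message)
	Terminate(iteration int) bool
}

// InDegree counts in-edges: superstep 0 sends one message per out-edge,
// superstep 1 counts the messages each vertex received.
type InDegree struct{ maxIter int }

func (a *InDegree) Compute(v *Vertex, msgs []Message) (bool, []Message) {
	if len(msgs) > 0 {
		v.SetValue(float64(len(msgs))) // one message per in-edge
		return true, nil
	}
	out := make([]Message, 0, len(v.OutEdges))
	for _, target := range v.OutEdges {
		out = append(out, Message{To: target, Value: 1})
	}
	return true, out
}

func (a *InDegree) Terminate(iteration int) bool { return iteration >= a.maxIter }

// run executes supersteps until the algorithm terminates or no vertex is
// active, and returns the number of supersteps executed. It assumes every
// message targets an existing vertex.
func run(alg Algorithm, vertices []*Vertex) int {
	active := make(map[int]bool, len(vertices))
	for _, v := range vertices {
		active[v.ID] = true // every vertex starts active
	}
	inbox := map[int][]Message{}
	step := 0
	for ; !alg.Terminate(step) && len(active) > 0; step++ {
		nextInbox := map[int][]Message{}
		for _, v := range vertices {
			if !active[v.ID] {
				continue
			}
			halt, out := alg.Compute(v, inbox[v.ID])
			if halt {
				delete(active, v.ID)
			}
			for _, m := range out {
				nextInbox[m.To] = append(nextInbox[m.To], m)
			}
		}
		for id := range nextInbox {
			active[id] = true // an incoming message reactivates its target
		}
		inbox = nextInbox
	}
	return step
}

func main() {
	// Edges: 0 -> 1, 0 -> 2, 1 -> 2.
	vertices := []*Vertex{{ID: 0, OutEdges: []int{1, 2}}, {ID: 1, OutEdges: []int{2}}, {ID: 2}}
	steps := run(&InDegree{maxIter: 10}, vertices)
	for _, v := range vertices {
		fmt.Printf("vertex %d in-degree %.0f\n", v.ID, v.Value)
	}
	fmt.Println("supersteps:", steps)
}
```

Messages produced in one superstep are only delivered in the next, which is why the driver keeps two inboxes and swaps them at the end of each step.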
Register the algorithm in algorithms/algorithms.go:
```go
func init() {
	RegisterAlgorithm("degree_count", &DegreeCount{})
}
```
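One way such a registration call could be backed is a package-level map keyed by algorithm name. The sketch below is an assumption about the general pattern, not a copy of Vermeer's code: the `registry` variable, the `Lookup` function and the reduced one-method `Algorithm` interface are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Algorithm is reduced to a single method here to keep the sketch short.
type Algorithm interface {
	Init(params map[string]interface{}) error
}

type DegreeCount struct{ maxIter int }

func (dc *DegreeCount) Init(params map[string]interface{}) error {
	n, ok := params["max_iterations"].(int)
	if !ok {
		return fmt.Errorf("max_iterations must be an int")
	}
	dc.maxIter = n
	return nil
}

// registry maps algorithm names to implementations (hypothetical layout).
var registry = map[string]Algorithm{}

// RegisterAlgorithm panics on duplicates so name clashes surface at startup.
func RegisterAlgorithm(name string, alg Algorithm) {
	if _, exists := registry[name]; exists {
		panic("algorithm already registered: " + name)
	}
	registry[name] = alg
}

// Lookup returns the algorithm registered under name, if any.
func Lookup(name string) (Algorithm, bool) {
	alg, ok := registry[name]
	return alg, ok
}

func init() {
	RegisterAlgorithm("degree_count", &DegreeCount{})
}

func main() {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println("registered:", names)

	if alg, ok := Lookup("degree_count"); ok {
		fmt.Println("init error:", alg.Init(map[string]interface{}{"max_iterations": 5}))
	}
}
```

Because `init` functions run before `main`, every algorithm file can register itself without a central list; the name passed to `RegisterAlgorithm` is presumably the value a compute request supplies in its `algorithm` field.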
Vermeer uses an in-memory-first approach:
Best Practice: Provision total worker memory of roughly 2-3x the graph size, so that algorithms have workspace beyond the graph data itself.
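As a back-of-the-envelope aid for that rule of thumb, the sketch below divides the recommended total evenly across workers. The `perWorkerBytes` helper and the example figures (a 40 GiB graph on 4 workers) are made up for illustration; the in-memory graph size itself has to be measured or estimated separately.

```go
package main

import (
	"fmt"
	"math"
)

const gib = 1 << 30

// perWorkerBytes is a hypothetical sizing helper: it returns the memory each
// worker should provide so that the cluster total is headroom x graphBytes.
func perWorkerBytes(graphBytes uint64, headroom float64, workers int) uint64 {
	if workers <= 0 {
		return 0
	}
	total := float64(graphBytes) * headroom
	return uint64(math.Ceil(total / float64(workers)))
}

func main() {
	// Example figures only: a 40 GiB graph, 3x headroom, 4 workers.
	need := perWorkerBytes(40*gib, 3, 4)
	fmt.Printf("%.1f GiB per worker\n", float64(need)/gib)
}
```

The even split assumes the graph is partitioned uniformly across workers; skewed partitions would need extra margin on the heaviest worker.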
Run Vermeer as a daemon with automatic restarts and log rotation:
```bash
# Configuration in config/supervisor.conf
./tools/supervisord -c config/supervisor.conf -d
```
Sample supervisor configuration:
```ini
[program:vermeer-master]
command=/path/to/vermeer --env=master
autostart=true
autorestart=true
stdout_logfile=/var/log/vermeer-master.log
```
If you modify .proto files, regenerate Go code:
```bash
# Install protobuf Go plugins
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.28.0
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.2.0

# Generate (adjust protoc path for your platform)
# Please note: remove license header if any
vermeer/tools/protoc/linux64/protoc vermeer/apps/protos/*.proto --go-grpc_out=vermeer/apps/protos/. --go_out=vermeer/apps/protos/.
```
- damping_factor=0.85, tolerance=0.0001 for faster convergence
- weighted=true only if edge weights are meaningful

Access the Web UI dashboard at http://master-ip:6688/ui/ for:

- master_peer in worker.ini matches master's gRPC address
- 6689 (gRPC)
- compute_threads in worker config

See the main Contributing Guide for how to contribute to Vermeer.
Vermeer is part of Apache HugeGraph-Computer and is licensed under the Apache 2.0 License.