This file provides guidance to AI coding assistants when working with code in this repository.
This is the Apache HugeGraph-Computer repository, which contains two distinct graph computing systems: Computer (Java) and Vermeer (Go).
Both integrate with HugeGraph for graph data input/output.
Prerequisites:
- Run mvn clean install first to generate CRD classes under computer-k8s
Build:
cd computer
mvn clean compile -Dmaven.javadoc.skip=true
Tests:
# Unit tests
mvn test -P unit-test
# Integration tests
mvn test -P integrate-test
Run single test:
# Run specific test class
mvn test -P unit-test -Dtest=ClassName
# Run specific test method
mvn test -P unit-test -Dtest=ClassName#methodName
License check:
mvn apache-rat:check
Package:
mvn clean package -DskipTests
Prerequisites:
- curl and unzip (for downloading binary dependencies)
First-time setup:
cd vermeer
make init  # Downloads supervisord and protoc binaries, installs Go deps
Build:
make  # Build for current platform
make build-linux-amd64
make build-linux-arm64
Development build with hot-reload UI:
go build -tags=dev
Clean:
make clean      # Remove built binaries and generated assets
make clean-all  # Also remove downloaded tools
Run:
# Using binary directly
./vermeer --env=master
./vermeer --env=worker
# Using script (configure in vermeer.sh)
./vermeer.sh start master
./vermeer.sh start worker
Regenerate protobuf (if proto files changed):
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.28.0
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.2.0
# Generate (adjust protoc path for your platform)
vermeer/tools/protoc/linux64/protoc vermeer/apps/protos/*.proto --go-grpc_out=vermeer/apps/protos/. --go_out=vermeer/apps/protos/.
# Note: remove the license header, if any
Module Structure:
- computer-api: Public interfaces for graph processing (Computation, Vertex, Edge, Aggregator, Combiner, GraphFactory)
- computer-core: Runtime implementation (WorkerService, MasterService, messaging, BSP coordination, managers)
- computer-algorithm: Built-in algorithms (PageRank, LPA, WCC, SSSP, TriangleCount, etc.)
- computer-driver: Job submission and driver-side coordination
- computer-k8s: Kubernetes deployment integration
- computer-yarn: YARN deployment integration
- computer-k8s-operator: Kubernetes operator for job management
- computer-dist: Distribution packaging
- computer-test: Integration and unit tests
Key Design Patterns:
API/Implementation Separation: Algorithms depend only on computer-api interfaces; computer-core provides runtime implementation. Algorithms are dynamically loaded via config.
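The separation described above can be illustrated with a minimal sketch. It is not the actual computer-core loading code: SketchComputation, SketchPageRank, AlgorithmLoaderSketch, and the option key sketch.computation_class are hypothetical names invented for this example. It only shows the general idea of a runtime that instantiates an algorithm from a class name found in configuration, so the runtime never references a concrete algorithm at compile time.

```java
import java.util.Map;

// Hypothetical runtime-side loader; it never names a concrete algorithm.
final class AlgorithmLoaderSketch {

    // Assumed option key for illustration, not an actual HugeGraph-Computer option.
    static final String CLASS_KEY = "sketch.computation_class";

    static SketchComputation load(Map<String, String> config) {
        String className = config.get(CLASS_KEY);
        if (className == null) {
            throw new IllegalArgumentException("Missing option: " + CLASS_KEY);
        }
        try {
            // Resolve and instantiate the class named in the config.
            Class<?> clazz = Class.forName(className);
            return (SketchComputation) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config =
                Map.of(CLASS_KEY, SketchPageRank.class.getName());
        System.out.println(load(config).name());
    }
}

// Hypothetical stand-in for an interface that would live in the API module.
// The real Computation interface in computer-api has a richer signature.
interface SketchComputation {
    String name();
}

// Hypothetical algorithm; it depends only on the interface above.
class SketchPageRank implements SketchComputation {
    @Override
    public String name() {
        return "sketch_page_rank";
    }
}
```

Swapping algorithms in this sketch means changing only the configured class name; the loader itself stays untouched.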
Manager Pattern: WorkerService composes multiple managers (MessageSendManager, MessageRecvManager, WorkerAggrManager, DataServerManager, SortManagers, SnapshotManager, etc.) with lifecycle hooks: initAll(), beforeSuperstep(), afterSuperstep(), closeAll().
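As a rough illustration of the lifecycle hooks named above, the following sketch composes two managers and drives them through initAll(), beforeSuperstep(), afterSuperstep(), and closeAll(). All types here (SketchManager, RecordingManager, SketchManagers, ManagerLifecycleSketch) are simplified, hypothetical stand-ins; the real manager interface in computer-core has a different signature, and the iteration order used on close is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class ManagerLifecycleSketch {

    // Drive the hooks in lifecycle order and return the recorded calls.
    static List<String> run(int supersteps) {
        List<String> log = new ArrayList<>();
        SketchManagers managers = new SketchManagers();
        managers.add(new RecordingManager("send", log));
        managers.add(new RecordingManager("recv", log));

        managers.initAll();
        try {
            for (int step = 0; step < supersteps; step++) {
                managers.beforeSuperstep(step);
                // Vertex computation for this superstep would run here.
                managers.afterSuperstep(step);
            }
        } finally {
            // Close managers even if a superstep fails.
            managers.closeAll();
        }
        return log;
    }

    public static void main(String[] args) {
        run(2).forEach(System.out::println);
    }
}

// Hypothetical, simplified manager contract.
interface SketchManager {
    void init();
    void beforeSuperstep(int superstep);
    void afterSuperstep(int superstep);
    void close();
}

// Records every hook call so the ordering can be inspected.
final class RecordingManager implements SketchManager {
    private final String name;
    private final List<String> log;

    RecordingManager(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override public void init() { log.add(name + ".init"); }
    @Override public void beforeSuperstep(int s) { log.add(name + ".beforeSuperstep(" + s + ")"); }
    @Override public void afterSuperstep(int s) { log.add(name + ".afterSuperstep(" + s + ")"); }
    @Override public void close() { log.add(name + ".close"); }
}

// Hypothetical composite that fans each hook out to all registered managers.
final class SketchManagers {
    private final List<SketchManager> managers = new ArrayList<>();

    void add(SketchManager manager) { managers.add(manager); }
    void initAll() { managers.forEach(SketchManager::init); }
    void beforeSuperstep(int s) { managers.forEach(m -> m.beforeSuperstep(s)); }
    void afterSuperstep(int s) { managers.forEach(m -> m.afterSuperstep(s)); }
    void closeAll() { managers.forEach(SketchManager::close); }
}
```

Keeping the fan-out in one composite means the service that owns the managers only has to call four methods, regardless of how many managers are registered.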
BSP Coordination: Explicit barrier synchronization via etcd (EtcdBspClient). Each superstep follows:
1. workerStepPrepareDone → waitMasterStepPrepareDone
2. workerStepComputeDone → waitMasterStepComputeDone
3. workerStepDone → waitMasterStepDone (master returns SuperstepStat)
Computation Contract: Algorithms implement Computation<M extends Value>:
- compute0(context, vertex): Initialize at superstep 0
- compute(context, vertex, messages): Process messages in subsequent supersteps
- ComputationContext
Important Files:
- computer/computer-api/src/main/java/org/apache/hugegraph/computer/core/worker/Computation.java
- computer/computer-core/src/main/java/org/apache/hugegraph/computer/core/worker/WorkerService.java
- computer/computer-core/src/main/java/org/apache/hugegraph/computer/core/bsp/Bsp4Worker.java
- computer/computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm/centrality/pagerank/PageRank.java
Directory Structure:
- algorithms/: Go algorithm implementations (pagerank.go, sssp.go, louvain.go, etc.)
- apps/:
  - bsp/: BSP coordination helpers
  - graphio/: HugeGraph I/O adapters (reads via gRPC to store/pd, writes via HTTP REST)
  - master/: Master scheduling, HTTP endpoints, worker management
  - compute/: Worker-side compute logic
  - protos/: Generated protobuf/gRPC definitions
  - common/: Utilities, logging, metrics
- client/: Client libraries
- tools/: Binary dependencies (supervisord, protoc)
- ui/: Web UI assets
Key Patterns:
Maker/Registry Pattern: Graph loaders/writers register themselves via init() (e.g., LoadMakers[LoadTypeHugegraph] = &HugegraphMaker{}). Master selects loader by type.
HugeGraph Integration:
- hugegraph.go implements HugegraphMaker, HugegraphLoader, HugegraphWriter
Master-Worker: Master schedules LoadPartition tasks to workers, manages worker lifecycle via WorkerManager/WorkerClient, exposes HTTP admin endpoints.
Important Files:
- vermeer/apps/graphio/hugegraph.go
- vermeer/apps/master/tasks/tasks.go
- vermeer/apps/master/workers/workers.go
- vermeer/apps/master/services/http_master.go
- vermeer/apps/master/bl/scheduler_bl.go
Computer (Java):
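- WorkerInputManager reads vertices/edges from HugeGraph via GraphFactory abstraction
Vermeer (Go):
- apps/graphio adapters read from HugeGraph via gRPC (store/pd) and write via HTTP REST (see hugegraph.go)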
Adding a New Algorithm (Computer):
- Create a class in computer-algorithm implementing Computation<MessageType>
- Implement compute0() for initialization and compute() for message processing
- Use context.sendMessage() or context.sendMessageToAllEdges() for message passing
- beforeSuperstep(), read/write in compute()
K8s-Operator Development:
- Run mvn clean install in computer-k8s-operator first
- computer-k8s/target/generated-sources/
- computer-k8s-operator/crd-generate/Makefile
Vermeer Asset Updates:
- cd asset && go generate
- make generate-assets from vermeer root
- go build -tags=dev
Computer:
- .github/workflows/computer-ci.yml
- computer-dist/src/assembly/travis/
Vermeer:
- vermeer/test/, with vermeer_test.go and vermeer_test.sh
- vermeer/config/ (master.ini, worker.ini templates)
CI pipeline (.github/workflows/computer-ci.yml) runs:
- Integration tests (-P integrate-test)
- Unit tests (-P unit-test)

- Run mvn clean install before editing to generate CRD classes
- Run make init to download supervisord/protoc
- BSP_ETCD_URL