This document provides comprehensive configuration guidance for HugeGraph PD, including parameter descriptions, deployment scenarios, and production tuning recommendations.
PD uses the following configuration files (located in conf/ directory):
| File | Purpose |
|---|---|
| application.yml | Main PD configuration (gRPC, Raft, storage, etc.) |
| log4j2.xml | Logging configuration (log levels, appenders, rotation) |
| verify-license.json | License verification configuration (optional) |
```
application.yml
├── spring       # Spring Boot framework settings
├── management   # Actuator endpoints and metrics
├── logging      # Log configuration file location
├── license      # License verification (optional)
├── grpc         # gRPC server settings
├── server       # REST API server settings
├── pd           # PD-specific settings
├── raft         # Raft consensus settings
├── store        # Store node management settings
└── partition    # Partition management settings
```
Controls the gRPC server for inter-service communication.
```yaml
grpc:
  host: 127.0.0.1   # gRPC bind address
  port: 8686        # gRPC server port
```
| Parameter | Type | Default | Description |
|---|---|---|---|
grpc.host | String | 127.0.0.1 | IMPORTANT: Must be set to actual IP address (not 127.0.0.1) for distributed deployments. Store and Server nodes connect to this address. |
grpc.port | Integer | 8686 | gRPC server port. Ensure this port is accessible from Store and Server nodes. |
Production Notes:
- Set `grpc.host` to the node's actual IP address (e.g., 192.168.1.10)
- Avoid `0.0.0.0`, as it may cause service discovery issues
- Ensure `grpc.port` is reachable from Store and Server nodes

Controls the REST API server for management and monitoring.
```yaml
server:
  port: 8620   # REST API port
```
| Parameter | Type | Default | Description |
|---|---|---|---|
server.port | Integer | 8620 | REST API port for health checks, metrics, and management operations. |
Endpoints:
- `http://<host>:8620/actuator/health`
- `http://<host>:8620/actuator/metrics`
- `http://<host>:8620/actuator/prometheus`

Controls Raft consensus for PD cluster coordination.
```yaml
raft:
  address: 127.0.0.1:8610      # This node's Raft address
  peers-list: 127.0.0.1:8610   # All PD nodes in the cluster
```
| Parameter | Type | Default | Description |
|---|---|---|---|
raft.address | String | 127.0.0.1:8610 | Raft service address for this PD node. Format: <ip>:<port>. Must be unique across all PD nodes. |
raft.peers-list | String | 127.0.0.1:8610 | Comma-separated list of all PD nodes' Raft addresses. Used for cluster formation and leader election. |
Critical Rules:
- `raft.address` must be unique for each PD node
- `raft.peers-list` must be identical on all PD nodes
- `raft.peers-list` must contain all PD nodes (including this node)
- Use real IP addresses, not `127.0.0.1`, for multi-node clusters

Example (3-node cluster):
```yaml
# Node 1
raft:
  address: 192.168.1.10:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610

# Node 2
raft:
  address: 192.168.1.11:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610

# Node 3
raft:
  address: 192.168.1.12:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610
```
Controls PD-specific behavior.
```yaml
pd:
  data-path: ./pd_data                 # Metadata storage path
  patrol-interval: 1800                # Partition rebalancing interval (seconds)
  initial-store-count: 1               # Minimum stores for cluster availability
  initial-store-list: 127.0.0.1:8500   # Auto-activated stores
```
| Parameter | Type | Default | Description |
|---|---|---|---|
pd.data-path | String | ./pd_data | Directory for RocksDB metadata storage and Raft logs. Ensure sufficient disk space and fast I/O (SSD recommended). |
pd.patrol-interval | Integer | 1800 | Interval (in seconds) for partition health patrol and automatic rebalancing. Lower values = more frequent checks. |
pd.initial-store-count | Integer | 1 | Minimum number of Store nodes required for cluster to be operational. Set to expected initial store count. |
pd.initial-store-list | String | 127.0.0.1:8500 | Comma-separated list of Store gRPC addresses to auto-activate on startup. Useful for bootstrapping. |
Production Recommendations:
- `pd.data-path`: Use a dedicated SSD with at least 50GB free space
- `pd.patrol-interval`:
  - Testing: 300 (5 minutes) for fast rebalancing
  - Production: 1800 (30 minutes) to reduce overhead
  - Large clusters: 3600 (1 hour)
- `pd.initial-store-count`: Set to expected initial store count (e.g., 3 for 3 stores)

Controls how PD monitors and manages Store nodes.
```yaml
store:
  max-down-time: 172800             # Store permanent failure threshold (seconds)
  monitor_data_enabled: true        # Enable metrics collection
  monitor_data_interval: 1 minute   # Metrics collection interval
  monitor_data_retention: 1 day     # Metrics retention period
```
| Parameter | Type | Default | Description |
|---|---|---|---|
store.max-down-time | Integer | 172800 | Time (in seconds) after which a Store is considered permanently offline and its partitions are reallocated. Default: 48 hours. |
store.monitor_data_enabled | Boolean | true | Enable collection of Store metrics (CPU, memory, disk, partition count). |
store.monitor_data_interval | Duration | 1 minute | Interval for collecting Store metrics. Format: <value> <unit> (second, minute, hour). |
store.monitor_data_retention | Duration | 1 day | Retention period for historical metrics. Format: <value> <unit> (day, month, year). |
Production Recommendations:
- `store.max-down-time`:
  - Testing: 300 (5 minutes) for fast failover testing
  - Production: 86400 (24 hours) to avoid false positives during maintenance
  - Conservative: 172800 (48 hours) for environments with network instability
- `store.monitor_data_interval`:
  - Testing: 10 seconds
  - Production: 1 minute
  - Large clusters: 5 minutes
- `store.monitor_data_retention`:
  - Testing: 1 day
  - Production: 7 days
  - Long-term analysis: 30 days (requires more disk space)

Controls partition allocation and replication.
```yaml
partition:
  default-shard-count: 1      # Replicas per partition
  store-max-shard-count: 12   # Max partitions per store
```
| Parameter | Type | Default | Description |
|---|---|---|---|
partition.default-shard-count | Integer | 1 | Number of replicas per partition. Typically 3 in production for high availability. |
partition.store-max-shard-count | Integer | 12 | Maximum number of partition replicas a single Store can hold. Used for initial partition allocation. |
Initial Partition Count Calculation:
```
initial_partitions = (store_count * store_max_shard_count) / default_shard_count
```
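For capacity planning, the formula above can be evaluated with a small helper; this is an illustrative sketch (the function name is ours, not part of PD):

```shell
# Hypothetical helper mirroring PD's initial-partition formula
initial_partitions() {
  local stores=$1 max_shards=$2 replicas=$3
  echo $(( stores * max_shards / replicas ))
}

initial_partitions 3 12 3   # prints 12
```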
Example:
- Given 3 stores with store-max-shard-count=12 and default-shard-count=3
- Initial partitions: (3 * 12) / 3 = 12 partitions
- Shards per store: 12 * 3 / 3 = 12 shards (4 partitions as leader + 8 as follower)

Production Recommendations:
- `partition.default-shard-count`:
  - Testing: 1 (no replication)
  - Production: 3 (standard HA configuration)
  - Critical systems: 5 (maximum fault tolerance)
- `partition.store-max-shard-count`:
  - Small clusters: 10-20
  - Medium clusters: 50-100
  - Large clusters: 200-500

Controls Spring Boot Actuator endpoints for monitoring.
```yaml
management:
  metrics:
    export:
      prometheus:
        enabled: true   # Enable Prometheus metrics export
  endpoints:
    web:
      exposure:
        include: "*"    # Expose all actuator endpoints
```
| Parameter | Type | Default | Description |
|---|---|---|---|
management.metrics.export.prometheus.enabled | Boolean | true | Enable Prometheus-compatible metrics at /actuator/prometheus. |
management.endpoints.web.exposure.include | String | "*" | Actuator endpoints to expose. "*" = all, or specify comma-separated list (e.g., "health,metrics"). |
Minimal configuration for local development.
```yaml
grpc:
  host: 127.0.0.1
  port: 8686
server:
  port: 8620
raft:
  address: 127.0.0.1:8610
  peers-list: 127.0.0.1:8610
pd:
  data-path: ./pd_data
  patrol-interval: 300               # Fast rebalancing for testing
  initial-store-count: 1
  initial-store-list: 127.0.0.1:8500
store:
  max-down-time: 300                 # Fast failover for testing
  monitor_data_enabled: true
  monitor_data_interval: 10 seconds
  monitor_data_retention: 1 day
partition:
  default-shard-count: 1             # No replication
  store-max-shard-count: 10
```
Characteristics:
- Single PD node, no fault tolerance
- No replication (default-shard-count=1)

Recommended configuration for production deployments.
```yaml
# Node 1: 192.168.1.10
grpc:
  host: 192.168.1.10
  port: 8686
server:
  port: 8620
raft:
  address: 192.168.1.10:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610
pd:
  data-path: /data/pd/metadata
  patrol-interval: 1800
  initial-store-count: 3
  initial-store-list: 192.168.1.20:8500,192.168.1.21:8500,192.168.1.22:8500
store:
  max-down-time: 86400              # 24 hours
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 7 days
partition:
  default-shard-count: 3            # Triple replication
  store-max-shard-count: 50
```

```yaml
# Node 2: 192.168.1.11
grpc:
  host: 192.168.1.11
  port: 8686
server:
  port: 8620
raft:
  address: 192.168.1.11:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610
pd:
  data-path: /data/pd/metadata
  patrol-interval: 1800
  initial-store-count: 3
  initial-store-list: 192.168.1.20:8500,192.168.1.21:8500,192.168.1.22:8500
store:
  max-down-time: 86400
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 7 days
partition:
  default-shard-count: 3
  store-max-shard-count: 50
```

```yaml
# Node 3: 192.168.1.12
grpc:
  host: 192.168.1.12
  port: 8686
server:
  port: 8620
raft:
  address: 192.168.1.12:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610
pd:
  data-path: /data/pd/metadata
  patrol-interval: 1800
  initial-store-count: 3
  initial-store-list: 192.168.1.20:8500,192.168.1.21:8500,192.168.1.22:8500
store:
  max-down-time: 86400
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 7 days
partition:
  default-shard-count: 3
  store-max-shard-count: 50
```
Characteristics:
- Triple replication (default-shard-count=3)
- Store nodes bootstrapped via initial-store-list

Network Requirements:
- PD nodes must reach each other on the Raft port (8610)
- Store and Server nodes must reach PD on the gRPC port (8686)
- Monitoring systems need access to the REST port (8620)
Configuration for mission-critical deployments requiring maximum fault tolerance.
```yaml
# Node 1: 192.168.1.10
grpc:
  host: 192.168.1.10
  port: 8686
raft:
  address: 192.168.1.10:8610
  peers-list: 192.168.1.10:8610,192.168.1.11:8610,192.168.1.12:8610,192.168.1.13:8610,192.168.1.14:8610
pd:
  data-path: /data/pd/metadata
  patrol-interval: 3600             # Lower frequency for large clusters
  initial-store-count: 5
  initial-store-list: 192.168.1.20:8500,192.168.1.21:8500,192.168.1.22:8500,192.168.1.23:8500,192.168.1.24:8500
store:
  max-down-time: 172800             # 48 hours (conservative)
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 30 days   # Long-term retention
partition:
  default-shard-count: 3            # Or 5 for extreme HA
  store-max-shard-count: 100
```
Characteristics:
- 5-node PD Raft cluster (tolerates two simultaneous PD node failures)
- Conservative failure handling (max-down-time: 48 hours)
- 30-day metrics retention for long-term analysis
JVM options are specified via the startup script (bin/start-hugegraph-pd.sh).
```shell
# Option 1: Via startup script flag
bin/start-hugegraph-pd.sh -j "-Xmx8g -Xms8g"

# Option 2: Edit start-hugegraph-pd.sh directly
JAVA_OPTIONS="-Xmx8g -Xms8g -XX:+UseG1GC"
```
Recommendations by Cluster Size:
| Cluster Size | Partitions | Heap Size | Notes |
|---|---|---|---|
| Small (1-3 stores, <100 partitions) | <100 | -Xmx2g -Xms2g | Development/testing |
| Medium (3-10 stores, 100-1000 partitions) | 100-1000 | -Xmx4g -Xms4g | Standard production |
| Large (10-50 stores, 1000-10000 partitions) | 1000-10000 | -Xmx8g -Xms8g | Large production |
| X-Large (50+ stores, 10000+ partitions) | 10000+ | -Xmx16g -Xms16g | Enterprise scale |
Key Principles:
- Set `-Xms` equal to `-Xmx` to avoid heap resizing

G1GC (Default, Recommended):
```shell
bin/start-hugegraph-pd.sh -g g1 -j "-Xmx8g -Xms8g \
  -XX:MaxGCPauseMillis=200 \
  -XX:G1HeapRegionSize=16m \
  -XX:InitiatingHeapOccupancyPercent=45"
```
ZGC (Low-Latency, Java 11+):
```shell
bin/start-hugegraph-pd.sh -g ZGC -j "-Xmx8g -Xms8g \
  -XX:ZCollectionInterval=30"
```
GC logging can be enabled by adding:

```
-Xlog:gc*:file=logs/gc.log:time,uptime,level,tags:filecount=10,filesize=100M
```
Raft parameters are typically sufficient with defaults, but can be tuned for specific scenarios.
Increase election timeout for high-latency networks.
Default: 1000ms (1 second)
Tuning (requires code changes in RaftEngine.java):
```java
// In hg-pd-core/.../raft/RaftEngine.java
nodeOptions.setElectionTimeoutMs(3000); // 3 seconds
```
When to increase: high-latency or cross-datacenter networks, or when frequent leader elections occur without actual node failures.
Control how often Raft snapshots are created.
Default: 3600 seconds (1 hour)
Tuning (in RaftEngine.java):
```java
nodeOptions.setSnapshotIntervalSecs(7200); // 2 hours
```
Recommendation: longer intervals reduce snapshot I/O overhead but increase Raft log replay time after a restart.
PD uses RocksDB for metadata storage. Optimize for your workload.
SSD Optimization (default, recommended): the default RocksDB options assume SSD storage; no changes are typically needed.
HDD Optimization: if using HDD (not recommended for production), customize the RocksDB options:
```java
// In MetadataRocksDBStore.java, customize RocksDB options
Options options = new Options()
    .setCompactionStyle(CompactionStyle.LEVEL)
    .setWriteBufferSize(64 * 1024 * 1024)  // 64MB
    .setMaxWriteBufferNumber(3)
    .setLevelCompactionDynamicLevelBytes(true);
```
Key Metrics to Monitor:
For high-throughput scenarios, tune gRPC connection pool size.
Client-Side (in PDClient):
```java
PDConfig config = PDConfig.builder()
    .pdServers("192.168.1.10:8686,192.168.1.11:8686,192.168.1.12:8686")
    .maxChannels(5)  // Number of gRPC channels per PD node
    .build();
```
Recommendations:
- Low throughput: maxChannels=1
- Standard production: maxChannels=3-5
- High throughput: maxChannels=10+

Optimize OS-level TCP settings for low latency.
```shell
# Increase TCP buffer sizes
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"

# Reduce TIME_WAIT connections
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_fin_timeout=30
```
| Metric | Threshold | Action |
|---|---|---|
| PD Leader Changes | >2 per hour | Investigate network stability, increase election timeout |
| Raft Log Lag | >1000 entries | Check follower disk I/O, network latency |
| Store Heartbeat Failures | >5% | Check Store node health, network connectivity |
| Partition Imbalance | >20% deviation | Reduce patrol-interval, check rebalancing logic |
| GC Pause Time | >500ms | Tune GC settings, increase heap size |
| Disk Usage (pd.data-path) | >80% | Clean up old snapshots, expand disk, reduce monitor_data_retention |
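The thresholds above can be encoded as Prometheus alerting rules. The sketch below uses the PD metric names shown later in this document; the rule names and durations are our own choices, not PD defaults:

```yaml
groups:
  - name: hugegraph-pd
    rules:
      - alert: PDFrequentLeaderChanges
        expr: changes(pd_raft_state[1h]) > 2
        for: 5m
        annotations:
          summary: "PD leader changed more than twice in the last hour"
      - alert: PDStoreOffline
        expr: pd_store_count{state="Offline"} > 0
        for: 10m
        annotations:
          summary: "One or more Store nodes are offline"
```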
```yaml
scrape_configs:
  - job_name: 'hugegraph-pd'
    static_configs:
      - targets:
          - '192.168.1.10:8620'
          - '192.168.1.11:8620'
          - '192.168.1.12:8620'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 15s
```
Key panels to create:
Located at conf/log4j2.xml.
```xml
<Loggers>
    <!-- PD application logs -->
    <Logger name="org.apache.hugegraph.pd" level="INFO"/>
    <!-- Raft consensus logs (verbose, set to WARN in production) -->
    <Logger name="com.alipay.sofa.jraft" level="WARN"/>
    <!-- RocksDB logs -->
    <Logger name="org.rocksdb" level="WARN"/>
    <!-- gRPC logs -->
    <Logger name="io.grpc" level="WARN"/>
    <!-- Root logger -->
    <Root level="INFO">
        <AppenderRef ref="RollingFile"/>
        <AppenderRef ref="Console"/>
    </Root>
</Loggers>
```
Recommendations:
- Development: DEBUG for detailed tracing
- Production: INFO (default) or WARN for lower overhead
- Troubleshooting: temporarily switch to DEBUG

```xml
<RollingFile name="RollingFile" fileName="logs/hugegraph-pd.log"
             filePattern="logs/hugegraph-pd-%d{yyyy-MM-dd}-%i.log.gz">
    <PatternLayout>
        <Pattern>%d{ISO8601} [%t] %-5level %logger{36} - %msg%n</Pattern>
    </PatternLayout>
    <Policies>
        <TimeBasedTriggeringPolicy interval="1" modulate="true"/>
        <SizeBasedTriggeringPolicy size="100 MB"/>
    </Policies>
    <DefaultRolloverStrategy max="30"/>
</RollingFile>
```
Configuration: logs rotate daily or at 100 MB, with up to 30 compressed files retained.
```shell
curl http://localhost:8620/actuator/health
```
Response (healthy):
{ "status": "UP" }
```shell
curl http://localhost:8620/actuator/metrics
```
Available Metrics:
- pd.raft.state: Raft state (0=Follower, 1=Candidate, 2=Leader)
- pd.store.count: Number of stores by state
- pd.partition.count: Total partitions
- jvm.memory.used: JVM memory usage
- jvm.gc.pause: GC pause times

```shell
curl http://localhost:8620/actuator/prometheus
```
Sample Output:
```
# HELP pd_raft_state Raft state
# TYPE pd_raft_state gauge
pd_raft_state 2.0
# HELP pd_store_count Store count by state
# TYPE pd_store_count gauge
pd_store_count{state="Up"} 3.0
pd_store_count{state="Offline"} 0.0
# HELP pd_partition_count Total partitions
# TYPE pd_partition_count gauge
pd_partition_count 36.0
```
- `grpc.host` set to actual IP address (not 127.0.0.1)
- `raft.address` unique for each PD node
- `raft.peers-list` identical on all PD nodes
- `raft.peers-list` contains all PD node addresses
- `pd.data-path` has sufficient disk space (>50GB)
- `pd.initial-store-count` matches expected store count
- `partition.default-shard-count` = 3 (for production HA)

```shell
# Check Raft configuration
grep -A2 "^raft:" conf/application.yml

# Verify peers list on all nodes
for node in 192.168.1.{10,11,12}; do
  echo "Node $node:"
  ssh $node "grep peers-list /path/to/conf/application.yml"
done

# Check port accessibility
nc -zv 192.168.1.10 8620 8686 8610
```
Key configuration guidelines:
- For distributed deployments, use real IP addresses, never 127.0.0.1

For architecture details, see Architecture Documentation.
For API usage, see API Reference.
For development, see Development Guide.