This file provides guidance to an AI coding tool when working with code in this repository.
hugegraph-struct is a foundational data structures module that defines the core abstractions shared across HugeGraph distributed components. This module must be built before hugegraph-pd and hugegraph-store as they depend on its structure definitions.
Key Responsibilities:
    # From hugegraph-struct directory
    mvn clean install -DskipTests

    # Build with tests (if any exist in future)
    mvn clean install

    # From parent directory (hugegraph root)
    mvn install -pl hugegraph-struct -am -DskipTests
This module is a critical dependency for distributed components:
    # Correct build order for distributed components:
    # 1. Build hugegraph-struct first
    mvn install -pl hugegraph-struct -am -DskipTests

    # 2. Then build PD
    mvn clean package -pl hugegraph-pd -am -DskipTests

    # 3. Then build Store
    mvn clean package -pl hugegraph-store -am -DskipTests
org.apache.hugegraph/
├── struct/schema/ # Schema element definitions
│ ├── SchemaElement # Base class for all schema types
│ ├── VertexLabel # Vertex label definitions
│ ├── EdgeLabel # Edge label definitions
│ ├── PropertyKey # Property key definitions
│ ├── IndexLabel # Index label definitions
│ └── builder/ # Builder pattern implementations
├── structure/ # Graph element structures
│ ├── BaseElement # Base class for vertices/edges
│ ├── BaseVertex # Vertex implementation
│ ├── BaseEdge # Edge implementation
│ ├── BaseProperty # Property implementation
│ └── builder/ # Element builders
├── type/ # Type system
│ ├── HugeType # Enum for all graph types (VERTEX, EDGE, etc.)
│ ├── GraphType # Type interface
│ ├── Namifiable # Name-based types
│ ├── Idfiable # ID-based types
│ └── define/ # Type definitions (DataType, IdStrategy, etc.)
├── id/ # ID generation and management
│ ├── Id # ID interface
│ ├── IdGenerator # ID generation utilities
│ ├── EdgeId # Edge-specific ID handling
│ └── IdUtil # ID utility methods
├── serializer/ # Binary serialization
│ ├── BytesBuffer # Buffer for binary I/O
│ ├── BinaryElementSerializer # Element serialization
│ └── DirectBinarySerializer # Direct binary access
├── query/ # Query abstractions
│ ├── Query # Base query interface
│ ├── ConditionQuery # Conditional queries
│ ├── IdQuery # ID-based queries
│ ├── Condition # Query conditions
│ └── Aggregate # Aggregation queries
├── analyzer/ # Text analyzers (Chinese NLP)
│ ├── Analyzer # Base analyzer interface
│ ├── AnalyzerFactory # Factory for creating analyzers
│ ├── IKAnalyzer # IK Chinese word segmentation
│ ├── JiebaAnalyzer # Jieba segmentation
│ ├── HanLPAnalyzer # HanLP NLP
│ ├── AnsjAnalyzer # Ansj segmentation
│ ├── WordAnalyzer # Word-based analysis
│ ├── JcsegAnalyzer # Jcseg segmentation
│ ├── MMSeg4JAnalyzer # MMSeg4J segmentation
│ └── SmartCNAnalyzer # Lucene SmartCN
├── auth/ # Authentication utilities
│ ├── TokenGenerator # JWT token generation
│ └── AuthConstant # Auth constants
├── backend/ # Backend abstractions
│ ├── BinaryId # Binary ID representation
│ ├── BackendColumn # Column abstraction
│ └── Shard # Shard information
├── options/ # Configuration options
│ ├── CoreOptions # Core configuration
│ └── AuthOptions # Auth configuration
├── util/ # Utilities
│ ├── StringEncoding # String encoding utilities
│ ├── GraphUtils # Graph utility methods
│ ├── LZ4Util # LZ4 compression
│ ├── Blob # Binary blob handling
│ └── collection/ # Collection utilities (IdSet, CollectionFactory)
└── exception/ # Exception hierarchy
  ├── HugeException # Base exception
  ├── BackendException # Backend errors
  ├── NotSupportException # Unsupported operations
  ├── NotFoundException # Not found errors
  └── NotAllowException # Permission errors
The module defines a dual schema hierarchy:
- struct.schema.*: Schema element definitions (VertexLabel, EdgeLabel, etc.) - these are metadata about the graph structure
- structure.*: Actual graph elements (BaseVertex, BaseEdge, etc.) - these are data instances

The schema layer defines the “blueprint” while the structure layer implements the “instances”.
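The blueprint/instance split can be illustrated with a minimal, self-contained sketch. `LabelSketch` and `VertexSketch` are hypothetical stand-ins invented for this example; they are not the module's `VertexLabel` or `BaseVertex`, whose real fields and validation rules may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical "schema" side: metadata describing what a vertex may contain.
class LabelSketch {
    final String name;
    final Set<String> allowedProperties;

    LabelSketch(String name, Set<String> allowedProperties) {
        this.name = name;
        this.allowedProperties = allowedProperties;
    }
}

// Hypothetical "structure" side: a data instance that refers to its blueprint.
class VertexSketch {
    final LabelSketch label;
    final Map<String, Object> properties = new HashMap<>();

    VertexSketch(LabelSketch label) {
        this.label = label;
    }

    // Reject any property that the blueprint does not declare.
    VertexSketch property(String key, Object value) {
        if (!label.allowedProperties.contains(key)) {
            throw new IllegalArgumentException("Undeclared property: " + key);
        }
        properties.put(key, value);
        return this;
    }
}

class SchemaVsStructureDemo {
    public static void main(String[] args) {
        LabelSketch person = new LabelSketch("person", Set.of("name", "age"));
        VertexSketch vertex = new VertexSketch(person).property("name", "marko");
        System.out.println(vertex.label.name + " " + vertex.properties);
    }
}
```

The point of the sketch is only the direction of the dependency: instances hold a reference to their schema element, never the reverse.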
The HugeType enum (type/HugeType.java) defines all possible types:
- VERTEX_LABEL, EDGE_LABEL, PROPERTY_KEY, INDEX_LABEL
- VERTEX, EDGE, PROPERTY, AGGR_PROPERTY_V, AGGR_PROPERTY_E
- META, COUNTER, TASK, OLAP, INDEX

IDs are critical for distributed systems:
- Id interface provides abstraction over different ID types
- IdGenerator creates IDs based on strategy (AUTO_INCREMENT, PRIMARY_KEY, CUSTOMIZE)
- EdgeId uses special encoding: source vertex ID + edge label ID + sort values + target vertex ID

BytesBuffer and serializers enable:
Query classes provide backend-agnostic query building:
- Query: Base interface with limit, offset, ordering
- ConditionQuery: Supports conditions (EQ, GT, LT, IN, CONTAINS, etc.)
- IdQuery: Direct ID-based lookups
- Aggregate: Aggregation operations (SUM, MAX, MIN, AVG)

Multiple Chinese NLP libraries for different use cases:
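The concrete analyzers are the ones listed under analyzer/ in the package tree. As a rough illustration of how a factory can select among interchangeable analyzers by name, here is a minimal sketch. Every type in it (`TextAnalyzerSketch`, `WhitespaceAnalyzerSketch`, `BigramAnalyzerSketch`, `AnalyzerFactorySketch`) is hypothetical, and the method names are assumptions; the module's real `Analyzer` interface and `AnalyzerFactory` may look different. The bigram analyzer is a simple dictionary-free fallback and does not stand in for any of the listed NLP libraries.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical interface; the real Analyzer in analyzer/ may differ.
interface TextAnalyzerSketch {
    Set<String> segment(String text);
}

// Splits on whitespace; adequate for space-delimited languages only.
class WhitespaceAnalyzerSketch implements TextAnalyzerSketch {
    public Set<String> segment(String text) {
        Set<String> terms = new LinkedHashSet<>();
        for (String term : text.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }
}

// Emits overlapping two-character terms, which needs no dictionary.
class BigramAnalyzerSketch implements TextAnalyzerSketch {
    public Set<String> segment(String text) {
        Set<String> terms = new LinkedHashSet<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            terms.add(text.substring(i, i + 2));
        }
        return terms;
    }
}

// Adding an analyzer means implementing the interface and adding one case here.
class AnalyzerFactorySketch {
    static TextAnalyzerSketch create(String name) {
        switch (name.toLowerCase()) {
            case "whitespace":
                return new WhitespaceAnalyzerSketch();
            case "bigram":
                return new BigramAnalyzerSketch();
            default:
                throw new IllegalArgumentException("Unknown analyzer: " + name);
        }
    }

    public static void main(String[] args) {
        System.out.println(create("whitespace").segment("hello graph world"));
        System.out.println(create("bigram").segment("abcd"));
    }
}
```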
When adding or modifying schema elements in struct/schema/:
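- Extend the SchemaElement base class
- Implement the required interfaces (Namifiable, Typifiable)
- Add a corresponding HugeType enum value if needed
- Update serialization in BinaryElementSerializer
- Add a builder in struct/schema/builder/

When modifying serialization: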
- Changes to the BytesBuffer format require version migration

To add a new text analyzer:
- Implement the Analyzer interface in analyzer/
- Register it in AnalyzerFactory

    // Schema elements use builders
    PropertyKey propertyKey = schema.propertyKey("name")
                                    .asText()
                                    .valueSingle()
                                    .create();
    // Generate IDs based on strategy
    Id id = IdGenerator.of(value, IdType.LONG);
    Id edgeId = EdgeId.parse(sourceId, direction, label, sortValues, targetId);
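To make the composite edge ID idea concrete (source vertex ID + edge label ID + sort values + target vertex ID), here is a minimal, self-contained sketch. `EdgeIdSketch` and its `>` delimiter are assumptions made for illustration only; the module's `EdgeId` has its own encoding and, as the snippet above shows, also takes a direction, which this sketch omits.

```java
// Hypothetical sketch; this is not the module's EdgeId class.
final class EdgeIdSketch {
    private static final String SEP = ">"; // assumed delimiter, illustration only

    final String sourceId;
    final long labelId;
    final String sortValues;
    final String targetId;

    EdgeIdSketch(String sourceId, long labelId, String sortValues, String targetId) {
        this.sourceId = sourceId;
        this.labelId = labelId;
        this.sortValues = sortValues;
        this.targetId = targetId;
    }

    // Concatenate the four parts in a fixed order.
    String asString() {
        return String.join(SEP, sourceId, Long.toString(labelId), sortValues, targetId);
    }

    // Split the encoded form back into its parts.
    static EdgeIdSketch parse(String encoded) {
        // A limit of -1 keeps empty fields, such as empty sort values.
        String[] parts = encoded.split(SEP, -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 parts: " + encoded);
        }
        return new EdgeIdSketch(parts[0], Long.parseLong(parts[1]), parts[2], parts[3]);
    }

    public static void main(String[] args) {
        EdgeIdSketch id = new EdgeIdSketch("1:marko", 2L, "", "1:josh");
        System.out.println(id.asString());
    }
}
```

One consequence of leading with the source vertex ID and label ID is that edges sharing them also share a prefix, so they sort next to each other in an ordered store. The sketch assumes the individual parts never contain the delimiter; a real encoding has to handle that case.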
    // Write to buffer
    BytesBuffer buffer = BytesBuffer.allocate(size);
    buffer.writeId(id);
    buffer.writeString(name);

    // Read from buffer (distinct variable names avoid redeclaring id and name)
    Id decodedId = buffer.readId();
    String decodedName = buffer.readString();
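The write-then-read pattern can be tried without the module on the classpath. The sketch below uses the JDK's `java.nio.ByteBuffer` as a stand-in, with a length-prefixed UTF-8 string layout chosen purely for illustration; it is not the module's `BytesBuffer`, and the real wire format is likely different. `BufferRoundTripSketch` and its helper methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch built on java.nio.ByteBuffer, not the module's BytesBuffer.
final class BufferRoundTripSketch {

    // Write a string as a 4-byte length followed by its UTF-8 bytes.
    static void writeString(ByteBuffer buf, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        buf.putInt(bytes.length);
        buf.put(bytes);
    }

    // Read a string written by writeString.
    static String readString(ByteBuffer buf) {
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putLong(42L);         // stand-in for an ID
        writeString(buf, "name");

        buf.flip();               // switch from writing to reading

        long id = buf.getLong();
        String name = readString(buf);
        System.out.println(id + " " + name);
    }
}
```

Values must be read in the same order and with the same layout they were written in, which is why a change to the buffer format affects every reader of previously stored data.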
This module is referenced by:
This module follows Apache Software Foundation guidelines:
install-dist/release-docs/licenses/