Architecture and Design

Hierarchical Architecture System

Architecture

Interface Layer

  • Functional Modules:
    • REST API: Provides HTTP interfaces that comply with the OpenAPI 3.0 specification.
    • Web UI: An interactive management interface based on Vue 3.
  • Interaction Protocol: All interface requests are transmitted via HTTP.

Core Layer

| Module | Responsibility Description |
| --- | --- |
| Server | Cluster metadata management, Job scheduling, and global state maintenance |
| Agent | Host-level service lifecycle management (deployment/start-stop/configuration) |
| LLM | Generate intelligent operation and maintenance suggestions based on natural language processing |
| Stack | Define component stacks |
| gRPC | Implement a two-way communication protocol between the Server and the Agent |
| Job | A task execution unit that records operation status and logs |

Component Layer

  • Supported Components: Including but not limited to ZooKeeper, Hadoop, and Kafka.
  • Extension Mechanism: Define installation scripts and configuration templates for new components through the Stack module.

Cluster Topology Rules

Cluster

Deployment Constraints

  • Service Instances:
    • The Server runs as a single instance on one Host.
    • At most one Agent instance can be deployed on each Host.
  • Resource Allocation:
    • Each Host can only join one cluster.
    • The Agent manages all components on its Host.
  • Communication Path:
graph TD
    Server-->|Management|Cluster-A
    Server-->|Management|Cluster-B
    Cluster-A-->|Agent|Host-1
    Cluster-A-->|Agent|Host-2
    Cluster-B-->|Agent|Host-3
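
The deployment constraints above can be sketched as a small in-memory check. This is only an illustration: the `Topology` class, its method names, and `TopologyError` are hypothetical and are not taken from the project's code. It enforces the two rules stated in the list: a Host joins at most one cluster, and a Host runs at most one Agent.

```python
from dataclasses import dataclass, field


class TopologyError(ValueError):
    """Raised when a deployment constraint would be violated (hypothetical type)."""


@dataclass
class Topology:
    # Host name -> cluster name. Each Host can only join one cluster.
    host_cluster: dict[str, str] = field(default_factory=dict)
    # Hosts that already run an Agent. At most one Agent per Host.
    agent_hosts: set[str] = field(default_factory=set)

    def join_cluster(self, host: str, cluster: str) -> None:
        current = self.host_cluster.get(host)
        if current is not None and current != cluster:
            raise TopologyError(f"{host} already belongs to {current}")
        self.host_cluster[host] = cluster

    def deploy_agent(self, host: str) -> None:
        if host in self.agent_hosts:
            raise TopologyError(f"{host} already runs an Agent")
        self.agent_hosts.add(host)

    def hosts_of(self, cluster: str) -> list[str]:
        # Hosts of one cluster, sorted by name for stable output.
        return sorted(h for h, c in self.host_cluster.items() if c == cluster)
```

With this sketch, registering Host-1 and Host-2 under Cluster-A and Host-3 under Cluster-B reproduces the diagram, while a second `deploy_agent("Host-1")` or an attempt to move Host-1 into Cluster-B raises `TopologyError`.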

Job Processing Flow

Command

Instruction Execution Phase

  • Request Reception:
    • Users initiate operation requests (such as starting Kafka) through the REST API or Web UI.
    • After the Server verifies the permissions, it creates the corresponding Job record.
  • Job Scheduling:
graph LR
    A[Job Enqueue] --> B(Job Queue)
    B --> C{Scheduling Strategy}
    C -->|FIFO| D[Split Task]
    D --> E[Distribute to Agent via gRPC]
  • Script Execution:
    • The Agent loads the corresponding script from the Stack.
    • Execution logs are written to the local log file in real time.
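
The scheduling steps above (enqueue, FIFO selection, task split, distribution) might look roughly like the following. This is a minimal sketch under stated assumptions: `Job`, `Task`, and `FifoScheduler` are hypothetical names, the one-task-per-host split is an assumption because the split granularity is not specified here, and the `send` callback merely stands in for the gRPC call to an Agent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    job_id: int
    command: str      # e.g. "start kafka"
    hosts: list[str]  # Hosts the operation targets


@dataclass
class Task:
    job_id: int
    host: str
    command: str


class FifoScheduler:
    def __init__(self, send: Callable[[Task], None]) -> None:
        # `send` is a placeholder for the gRPC call to an Agent (assumption).
        self._queue: deque[Job] = deque()
        self._send = send

    def enqueue(self, job: Job) -> None:
        self._queue.append(job)

    def split(self, job: Job) -> list[Task]:
        # Assumption: one Task per target Host.
        return [Task(job.job_id, host, job.command) for host in job.hosts]

    def run_next(self) -> int:
        """Dispatch the oldest queued Job and return the number of Tasks sent."""
        if not self._queue:
            return 0
        job = self._queue.popleft()  # FIFO: first enqueued, first dispatched
        tasks = self.split(job)
        for task in tasks:
            self._send(task)
        return len(tasks)
```

Passing `send=sent.append` with an ordinary list `sent` is enough to observe the dispatch order without a network: Jobs leave the queue in the order they were enqueued, and each Job's Tasks are sent together.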

State Management Mechanism

| State Type | Trigger Condition | Handling Strategy |
| --- | --- | --- |
| PENDING | Task created but not scheduled | Wait for invocation |
| RUNNING | Task has been issued to the Agent | Monitor the timeout threshold |
| SUCCESSFUL/FAILED | The Agent returns the execution result | Update component status |
| CANCELED | The preceding task fails | Cancel subsequent tasks |
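
The state table above can be read as a small state machine. The sketch below is an illustration, not the project's implementation: the `State` enum mirrors the table, while `TRANSITIONS`, `advance`, and `apply_results` are hypothetical helpers. It assumes that tasks run sequentially and that a canceled task moves directly from PENDING to CANCELED. Timeout monitoring is left out.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESSFUL = "SUCCESSFUL"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


# Allowed transitions as read from the table (PENDING -> CANCELED is an assumption).
TRANSITIONS = {
    State.PENDING: {State.RUNNING, State.CANCELED},
    State.RUNNING: {State.SUCCESSFUL, State.FAILED},
}


def advance(current: State, new: State) -> State:
    """Return the new state, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


def apply_results(results: list[bool]) -> list[State]:
    """Simulate sequential tasks; results[i] is the outcome reported for task i.

    After the first failure, every remaining task is canceled.
    """
    states = [State.PENDING] * len(results)
    failed = False
    for i, ok in enumerate(results):
        if failed:
            states[i] = advance(states[i], State.CANCELED)
            continue
        states[i] = advance(states[i], State.RUNNING)
        states[i] = advance(states[i], State.SUCCESSFUL if ok else State.FAILED)
        failed = not ok
    return states
```

For example, `apply_results([True, False, True])` leaves the first task SUCCESSFUL, the second FAILED, and the third CANCELED, which matches the last row of the table.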