For APM, agents or SDKs are just technical details of how the libraries are instrumented. Manual vs. automatic instrumentation says nothing about the architecture, so in this document we consider both of them simply as a client lib.
The basic design principles of the SkyWalking architecture are: easy to maintain, controllable, and streaming.
To achieve these goals, the SkyWalking backend provides the following designs.
The SkyWalking collector is based on a pure modulization design. End users can switch or assemble collector features according to their own requirements.
A module defines a collection of features, which could include technical implementations (such as gRPC/Jetty server management), trace analysis (such as trace segment or Zipkin span parsers), or aggregation features. What a module covers is entirely decided by the module definition and its implementors.
Each module can define its services as Java interfaces, and every provider of the module must supply implementations of these services. A provider should also declare its dependency modules based on its own implementation. This means that two different providers of the same module may depend on different modules.
Also, the collector modulization core checks the startup sequence; if a cyclic dependency or a missing dependency is found, the collector is terminated by the core.
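The module/provider split described above can be sketched as follows. This is a minimal illustration only; the interface, class, and method names here are hypothetical, not SkyWalking's real API.

```java
// Hypothetical sketch of the module/provider design described above.
// All interface, class, and method names are illustrative, not SkyWalking's real API.
import java.util.Arrays;
import java.util.List;

// A module declares its services as plain Java interfaces.
interface NamingService {
    String findCollectors();
}

// A provider implements the module's services and declares which other
// modules this particular implementation depends on; a different provider
// of the same module could declare a different dependency list.
class JettyNamingProvider implements NamingService {
    static List<String> requiredModules() {
        // This Jetty-based provider needs the cluster module.
        return Arrays.asList("cluster");
    }

    @Override
    public String findCollectors() {
        return "localhost:10800";
    }
}

public class ModuleSketch {
    public static void main(String[] args) {
        NamingService naming = new JettyNamingProvider();
        System.out.println(naming.findCollectors());          // prints localhost:10800
        System.out.println(JettyNamingProvider.requiredModules());
    }
}
```

Because dependencies are declared per provider rather than per module, the core can build the startup sequence from whichever providers are actually selected in application.yml.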
The collector starts all modules that are declared in application.yml. In this yaml file, `cluster` is a module name, `zookeeper` is the provider of the `cluster` module, and `hostPort` and `sessionTimeout` are required attributes of `zookeeper`. An example part of the yaml definition:

```yaml
cluster:
  zookeeper:
    hostPort: localhost:2181
    sessionTimeout: 100000
naming:
  jetty:
    # OS real network IP (binding required), for agents to find the collector cluster
    host: localhost
    port: 10800
    contextPath: /
```
First of all, the collector provides two types of connections over two protocols (HTTP and gRPC): an HTTP naming service, which returns the available collectors in the cluster, and a gRPC uplink service, which receives the monitoring data.
For example, in the SkyWalking Java agent, `collector.servers` refers to the naming service, which maps to `naming/jetty/ip:port` of the collector over HTTP, while `collector.direct_servers` sets the uplink servers directly, using gRPC to send monitoring data.

Example of the process flow between the client lib and the collector cluster:
```
Client lib                                                Collector1  Collector2  Collector3
(Set collector.servers=Collector2)                        (Collector 1,2,3 constitute the cluster)
    |
    +-----------> naming service ---------------------------->|
    |<------- receive gRPC IP:Port(s) of Collector 1,2,3---<--|
    |
    | Select a random gRPC service,
    | for example Collector3
    |
    +-------------------------> uplink gRPC service ----------------------------------->|
```
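The naming-then-uplink selection above can be sketched like this. It is a toy illustration assuming the naming call returns the cluster's gRPC endpoints; the endpoint list is hard-coded and all names are hypothetical.

```java
// Minimal sketch of the naming-then-uplink flow; names and the hard-coded
// endpoint list are illustrative assumptions, not SkyWalking's real API.
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class UplinkSelector {
    // Step 1: ask any configured collector's HTTP naming service for all
    // available gRPC endpoints (hard-coded here for illustration).
    static List<String> queryNamingService(String namingServer) {
        return Arrays.asList("collector1:11800", "collector2:11800", "collector3:11800");
    }

    // Step 2: select one gRPC endpoint at random for the uplink.
    static String selectUplink(List<String> endpoints, Random random) {
        return endpoints.get(random.nextInt(endpoints.size()));
    }

    public static void main(String[] args) {
        List<String> endpoints = queryNamingService("collector2:10800");
        System.out.println(selectUplink(endpoints, new Random()));
    }
}
```

The key point is that the agent only needs one reachable naming address to discover the whole cluster, and then spreads its uplink load across the returned endpoints.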
When collectors run in cluster mode, they must discover each other in some way. By default, SkyWalking uses Zookeeper for coordination and as the register center for instance discovery.
As shown in the Multiple connection ways section above, the client lib does not use Zookeeper to find the cluster, and we suggest clients shouldn't do so. The cluster discovery mechanism is switchable, provided by the modulization core; relying on Zookeeper directly would break that switchability.
We hope the community provides more implementors for cluster discovery, such as Eureka, Consul, or Kubernetes.
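Because the provider under the `cluster` module is selected in application.yml, adopting a different discovery implementation would only mean swapping that section. A hypothetical example, assuming a Consul-based provider existed (the `consul` provider name and its attribute are purely illustrative; no such implementor is implied to ship today):

```yaml
cluster:
  consul:                      # hypothetical provider name, for illustration only
    hostPort: localhost:8500   # assumed attribute, mirroring the zookeeper example
```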
The streaming mode is like a lightweight Storm/Spark implementation: it provides APIs to build a streaming process graph (DAG) and to declare the input/output data contracts of each node.
New modules can find and extend the existing process graph.
There are three cases in processing
With these features, the collector cluster runs like a streaming network, aggregating the metrics without relying on the storage implementor to support concurrent writes to the same metric id.
Because the streaming mode takes care of concurrency, the storage implementor's responsibilities are to provide high-speed writes and group queries.
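The node-and-graph idea can be sketched as a chain of typed transformations. This is a toy, assuming a functional-style API; none of these names are SkyWalking's actual streaming classes.

```java
// A toy streaming process graph: each node declares its input/output data
// contract through generics, and nodes are chained into a linear DAG.
// All names are illustrative, not SkyWalking's actual streaming API.
import java.util.function.Function;

public class StreamGraphSketch {
    // A "node" is just a typed transformation from input I to output O.
    static <I, O> Function<I, O> node(Function<I, O> f) {
        return f;
    }

    // Build a two-node graph: parse a raw segment into a metric value,
    // then aggregate the metric into a result string.
    static String process(String segment) {
        Function<String, Integer> parse = node(String::length);      // String -> Integer contract
        Function<Integer, String> aggregate = node(n -> "sum=" + n); // Integer -> String contract
        return parse.andThen(aggregate).apply(segment);
    }

    public static void main(String[] args) {
        System.out.println(process("trace-segment"));  // prints sum=13
    }
}
```

The generics make each node's input/output contract explicit, so a mismatched chain fails at compile time rather than at runtime.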
Right now, we support ElasticSearch as the primary implementor, H2 for preview, and MySQL relational database clusters managed by the ShardingSphere project.
Besides the principles of the collector design, the UI is another core component of SkyWalking. It is based on React, Antd, and a Zuul proxy to provide collector cluster discovery, query dispatch, and visualization.
The Web UI shares a process flow similar to the client's 1. naming then 2. uplink mechanism described in the Multiple connection ways section. The only difference is that the uplink is replaced by a GraphQL query protocol over HTTP, bound to the host and port under `ui/jetty/` in the yaml definition (default: `localhost:12800`).
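Following the same application.yml conventions as the collector example above, the UI binding would be configured like this (a sketch: only the default host and port come from the text, and `contextPath` is assumed by analogy with the `naming/jetty` example):

```yaml
ui:
  jetty:
    host: localhost   # default
    port: 12800       # default
    contextPath: /    # assumed, mirroring the naming/jetty example
```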