This is often asked by those who are first introduced to SkyWalking. Many believe that MQ should have better performance and should be able to support higher throughput, like the following:
Here's what we think.
This question arises when users consider the circumstances where the OAP cluster may not be powerful enough or becomes offline. But the following issues must first be addressed:
With the analysis above in mind, why would you still want the traces to be 100%, given the resources they would cost? The preferred way to do this would be adding a better dynamic trace sampling mechanism at the backend. When throughput exceeds the threshold, gradually modify the active sampling rate from 100% to 10%, which means you could get the OAP and ES 3 times more powerful than usual, while ignoring the traces at peak times.
Even though MQ transport is not recommended from the production perspective, SkyWalking still provides optional plugins named kafka-reporter
and kafka-fetcher
for this feature since 8.1.0.
The answer is that the MQ metrics data exporter is already readily available. The exporter module with gRPC default mechanism is there, and you can easily provide a new implementor of this module.