| # Query Protocol |
| The Query Protocol defines a set of GraphQL APIs that provide data query and interaction capabilities to the SkyWalking |
| native visualization tool and 3rd-party systems, including the Web UI, CLI, and private systems. |
| |
| The official query protocol repository is https://github.com/apache/skywalking-query-protocol. |
| |
| ### Metadata |
| Metadata contains concise information on all services and their instances, endpoints, etc. under monitoring. |
| You may query the metadata in different ways. |
| ```graphql |
| extend type Query { |
| # Normal service related meta info |
| getAllServices(duration: Duration!, group: String): [Service!]! |
| searchServices(duration: Duration!, keyword: String!): [Service!]! |
| searchService(serviceCode: String!): Service |
| |
| # Fetch all services of Browser type |
| getAllBrowserServices(duration: Duration!): [Service!]! |
| searchBrowserServices(duration: Duration!, keyword: String!): [Service!]! |
| searchBrowserService(serviceCode: String!): Service |
| |
| # Service instance query |
| getServiceInstances(duration: Duration!, serviceId: ID!): [ServiceInstance!]! |
| |
| # Endpoint query |
| # Because there may be a huge number of endpoints, |
| # queries must filter by the owning service's id, a keyword, and a limit. |
| searchEndpoint(keyword: String!, serviceId: ID!, limit: Int!): [Endpoint!]! |
| getEndpointInfo(endpointId: ID!): EndpointInfo |
| |
| # Database related meta info. |
| getAllDatabases(duration: Duration!): [Database!]! |
| getTimeInfo: TimeInfo |
| } |
| ``` |
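As a rough illustration of how a client might call one of these queries, the Python sketch below assembles the JSON body of a GraphQL request for `getAllServices`. The selected fields `id` and `name` and the helper name `build_get_all_services_request` are assumptions made for this example; the `Service` type itself is not shown in this document.

```python
import json

# Query document for getAllServices. The selected fields (`id`, `name`) are
# assumed for illustration; check the Service type in the query-protocol
# repository for the real field list.
GET_ALL_SERVICES = """
query ($duration: Duration!, $group: String) {
  getAllServices(duration: $duration, group: $group) {
    id
    name
  }
}
"""


def build_get_all_services_request(start, end, step, group=None):
    """Return the JSON-serializable body of a GraphQL POST request."""
    return {
        "query": GET_ALL_SERVICES,
        "variables": {
            # Shape follows the Duration input: start, end, step.
            "duration": {"start": start, "end": end, "step": step},
            # `group` is optional in the schema, so None is allowed.
            "group": group,
        },
    }


body = build_get_all_services_request("2017-11-08 09", "2017-11-08 19", "HOUR")
print(json.dumps(body["variables"], indent=2))
```

The resulting dictionary can be sent with any HTTP client to the GraphQL endpoint of the OAP server; the endpoint address depends on the deployment and is not covered here.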
| |
| ### Topology |
| Topology queries return the topology and dependency graphs among services, instances, and endpoints, covering both direct relationships and global maps. |
| |
| ```graphql |
| extend type Query { |
| # Query the global topology |
| getGlobalTopology(duration: Duration!): Topology |
| # Query the topology, based on the given service |
| getServiceTopology(serviceId: ID!, duration: Duration!): Topology |
| # Query the topology, based on the given services. |
| # `#getServiceTopology` could be replaced by this. |
| getServicesTopology(serviceIds: [ID!]!, duration: Duration!): Topology |
| # Query the instance topology, based on the given clientServiceId and serverServiceId |
| getServiceInstanceTopology(clientServiceId: ID!, serverServiceId: ID!, duration: Duration!): ServiceInstanceTopology |
| # Query the topology, based on the given endpoint |
| getEndpointTopology(endpointId: ID!, duration: Duration!): Topology |
| # v2 of getEndpointTopology |
| getEndpointDependencies(endpointId: ID!, duration: Duration!): EndpointTopology |
| } |
| ``` |
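The schema comment notes that `getServiceTopology` could be replaced by `getServicesTopology`. A minimal Python sketch of that idea follows: a single helper accepts either one service id or several and always targets `getServicesTopology`. The helper name `topology_request` and the selected `Topology` fields (`nodes`, `calls`, and their sub-fields) are assumptions for illustration only.

```python
# Query document; the selected Topology fields are assumed for illustration
# and may differ from the real type definition.
GET_SERVICES_TOPOLOGY = """
query ($serviceIds: [ID!]!, $duration: Duration!) {
  getServicesTopology(serviceIds: $serviceIds, duration: $duration) {
    nodes { id name }
    calls { source target }
  }
}
"""


def topology_request(service_ids, duration):
    """Build a getServicesTopology request from one id or many ids."""
    # A single id is wrapped into a list, which covers the
    # getServiceTopology use case with the plural query.
    if isinstance(service_ids, str):
        service_ids = [service_ids]
    ids = list(service_ids)
    if not ids:
        raise ValueError("at least one service id is required")
    return {
        "query": GET_SERVICES_TOPOLOGY,
        "variables": {"serviceIds": ids, "duration": duration},
    }


duration = {"start": "2017-11-08 09", "end": "2017-11-08 19", "step": "HOUR"}
single = topology_request("service-a", duration)
several = topology_request(["service-a", "service-b"], duration)
```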
| |
| ### Metrics |
| Metrics query targets all objects defined in [OAL script](../concepts-and-designs/oal.md) and [MAL](../concepts-and-designs/mal.md). |
| You may obtain the metrics data in linear or thermodynamic matrix formats, depending on the aggregation functions used in the script. |
| |
| #### V2 APIs |
| Metrics V2 query APIs have been provided since 8.0.0. They cover metadata, single/multiple values, heatmap, and sampled records metrics. |
| ```graphql |
| extend type Query { |
| # Metrics definition metadata query. Returns the metrics type, which determines the suitable query methods. |
| typeOfMetrics(name: String!): MetricsType! |
| # Get the list of all available metrics in the current OAP server. |
| # The `regex` parameter can be used to filter the metrics by name. |
| listMetrics(regex: String): [MetricDefinition!]! |
| |
| # Read metrics single value in the duration of required metrics |
| readMetricsValue(condition: MetricsCondition!, duration: Duration!): Long! |
| # Read time-series values in the duration of required metrics |
| readMetricsValues(condition: MetricsCondition!, duration: Duration!): MetricsValues! |
| # Read entity list of required metrics and parent entity type. |
| sortMetrics(condition: TopNCondition!, duration: Duration!): [SelectedRecord!]! |
| # Read value in the given time duration, usually as a linear. |
| # labels: the labels you need to query. |
| readLabeledMetricsValues(condition: MetricsCondition!, labels: [String!]!, duration: Duration!): [MetricsValues!]! |
| # A heatmap is a bucket-based statistical result of values. |
| readHeatMap(condition: MetricsCondition!, duration: Duration!): HeatMap |
| # Deprecated since 9.3.0, replaced by readRecords defined in record.graphqls |
| # Read the sampled records |
| # TopNCondition#scope is not required. |
| readSampledRecords(condition: TopNCondition!, duration: Duration!): [SelectedRecord!]! |
| } |
| ``` |
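Since `typeOfMetrics` returns the type that determines the suitable query method, a client could dispatch on it before reading values. The Python sketch below shows one way to do that. The `MetricsType` names used here (`REGULAR_VALUE`, `LABELED_VALUE`, `HEATMAP`, `SAMPLED_RECORD`) and the mapping itself are assumptions for illustration; this document does not list the enum values.

```python
# Assumed MetricsType values mapped to a query from the V2 list that fits
# each one. Both the enum names and this mapping are illustrative guesses.
QUERY_FOR_TYPE = {
    "REGULAR_VALUE": "readMetricsValues",
    "LABELED_VALUE": "readLabeledMetricsValues",
    "HEATMAP": "readHeatMap",
    # readSampledRecords is marked deprecated since 9.3.0 in the schema
    # comments, which name readRecords as its replacement.
    "SAMPLED_RECORD": "readRecords",
}


def pick_query(metrics_type):
    """Return the name of the query to use for a given metrics type."""
    try:
        return QUERY_FOR_TYPE[metrics_type]
    except KeyError:
        raise ValueError(
            f"no known query for metrics type {metrics_type!r}"
        ) from None
```

A caller would first run `typeOfMetrics(name: ...)`, pass the returned value to `pick_query`, and then build the matching request.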
| |
| #### V1 APIs |
| Three types of metrics can be queried. V1 APIs were introduced in 6.x and are now a shell over the V2 APIs. |
| 1. Single value. Most default metrics are in single value. `getValues` and `getLinearIntValues` are suitable for this purpose. |
| 1. Multiple value. A metric defined in OAL includes multiple value calculations. Use `getMultipleLinearIntValues` to obtain all values. `percentile` is a typical multiple value function in OAL. |
| 1. Heatmap value. Read [Heatmap in WIKI](https://en.wikipedia.org/wiki/Heat_map) for details. `thermodynamic` is the only OAL function. Use `getThermodynamic` to get the values. |
| ```graphql |
| extend type Query { |
| getValues(metric: BatchMetricConditions!, duration: Duration!): IntValues |
| getLinearIntValues(metric: MetricCondition!, duration: Duration!): IntValues |
| # Query metrics that include multiple values, and format them as multiple lines. |
| # The sequence of these lines is based on the calculation function in OAL. |
| # For example, use this to query the result of the function percentile(50,75,90,95,99) in OAL; |
| # five lines are then returned, and p50 is the first element of the return value. |
| getMultipleLinearIntValues(metric: MetricCondition!, numOfLinear: Int!, duration: Duration!): [IntValues!]! |
| getThermodynamic(metric: MetricCondition!, duration: Duration!): Thermodynamic |
| } |
| ``` |
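The ordering rule for `getMultipleLinearIntValues` (lines come back in the order of the OAL function arguments, so p50 is first for `percentile(50,75,90,95,99)`) can be sketched in Python as follows. The helper name `label_percentile_lines` and the sample numbers are made up for illustration; each "line" here is simply a list of integers standing in for one returned `IntValues` entry.

```python
def label_percentile_lines(ranks, lines):
    """Pair each returned line with its percentile rank, in order.

    `ranks` mirrors the arguments of the OAL percentile function and
    `lines` is the list returned by the query, in the same order.
    """
    if len(ranks) != len(lines):
        raise ValueError(
            f"expected {len(ranks)} lines (numOfLinear), got {len(lines)}"
        )
    return {f"p{rank}": line for rank, line in zip(ranks, lines)}


ranks = [50, 75, 90, 95, 99]
# Made-up sample data: five lines, each with three time points.
lines = [
    [10, 12, 11],
    [20, 22, 21],
    [30, 35, 33],
    [40, 48, 44],
    [90, 95, 99],
]
labeled = label_percentile_lines(ranks, lines)
```

Here `numOfLinear` corresponds to `len(ranks)`, so the length check guards against requesting a different number of lines than the OAL function produces.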
| |
| Metrics are defined in the `config/oal/*.oal` files. |
| |
| ### Aggregation |
| Aggregation query means that the metrics data need a secondary aggregation at query stage, which causes the query |
| interfaces to have some different arguments. A typical example of aggregation query is the `TopN` list of services. |
| Metrics stream aggregation simply calculates the metrics values of each service, but the expected list requires ordering metrics data |
| by their values. |
| |
| Aggregation query is for single value metrics only. |
| |
| ```graphql |
| # The aggregation query is different from the metric query. |
| # All aggregation queries require the backend and/or storage to do aggregation at query time. |
| extend type Query { |
| # TopN is an aggregation query. |
| getServiceTopN(name: String!, topN: Int!, duration: Duration!, order: Order!): [TopNEntity!]! |
| getAllServiceInstanceTopN(name: String!, topN: Int!, duration: Duration!, order: Order!): [TopNEntity!]! |
| getServiceInstanceTopN(serviceId: ID!, name: String!, topN: Int!, duration: Duration!, order: Order!): [TopNEntity!]! |
| getAllEndpointTopN(name: String!, topN: Int!, duration: Duration!, order: Order!): [TopNEntity!]! |
| getEndpointTopN(serviceId: ID!, name: String!, topN: Int!, duration: Duration!, order: Order!): [TopNEntity!]! |
| } |
| ``` |
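To make the idea of secondary aggregation concrete, the Python sketch below orders already-computed per-service metric values and keeps the first N, which is conceptually what a `TopN` query asks the backend or storage to do. It is only an illustration, not the OAP implementation; the order values `DES` and `ASC` are assumed names for the `Order` enum, and the service names and numbers are made up.

```python
def top_n(values_by_entity, n, order="DES"):
    """Return the first `n` (entity, value) pairs ordered by value.

    `values_by_entity` holds one single-value metric per entity, i.e. the
    result of the stream aggregation. Sorting it is the secondary step.
    """
    if order not in ("DES", "ASC"):
        raise ValueError(f"unsupported order: {order!r}")
    ranked = sorted(
        values_by_entity.items(),
        key=lambda item: item[1],
        reverse=(order == "DES"),
    )
    return ranked[:n]


# Made-up per-service values for a single-value metric.
values = {"service-a": 3, "service-b": 10, "service-c": 7}
highest_two = top_n(values, 2)
lowest_one = top_n(values, 1, order="ASC")
```

The sketch also shows why this kind of query only fits single-value metrics: the ordering needs exactly one comparable value per entity.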
| |
| ### Record |
| Record is a general and abstract type for collected raw data. |
| In observability, traces and logs have specific and well-defined meanings, while general records represent other |
| collected data, such as sampled slow SQL statements and raw HTTP request data (request/response headers and bodies). |
| |
| ```graphql |
| extend type Query { |
| # Query collected records with given metric name and parent entity conditions, and return in the requested order. |
| readRecords(condition: RecordCondition!, duration: Duration!): [Record!]! |
| } |
| ``` |
| |
| ### Logs |
| ```graphql |
| extend type Query { |
| # Return true if the current storage implementation supports fuzzy query for logs. |
| supportQueryLogsByKeywords: Boolean! |
| queryLogs(condition: LogQueryCondition): Logs |
| |
| # Test the logs and get the results of the LAL output. |
| test(requests: LogTestRequest!): LogTestResponse! |
| } |
| ``` |
| |
| Log implementations vary between different database options. Some search engines like ElasticSearch and OpenSearch can support |
| full log text fuzzy queries, while others do not due to considerations related to performance impact and end user experience. |
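Because fuzzy log queries depend on the storage, a client may want to check `supportQueryLogsByKeywords` before adding keywords to a log query. The Python sketch below shows that guard. The field name `keywordsOfContent` and the contents of `base_condition` are assumptions for illustration, since `LogQueryCondition` is not defined in this document.

```python
def build_log_condition(base_condition, keywords, supports_keywords):
    """Return a log query condition, adding keywords only if supported.

    `supports_keywords` is the Boolean returned by
    supportQueryLogsByKeywords. `keywordsOfContent` is an assumed field
    name used for illustration.
    """
    condition = dict(base_condition)  # copy, so the caller's dict is untouched
    if keywords and supports_keywords:
        condition["keywordsOfContent"] = list(keywords)
    return condition


# Hypothetical base condition; real field names may differ.
base = {"serviceId": "service-a"}
with_keywords = build_log_condition(base, ["timeout"], supports_keywords=True)
without_keywords = build_log_condition(base, ["timeout"], supports_keywords=False)
```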
| |
| The `test` API serves as a debugging tool for native LAL parsing. |
| |
| ### Trace |
| ```graphql |
| extend type Query { |
| queryBasicTraces(condition: TraceQueryCondition): TraceBrief |
| queryTrace(traceId: ID!): Trace |
| } |
| ``` |
| |
| Trace query fetches trace segment lists and spans of given trace IDs. |
| |
| ### Alarm |
| ```graphql |
| extend type Query { |
| getAlarmTrend(duration: Duration!): AlarmTrend! |
| getAlarm(duration: Duration!, scope: Scope, keyword: String, paging: Pagination!, tags: [AlarmTag]): Alarms |
| } |
| ``` |
| |
| Alarm query identifies alarms and related events. |
| |
| ### Event |
| ```graphql |
| extend type Query { |
| queryEvents(condition: EventQueryCondition): Events |
| } |
| ``` |
| |
| Event query fetches the event list based on given sources and time range conditions. |
| |
| ## Condition |
| ### Duration |
| Duration is a widely used parameter type, as APM data is time-related. See the following for more details. |
| The step determines the precision. |
| ```graphql |
| # The Duration defines the start and end time for each query operation. |
| # Fields: `start` and `end` |
| # represent the time span. Each of them must match the format of the step. |
| # ref https://www.ietf.org/rfc/rfc3339.txt |
| # The time formats are |
| # `SECOND` step: yyyy-MM-dd HHmmss |
| # `MINUTE` step: yyyy-MM-dd HHmm |
| # `HOUR` step: yyyy-MM-dd HH |
| # `DAY` step: yyyy-MM-dd |
| # `MONTH` step: yyyy-MM |
| # Field: `step` |
| # represents the precision of the time points. |
| # e.g. |
| # if step==HOUR , start=2017-11-08 09, end=2017-11-08 19 |
| # then |
| # metrics from the following time points are expected |
| # 2017-11-08 09:00 -> 2017-11-08 19:00 |
| # there are 11 time points (hours) in the time span. |
| input Duration { |
| start: String! |
| end: String! |
| step: Step! |
| } |
| |
| enum Step { |
| MONTH |
| DAY |
| HOUR |
| MINUTE |
| SECOND |
| } |
| ``` |
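The time formats and the time-point example in the comments above can be reproduced with a short Python sketch. It formats `start` and `end` to match the chosen step and counts the time points in the span, inclusive of both ends. The helper names `make_duration` and `count_time_points` are made up for this example, and the count assumes `start` and `end` are already aligned to the step.

```python
from datetime import datetime

# strftime patterns matching the formats listed in the Duration comments.
STEP_FORMATS = {
    "SECOND": "%Y-%m-%d %H%M%S",
    "MINUTE": "%Y-%m-%d %H%M",
    "HOUR": "%Y-%m-%d %H",
    "DAY": "%Y-%m-%d",
    "MONTH": "%Y-%m",
}

# Length of each fixed-size step in seconds (MONTH is handled separately).
STEP_SECONDS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600, "DAY": 86400}


def make_duration(start, end, step):
    """Build a Duration input with start/end formatted for the step."""
    fmt = STEP_FORMATS[step]
    return {"start": start.strftime(fmt), "end": end.strftime(fmt), "step": step}


def count_time_points(start, end, step):
    """Count time points between start and end, both ends included."""
    if step == "MONTH":
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    return int((end - start).total_seconds()) // STEP_SECONDS[step] + 1


start = datetime(2017, 11, 8, 9)
end = datetime(2017, 11, 8, 19)
duration = make_duration(start, end, "HOUR")
points = count_time_points(start, end, "HOUR")
```

With the values from the schema comment (09:00 to 19:00 at `HOUR` step), `points` matches the 11 time points mentioned there.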