SkyWalking leverages the OpenTelemetry Collector with the AWS Container Insights Receiver to transfer metrics to the OpenTelemetry receiver and into the Meter System.
The AWS Container Insights Receiver provides multi-dimensional metrics for the EKS cluster, nodes, services, etc. Accordingly, SkyWalking observes the status and payload of the EKS cluster, which is cataloged as a `LAYER: AWS_EKS` `Service` in the OAP. Meanwhile, the k8s nodes are recognized as `LAYER: AWS_EKS` `instance`s, and the k8s services are recognized as `endpoint`s.
SkyWalking distinguishes AWS Cloud EKS metrics by the `job_name` attribute, whose value is `aws-cloud-eks-monitoring`. You can use an OTEL Collector processor to add the attribute as follows:

    processors:
      resource/job-name:
        attributes:
          - key: job_name
            value: aws-cloud-eks-monitoring
            action: insert

Note: if you don't specify the `job_name` attribute, SkyWalking OAP will ignore the metrics.

Monitoring Panel | Unit | Metric Name | Catalog | Description | Data Source |
---|---|---|---|---|---|
Node Count | | eks_cluster_node_count | Service | The node count of the EKS cluster | AWS Container Insights Receiver |
Failed Node Count | | eks_cluster_failed_node_count | Service | The failed node count of the EKS cluster | AWS Container Insights Receiver |
Pod Count (namespace dimension) | | eks_cluster_namespace_count | Service | The pod count in the EKS cluster (namespace dimension) | AWS Container Insights Receiver |
Pod Count (service dimension) | | eks_cluster_service_count | Service | The pod count in the EKS cluster (service dimension) | AWS Container Insights Receiver |
Network RX Dropped Count (per second) | count/s | eks_cluster_net_rx_dropped | Service | Network RX dropped count | AWS Container Insights Receiver |
Network RX Error Count (per second) | count/s | eks_cluster_net_rx_error | Service | Network RX error count | AWS Container Insights Receiver |
Network TX Dropped Count (per second) | count/s | eks_cluster_net_tx_dropped | Service | Network TX dropped count | AWS Container Insights Receiver |
Network TX Error Count (per second) | count/s | eks_cluster_net_tx_error | Service | Network TX error count | AWS Container Insights Receiver |
Pod Count | | eks_cluster_node_pod_number | Instance | The count of pods running on the node | AWS Container Insights Receiver |
CPU Utilization | percent | eks_cluster_node_cpu_utilization | Instance | The CPU Utilization of the node | AWS Container Insights Receiver |
Memory Utilization | percent | eks_cluster_node_memory_utilization | Instance | The Memory Utilization of the node | AWS Container Insights Receiver |
Network RX | bytes/s | eks_cluster_node_net_rx_bytes | Instance | Network RX bytes of the node | AWS Container Insights Receiver |
Network RX Error Count | count/s | eks_cluster_node_net_rx_error | Instance | Network RX error count of the node | AWS Container Insights Receiver |
Network TX | bytes/s | eks_cluster_node_net_tx_bytes | Instance | Network TX bytes of the node | AWS Container Insights Receiver |
Network TX Error Count | count/s | eks_cluster_node_net_tx_error | Instance | Network TX error count of the node | AWS Container Insights Receiver |
Disk IO Write | bytes/s | eks_cluster_node_disk_io_write | Instance | The IO write bytes of the node | AWS Container Insights Receiver |
Disk IO Read | bytes/s | eks_cluster_node_disk_io_read | Instance | The IO read bytes of the node | AWS Container Insights Receiver |
FS Utilization | percent | eks_cluster_node_fs_utilization | Instance | The filesystem utilization of the node | AWS Container Insights Receiver |
CPU Utilization | percent | eks_cluster_node_pod_cpu_utilization | Instance | The CPU Utilization of the pod running on the node | AWS Container Insights Receiver |
Memory Utilization | percent | eks_cluster_node_pod_memory_utilization | Instance | The Memory Utilization of the pod running on the node | AWS Container Insights Receiver |
Network RX | bytes/s | eks_cluster_node_pod_net_rx_bytes | Instance | Network RX bytes of the pod running on the node | AWS Container Insights Receiver |
Network RX Error Count | count/s | eks_cluster_node_pod_net_rx_error | Instance | Network RX error count of the pod running on the node | AWS Container Insights Receiver |
Network TX | bytes/s | eks_cluster_node_pod_net_tx_bytes | Instance | Network TX bytes of the pod running on the node | AWS Container Insights Receiver |
Network TX Error Count | count/s | eks_cluster_node_pod_net_tx_error | Instance | Network TX error count of the pod running on the node | AWS Container Insights Receiver |
CPU Utilization | percent | eks_cluster_service_pod_cpu_utilization | Endpoint | The CPU Utilization of the pod that belongs to the service | AWS Container Insights Receiver |
Memory Utilization | percent | eks_cluster_service_pod_memory_utilization | Endpoint | The Memory Utilization of the pod that belongs to the service | AWS Container Insights Receiver |
Network RX | bytes/s | eks_cluster_service_pod_net_rx_bytes | Endpoint | Network RX bytes of the pod that belongs to the service | AWS Container Insights Receiver |
Network RX Error Count | count/s | eks_cluster_service_pod_net_rx_error | Endpoint | Network RX error count of the pod that belongs to the service | AWS Container Insights Receiver |
Network TX | bytes/s | eks_cluster_service_pod_net_tx_bytes | Endpoint | Network TX bytes of the pod that belongs to the service | AWS Container Insights Receiver |
Network TX Error Count | count/s | eks_cluster_service_pod_net_tx_error | Endpoint | Network TX error count of the pod that belongs to the service | AWS Container Insights Receiver |

You can customize your own metrics/expression/dashboard panel. The metrics definition and expression rules are found in `/config/otel-rules/aws-eks/`. The AWS Cloud EKS dashboard panel configurations are found in `/config/ui-initialized-templates/aws_eks`.
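
As a rough illustration of what a customized rule might look like, the sketch below follows the general layout of SkyWalking MAL rule files (`filter`, `expSuffix`, `metricPrefix`, `metricsRules`). It is not a copy of the shipped rules: the filter and `expSuffix` expressions, and the use of the `ClusterName` attribute and the `cluster_node_count` source metric, are assumptions made for illustration. Check the files in the rules directory for the expressions actually in use.

```yaml
# Illustrative sketch only, not the shipped rule file.
# Keep only metrics tagged with the job_name that the OTEL Collector processor inserts.
filter: "{ tags -> tags.job_name == 'aws-cloud-eks-monitoring' }"
# Assumption: bind every metric in this file to a Service entity in the AWS_EKS layer,
# named after the ClusterName attribute.
expSuffix: service(['ClusterName'], Layer.AWS_EKS)
# The final metric name is metricPrefix + "_" + name.
metricPrefix: eks_cluster
metricsRules:
  # Would be exposed as eks_cluster_node_count.
  - name: node_count
    # Assumption: cluster_node_count is the source metric emitted by the receiver.
    exp: cluster_node_count
```

The point of the sketch is the naming scheme: a rule named `node_count` under the prefix `eks_cluster` yields the metric name `eks_cluster_node_count` listed in the table above.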

A sample OTEL Collector configuration with the AWS Container Insights Receiver:

    extensions:
      health_check:
    receivers:
      awscontainerinsightreceiver:
    processors:
      resource/job-name:
        attributes:
          - key: job_name
            value: aws-cloud-eks-monitoring
            action: insert
    exporters:
      otlp:
        endpoint: oap-service:11800
        tls:
          insecure: true
      logging:
        loglevel: debug
    service:
      pipelines:
        metrics:
          receivers: [awscontainerinsightreceiver]
          processors: [resource/job-name]
          exporters: [otlp,logging]
      extensions: [health_check]

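
The sample leaves the receiver on its defaults. If you want to set its options explicitly, the fragment below is a minimal sketch based on the options described in the receiver's own documentation. The values shown are believed to be the defaults, but treat them as assumptions and verify them against the receiver version you deploy.

```yaml
receivers:
  awscontainerinsightreceiver:
    # Interval between metric collections (assumed default).
    collection_interval: 60s
    # Target orchestrator; this document covers EKS.
    container_orchestrator: eks
    # Attach the associated Kubernetes service name to pod metrics.
    # Presumably needed for the service (Endpoint) level metrics above.
    add_service_as_attribute: true
    # Keep the pod name derived from its owner rather than the full pod name.
    prefer_full_pod_name: false
```

Only the `receivers` section changes; the processors, exporters, and pipeline wiring stay as in the sample.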
Refer to the AWS Container Insights Receiver documentation for more information.