Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/10/28 01:26:08 UTC
[GitHub] [skywalking] 844700118 opened a new issue #8026: k8s service collection error
844700118 opened a new issue #8026:
URL: https://github.com/apache/skywalking/issues/8026
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
### Apache SkyWalking Component
OAP server (apache/skywalking)
### What happened
**1. On the dashboard's k8s module, the "Cluster" and "Node" sub-modules display data normally, but the "Service" sub-module shows no data**
**2. OAP server error log**
**[root@k8s-master ~/apache-skywalking-apm-bin-es7]#tail -f logs/skywalking-oap-server.log**
......
2021-10-27 19:00:32,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Pod-1] INFO [] - class io.kubernetes.client.openapi.models.V1Pod#Start listing and watching...
2021-10-27 19:00:32,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Service-1] INFO [] - class io.kubernetes.client.openapi.models.V1Service#Start listing and watching...
2021-10-27 19:00:33,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Pod-1] INFO [] - class io.kubernetes.client.openapi.models.V1Pod#Start listing and watching...
2021-10-27 19:00:33,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Service-1] INFO [] - class io.kubernetes.client.openapi.models.V1Service#Start listing and watching...
2021-10-27 19:00:34,463 - org.apache.skywalking.oap.meter.analyzer.dsl.Expression - 88 [grpcServerPool-1-thread-17] ERROR [] - failed to run "(100 - ((node_memory_SwapFree_bytes * 100) / node_memory_SwapTotal_bytes)).tag({tags -> tags.node_identifier_host_name = 'vm::' + tags.node_identifier_host_name}).service(['node_identifier_host_name'])"
java.lang.IllegalArgumentException: null
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:128) ~[guava-28.1-jre.jar:?]
at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.build(SampleFamily.java:78) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.newValue(SampleFamily.java:487) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.div(SampleFamily.java:193) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily$div$9.call(Unknown Source) ~[?:?]
at Script1.run(Script1.groovy:1) ~[?:?]
at org.apache.skywalking.oap.meter.analyzer.dsl.Expression.run(Expression.java:77) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.meter.analyzer.Analyzer.analyse(Analyzer.java:115) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.meter.analyzer.MetricConvert.toMeter(MetricConvert.java:73) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.meter.analyzer.prometheus.PrometheusMetricConverter.toMeter(PrometheusMetricConverter.java:84) ~[meter-analyzer-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.lambda$onNext$6(OCMetricHandler.java:79) ~[otel-receiver-plugin-8.7.0.jar:8.7.0]
at java.util.ArrayList.forEach(ArrayList.java:1259) [?:1.8.0_262]
at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.onNext(OCMetricHandler.java:79) [otel-receiver-plugin-8.7.0.jar:8.7.0]
at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.onNext(OCMetricHandler.java:61) [otel-receiver-plugin-8.7.0.jar:8.7.0]
at io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:249) [grpc-stub-1.32.1.jar:1.32.1]
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309) [grpc-core-1.32.1.jar:1.32.1]
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292) [grpc-core-1.32.1.jar:1.32.1]
at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782) [grpc-core-1.32.1.jar:1.32.1]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.32.1.jar:1.32.1]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.32.1.jar:1.32.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_262]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_262]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
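The failing MAL expression divides `node_memory_SwapFree_bytes` by `node_memory_SwapTotal_bytes`. One plausible cause of the `IllegalArgumentException` at `SampleFamily.build` is that some scraped targets report no usable swap samples, so the division produces an empty sample set, which a Guava-style precondition then rejects. The following is a hypothetical Python model of that guard, not SkyWalking's actual code:

```python
def build_sample_family(samples):
    # Mirrors a Preconditions.checkArgument-style guard: an empty
    # sample set is rejected before a SampleFamily is built.
    if not samples:
        raise ValueError("sample family must not be empty")
    return samples

def div(left, right):
    # Divide matching samples by key (here, by node). A node that
    # exposes no swap metrics simply contributes no matching pair.
    joined = {k: v / right[k] for k, v in left.items()
              if k in right and right[k] != 0}
    return build_sample_family(joined)

# A node that exports both swap metrics divides cleanly:
div({"node-1": 1024.0}, {"node-1": 2048.0})

# If no node yields a valid pair, the result is empty and the guard
# rejects it -- the analogue of the IllegalArgumentException above:
try:
    div({"node-1": 1024.0}, {})
except ValueError as e:
    print("rejected:", e)
```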
**3. kube-state-metrics is collecting Kubernetes metrics normally**
**[root@master131 ~]# kubectl logs -f -n kube-system kube-state-metrics-0**
I1027 10:01:11.984341 1 main.go:106] Using default resources
I1027 10:01:12.128159 1 main.go:118] Using all namespace
I1027 10:01:12.128166 1 main.go:139] metric allow-denylisting: Excluding the following lists that were on denylist:
W1027 10:01:12.128948 1 client_config.go:615] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1027 10:01:12.212866 1 main.go:241] Testing communication with server
I1027 10:01:12.303482 1 main.go:246] Running with Kubernetes cluster version: v1.20. git version: v1.20.2. git tree state: clean. commit: faecb196815e248d3ecfb03c680a4507229c2a56. platform: linux/amd64
I1027 10:01:12.303518 1 main.go:248] Communication with server successful
I1027 10:01:12.303837 1 main.go:204] Starting metrics server: [::]:8080
I1027 10:01:12.303864 1 metrics_handler.go:102] Autosharding enabled with pod=kube-state-metrics-0 pod_namespace=kube-system
I1027 10:01:12.303886 1 metrics_handler.go:103] Auto detecting sharding settings.
I1027 10:01:12.303881 1 main.go:193] Starting kube-state-metrics self metrics server: [::]:8081
I1027 10:01:12.304116 1 main.go:64] levelinfomsgTLS is disabled.http2false
I1027 10:01:12.304203 1 main.go:64] levelinfomsgTLS is disabled.http2false
I1027 10:01:12.363206 1 builder.go:190] Active resources: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
**4. The OpenTelemetry Collector is running and exporting data normally**
**[root@master131 ~]# kubectl logs -f otel-collector-7bb5b98564-stvdg**
2021-10-27T11:34:43.650Z info service/collector.go:262 Starting otelcol... {"Version": "v0.29.0", "NumCPU": 28}
2021-10-27T11:34:43.657Z info service/collector.go:322 Using memory ballast {"MiBs": 683}
2021-10-27T11:34:43.657Z info service/collector.go:170 Setting up own telemetry...
2021-10-27T11:34:43.659Z info service/telemetry.go:99 Serving Prometheus metrics {"address": ":8888", "level": 0, "service.instance.id": "9903e31e-d72f-4222-a2a8-32c94a0836db"}
2021-10-27T11:34:43.659Z info service/collector.go:205 Loading configuration...
2021-10-27T11:34:43.662Z info service/collector.go:221 Applying configuration...
2021-10-27T11:34:43.662Z info builder/exporters_builder.go:274 Exporter was built. {"kind": "exporter", "exporter": "opencensus"}
2021-10-27T11:34:43.662Z info builder/exporters_builder.go:274 Exporter was built. {"kind": "exporter", "exporter": "logging"}
2021-10-27T11:34:43.662Z info builder/pipelines_builder.go:204 Pipeline was built. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2021-10-27T11:34:43.662Z info builder/receivers_builder.go:230 Receiver was built. {"kind": "receiver", "name": "prometheus", "datatype": "metrics"}
2021-10-27T11:34:43.662Z info service/service.go:137 Starting extensions...
2021-10-27T11:34:43.662Z info builder/extensions_builder.go:53 Extension is starting... {"kind": "extension", "name": "health_check"}
2021-10-27T11:34:43.662Z info healthcheckextension/healthcheckextension.go:41 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Port":0,"TCPAddr":{"Endpoint":"0.0.0.0:13133"}}}
2021-10-27T11:34:43.662Z info builder/extensions_builder.go:59 Extension started. {"kind": "extension", "name": "health_check"}
2021-10-27T11:34:43.662Z info builder/extensions_builder.go:53 Extension is starting... {"kind": "extension", "name": "zpages"}
2021-10-27T11:34:43.662Z info zpagesextension/zpagesextension.go:42 Register Host's zPages {"kind": "extension", "name": "zpages"}
2021-10-27T11:34:43.662Z info zpagesextension/zpagesextension.go:55 Starting zPages extension {"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":"localhost:55679"}}}
2021-10-27T11:34:43.662Z info builder/extensions_builder.go:59 Extension started. {"kind": "extension", "name": "zpages"}
2021-10-27T11:34:43.662Z info service/service.go:182 Starting exporters...
2021-10-27T11:34:43.662Z info builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "opencensus"}
2021-10-27T11:34:43.662Z info builder/exporters_builder.go:97 Exporter started. {"kind": "exporter", "name": "opencensus"}
2021-10-27T11:34:43.662Z info builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "logging"}
2021-10-27T11:34:43.662Z info builder/exporters_builder.go:97 Exporter started. {"kind": "exporter", "name": "logging"}
2021-10-27T11:34:43.662Z info service/service.go:187 Starting processors...
2021-10-27T11:34:43.662Z info builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2021-10-27T11:34:43.662Z info builder/pipelines_builder.go:62 Pipeline is started. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2021-10-27T11:34:43.662Z info service/service.go:192 Starting receivers...
2021-10-27T11:34:43.662Z info builder/receivers_builder.go:70 Receiver is starting... {"kind": "receiver", "name": "prometheus"}
2021-10-27T11:34:43.663Z info kubernetes/kubernetes.go:282 Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"}
2021-10-27T11:34:43.679Z info kubernetes/kubernetes.go:282 Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"}
2021-10-27T11:34:43.680Z info discovery/manager.go:195 Starting provider {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "static/0", "subs": "[jvm-node-exporter]"}
2021-10-27T11:34:43.680Z info discovery/manager.go:195 Starting provider {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/1", "subs": "[kubernetes-cadvisor]"}
2021-10-27T11:34:43.680Z info discovery/manager.go:195 Starting provider {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/2", "subs": "[kube-state-metrics]"}
2021-10-27T11:34:43.680Z info builder/receivers_builder.go:75 Receiver started. {"kind": "receiver", "name": "prometheus"}
2021-10-27T11:34:43.680Z info discovery/manager.go:213 Discoverer channel closed {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "static/0"}
2021-10-27T11:34:43.680Z info healthcheck/handler.go:129 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2021-10-27T11:34:43.680Z info service/collector.go:182 Everything is ready. Begin running and processing data.
2021-10-27T11:34:50.493Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 170}
2021-10-27T11:34:50.493Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 170}
2021-10-27T11:34:50.708Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 70}
2021-10-27T11:34:51.930Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 46}
2021-10-27T11:34:52.944Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 70}
**5. I am not sure whether the OpenTelemetry Collector configuration is correct**
**[root@master131 ~]# vi ./otel-collector-config.yaml**
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  labels:
    app: opentelemetry
    component: otel-collector-conf
  namespace: default
data:
  otel-collector-config: |
    # 1. Receivers: how data gets into the collector
    receivers:
      prometheus:
        config:
          global:
            scrape_interval: 5s
            evaluation_interval: 5s
          scrape_configs:
            # Collect JVM / node metrics
            - job_name: 'jvm-node-exporter'
              static_configs:
                - targets: ['192.168.1.131:9110']
            # Collect k8s node (cAdvisor) metrics
            - job_name: 'kubernetes-cadvisor'
              scheme: https
              tls_config:
                ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              kubernetes_sd_configs:
                - role: node
              relabel_configs:
                - action: labelmap
                  regex: __meta_kubernetes_node_label_(.+)
                - source_labels: []  # relabel the cluster name
                  target_label: cluster
                  replacement: k8s-131
                - target_label: __address__
                  replacement: kubernetes.default.svc:443
                - source_labels: [__meta_kubernetes_node_name]
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor
                - source_labels: [instance]  # relabel the node name
                  separator: ;
                  regex: (.+)
                  target_label: node
                  replacement: $$1
                  action: replace
            - job_name: kube-state-metrics
              kubernetes_sd_configs:
                - role: endpoints
              relabel_configs:
                - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
                  regex: kube-state-metrics
                  replacement: $$1
                  action: keep
                - action: labelmap
                  regex: __meta_kubernetes_service_label_(.+)
                - source_labels: []  # relabel the cluster name
                  target_label: cluster
                  replacement: k8s-131
    # 2. Processors: preprocessing applied before the data is exported
    processors:
      batch:
    # Self health check
    extensions:
      health_check: {}
      zpages: {}
    # 3. Exporters: how data leaves the collector
    exporters:
      opencensus:
        endpoint: "192.168.1.214:11800"
        insecure: true
      logging:
        logLevel: info
    service:
      extensions: [health_check, zpages]
      pipelines:
        metrics:
          receivers: [prometheus]
          processors: [batch]
          exporters: [opencensus, logging]
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  labels:
    app: opentelemetry
    component: otel-collector
  namespace: default
spec:
  type: NodePort
  ports:
    - name: metrics
      port: 8888
      targetPort: 8888
      nodePort: 58888
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  labels:
    app: opentelemetry
    component: otel-collector
  namespace: default
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  minReadySeconds: 5
  progressDeadlineSeconds: 120
  replicas: 1
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      serviceAccountName: prometheus
      containers:
        - command:
            - "/otelcol"
            - "--config=/conf/otel-collector-config.yaml"
            - "--log-level=info"
            - "--mem-ballast-size-mib=683"
          image: otel/opentelemetry-collector:0.29.0
          name: otel-collector
          resources:
            limits:
              cpu: 1
              memory: 2Gi
            requests:
              cpu: 200m
              memory: 400Mi
          ports:
            - containerPort: 55679  # zPages endpoint
            - containerPort: 55680  # legacy OTLP/gRPC receiver
            - containerPort: 4317   # OpenTelemetry (OTLP) receiver
            - containerPort: 8888   # collector's own metrics
          volumeMounts:
            - name: otel-collector-config-vol
              mountPath: /conf
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
### What you expected to happen
It may be a problem with the OpenTelemetry Collector configuration, but I don't know where the problem is; any help would be appreciated.
### How to reproduce
The OpenTelemetry Collector configuration file is described above.
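As a first diagnostic, it may help to confirm that every scraped target actually exposes the swap metrics the failing MAL rule divides by. A small Python sketch that checks a node-exporter payload (for example, the output of `curl -s http://192.168.1.131:9110/metrics`, using the target address from the scrape config above) for the required metric names:

```python
def missing_swap_metrics(metrics_text):
    """Return the swap metric names absent from a node-exporter scrape."""
    required = {"node_memory_SwapFree_bytes", "node_memory_SwapTotal_bytes"}
    # Metric lines start with the metric name; '#' lines are HELP/TYPE.
    present = {line.split()[0] for line in metrics_text.splitlines()
               if line and not line.startswith("#")}
    return sorted(required - present)

# Example payload with one swap metric missing:
sample = """\
# HELP node_memory_MemTotal_bytes Memory information field MemTotal_bytes.
node_memory_MemTotal_bytes 3.3505267712e+10
node_memory_SwapFree_bytes 0
"""
print(missing_swap_metrics(sample))  # ['node_memory_SwapTotal_bytes']
```

If any target comes back with missing swap metrics, the MAL expression in the error log has nothing to divide for that node.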
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [skywalking] wu-sheng commented on issue #8026: k8s service collection error
wu-sheng commented on issue #8026:
URL: https://github.com/apache/skywalking/issues/8026#issuecomment-953430457
> Are you willing to submit PR?
> Yes I am willing to submit a PR!

Are you sure? If so, this issue will be assigned to you, and we will wait for your pull request.
[GitHub] [skywalking] wu-sheng closed issue #8026: k8s service collection error
wu-sheng closed issue #8026:
URL: https://github.com/apache/skywalking/issues/8026