Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/10/28 01:26:08 UTC

[GitHub] [skywalking] 844700118 opened a new issue #8026: k8s service collection error

844700118 opened a new issue #8026:
URL: https://github.com/apache/skywalking/issues/8026


   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Apache SkyWalking Component
   
   OAP server (apache/skywalking)
   
   ### What happened
   
   **1. On the dashboard, the k8s module's "Cluster" and "Node" sub-modules show data as expected, but the "Service" sub-module shows no data**
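
   For the "Service" sub-module to receive data, the OAP's OTel receiver must have the k8s service rule enabled. A minimal sketch of the relevant `application.yml` section is below; the module and key names match SkyWalking 8.7, but the default rule list varies by version, so treat the list shown here as an assumption to verify against your installation:

        # config/application.yml (sketch; the enabled rule list is an assumption)
        receiver-otel:
          selector: ${SW_OTEL_RECEIVER:default}
          default:
            enabledHandlers: ${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"oc"}
            # "k8s-service" must be present for the Service panels to populate
            enabledOcRules: ${SW_OTEL_RECEIVER_ENABLED_OC_RULES:"k8s-cluster,k8s-node,k8s-service"}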
   
   **2. OAP server error log**
   **[root@k8s-master ~/apache-skywalking-apm-bin-es7]# tail -f logs/skywalking-oap-server.log**
   
       ......
       2021-10-27 19:00:32,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Pod-1] INFO  [] - class io.kubernetes.client.openapi.models.V1Pod#Start listing and watching...
       2021-10-27 19:00:32,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Service-1] INFO  [] - class io.kubernetes.client.openapi.models.V1Service#Start listing and watching...
       2021-10-27 19:00:33,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Pod-1] INFO  [] - class io.kubernetes.client.openapi.models.V1Pod#Start listing and watching...
       2021-10-27 19:00:33,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Service-1] INFO  [] - class io.kubernetes.client.openapi.models.V1Service#Start listing and watching...
       2021-10-27 19:00:34,463 - org.apache.skywalking.oap.meter.analyzer.dsl.Expression - 88 [grpcServerPool-1-thread-17] ERROR [] - failed to run "(100 - ((node_memory_SwapFree_bytes * 100) / node_memory_SwapTotal_bytes)).tag({tags -> tags.node_identifier_host_name = 'vm::' + tags.node_identifier_host_name}).service(['node_identifier_host_name'])"
       java.lang.IllegalArgumentException: null
               at com.google.common.base.Preconditions.checkArgument(Preconditions.java:128) ~[guava-28.1-jre.jar:?]
               at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.build(SampleFamily.java:78) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.newValue(SampleFamily.java:487) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.div(SampleFamily.java:193) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily$div$9.call(Unknown Source) ~[?:?]
               at Script1.run(Script1.groovy:1) ~[?:?]
               at org.apache.skywalking.oap.meter.analyzer.dsl.Expression.run(Expression.java:77) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.meter.analyzer.Analyzer.analyse(Analyzer.java:115) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.meter.analyzer.MetricConvert.toMeter(MetricConvert.java:73) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.meter.analyzer.prometheus.PrometheusMetricConverter.toMeter(PrometheusMetricConverter.java:84) ~[meter-analyzer-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.lambda$onNext$6(OCMetricHandler.java:79) ~[otel-receiver-plugin-8.7.0.jar:8.7.0]
               at java.util.ArrayList.forEach(ArrayList.java:1259) [?:1.8.0_262]
               at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.onNext(OCMetricHandler.java:79) [otel-receiver-plugin-8.7.0.jar:8.7.0]
               at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.onNext(OCMetricHandler.java:61) [otel-receiver-plugin-8.7.0.jar:8.7.0]
               at io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:249) [grpc-stub-1.32.1.jar:1.32.1]
               at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309) [grpc-core-1.32.1.jar:1.32.1]
               at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292) [grpc-core-1.32.1.jar:1.32.1]
               at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782) [grpc-core-1.32.1.jar:1.32.1]
               at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.32.1.jar:1.32.1]
               at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.32.1.jar:1.32.1]
               at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_262]
               at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_262]
               at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
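
   For context, the failing expression is the swap-usage rule from SkyWalking's bundled OTel rules, fed by the `jvm-node-exporter` job rather than the k8s jobs. A sketch of the rule (the file name and exact keys are assumptions; compare with the `vm` rule file shipped in `otel-oc-rules`):

        # Sketch of the bundled vm rule containing the failing expression
        expSuffix: tag({tags -> tags.node_identifier_host_name = 'vm::' + tags.node_identifier_host_name}).service(['node_identifier_host_name'])
        metricPrefix: meter_vm
        metricsRules:
          # swap usage percentage; fails if either sample family arrives empty
          - name: memory_swap_percentage
            exp: 100 - ((node_memory_SwapFree_bytes * 100) / node_memory_SwapTotal_bytes)

   The `IllegalArgumentException` from `SampleFamily.build` is commonly triggered when one operand of the division carries no samples, e.g. on hosts with swap disabled, so it may be unrelated to the missing Service data. Separately, the informer lines repeating every second ("Start listing and watching...") may indicate that the OAP's Kubernetes list-watch keeps failing and restarting, which would leave the OAP without the pod/service metadata the Service panels rely on.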
   
   **3. k8s metrics collection (kube-state-metrics) is normal**
   **[root@master131 ~]#   kubectl logs -f -n kube-system kube-state-metrics-0**
   
       I1027 10:01:11.984341       1 main.go:106] Using default resources
       I1027 10:01:12.128159       1 main.go:118] Using all namespace
       I1027 10:01:12.128166       1 main.go:139] metric allow-denylisting: Excluding the following lists that were on denylist: 
       W1027 10:01:12.128948       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
       I1027 10:01:12.212866       1 main.go:241] Testing communication with server
       I1027 10:01:12.303482       1 main.go:246] Running with Kubernetes cluster version: v1.20. git version: v1.20.2. git tree state: clean. commit: faecb196815e248d3ecfb03c680a4507229c2a56. platform: linux/amd64
       I1027 10:01:12.303518       1 main.go:248] Communication with server successful
       I1027 10:01:12.303837       1 main.go:204] Starting metrics server: [::]:8080
       I1027 10:01:12.303864       1 metrics_handler.go:102] Autosharding enabled with pod=kube-state-metrics-0 pod_namespace=kube-system
       I1027 10:01:12.303886       1 metrics_handler.go:103] Auto detecting sharding settings.
       I1027 10:01:12.303881       1 main.go:193] Starting kube-state-metrics self metrics server: [::]:8081
       I1027 10:01:12.304116       1 main.go:64] levelinfomsgTLS is disabled.http2false
       I1027 10:01:12.304203       1 main.go:64] levelinfomsgTLS is disabled.http2false
       I1027 10:01:12.363206       1 builder.go:190] Active resources: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
   
   **4. OpenTelemetry data collection is normal**
   **[root@master131 ~]# kubectl logs -f otel-collector-7bb5b98564-stvdg**
   
       2021-10-27T11:34:43.650Z        info    service/collector.go:262        Starting otelcol...     {"Version": "v0.29.0", "NumCPU": 28}
       2021-10-27T11:34:43.657Z        info    service/collector.go:322        Using memory ballast    {"MiBs": 683}
       2021-10-27T11:34:43.657Z        info    service/collector.go:170        Setting up own telemetry...
       2021-10-27T11:34:43.659Z        info    service/telemetry.go:99 Serving Prometheus metrics      {"address": ":8888", "level": 0, "service.instance.id": "9903e31e-d72f-4222-a2a8-32c94a0836db"}
       2021-10-27T11:34:43.659Z        info    service/collector.go:205        Loading configuration...
       2021-10-27T11:34:43.662Z        info    service/collector.go:221        Applying configuration...
       2021-10-27T11:34:43.662Z        info    builder/exporters_builder.go:274        Exporter was built.     {"kind": "exporter", "exporter": "opencensus"}
       2021-10-27T11:34:43.662Z        info    builder/exporters_builder.go:274        Exporter was built.     {"kind": "exporter", "exporter": "logging"}
       2021-10-27T11:34:43.662Z        info    builder/pipelines_builder.go:204        Pipeline was built.     {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
       2021-10-27T11:34:43.662Z        info    builder/receivers_builder.go:230        Receiver was built.     {"kind": "receiver", "name": "prometheus", "datatype": "metrics"}
       2021-10-27T11:34:43.662Z        info    service/service.go:137  Starting extensions...
       2021-10-27T11:34:43.662Z        info    builder/extensions_builder.go:53        Extension is starting...        {"kind": "extension", "name": "health_check"}
       2021-10-27T11:34:43.662Z        info    healthcheckextension/healthcheckextension.go:41 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Port":0,"TCPAddr":{"Endpoint":"0.0.0.0:13133"}}}
       2021-10-27T11:34:43.662Z        info    builder/extensions_builder.go:59        Extension started.      {"kind": "extension", "name": "health_check"}
       2021-10-27T11:34:43.662Z        info    builder/extensions_builder.go:53        Extension is starting...        {"kind": "extension", "name": "zpages"}
       2021-10-27T11:34:43.662Z        info    zpagesextension/zpagesextension.go:42   Register Host's zPages  {"kind": "extension", "name": "zpages"}
       2021-10-27T11:34:43.662Z        info    zpagesextension/zpagesextension.go:55   Starting zPages extension       {"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":"localhost:55679"}}}
       2021-10-27T11:34:43.662Z        info    builder/extensions_builder.go:59        Extension started.      {"kind": "extension", "name": "zpages"}
       2021-10-27T11:34:43.662Z        info    service/service.go:182  Starting exporters...
       2021-10-27T11:34:43.662Z        info    builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "opencensus"}
       2021-10-27T11:34:43.662Z        info    builder/exporters_builder.go:97 Exporter started.       {"kind": "exporter", "name": "opencensus"}
       2021-10-27T11:34:43.662Z        info    builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "logging"}
       2021-10-27T11:34:43.662Z        info    builder/exporters_builder.go:97 Exporter started.       {"kind": "exporter", "name": "logging"}
       2021-10-27T11:34:43.662Z        info    service/service.go:187  Starting processors...
       2021-10-27T11:34:43.662Z        info    builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
       2021-10-27T11:34:43.662Z        info    builder/pipelines_builder.go:62 Pipeline is started.    {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
       2021-10-27T11:34:43.662Z        info    service/service.go:192  Starting receivers...
       2021-10-27T11:34:43.662Z        info    builder/receivers_builder.go:70 Receiver is starting... {"kind": "receiver", "name": "prometheus"}
       2021-10-27T11:34:43.663Z        info    kubernetes/kubernetes.go:282    Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"}
       2021-10-27T11:34:43.679Z        info    kubernetes/kubernetes.go:282    Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"}
       2021-10-27T11:34:43.680Z        info    discovery/manager.go:195        Starting provider       {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "static/0", "subs": "[jvm-node-exporter]"}
       2021-10-27T11:34:43.680Z        info    discovery/manager.go:195        Starting provider       {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/1", "subs": "[kubernetes-cadvisor]"}
       2021-10-27T11:34:43.680Z        info    discovery/manager.go:195        Starting provider       {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/2", "subs": "[kube-state-metrics]"}
       2021-10-27T11:34:43.680Z        info    builder/receivers_builder.go:75 Receiver started.       {"kind": "receiver", "name": "prometheus"}
       2021-10-27T11:34:43.680Z        info    discovery/manager.go:213        Discoverer channel closed       {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "static/0"}
       2021-10-27T11:34:43.680Z        info    healthcheck/handler.go:129      Health Check state change       {"kind": "extension", "name": "health_check", "status": "ready"}
       2021-10-27T11:34:43.680Z        info    service/collector.go:182        Everything is ready. Begin running and processing data.
       2021-10-27T11:34:50.493Z        INFO    loggingexporter/logging_exporter.go:56  MetricsExporter {"#metrics": 170}
       2021-10-27T11:34:50.493Z        INFO    loggingexporter/logging_exporter.go:56  MetricsExporter {"#metrics": 170}
       2021-10-27T11:34:50.708Z        INFO    loggingexporter/logging_exporter.go:56  MetricsExporter {"#metrics": 70}
       2021-10-27T11:34:51.930Z        INFO    loggingexporter/logging_exporter.go:56  MetricsExporter {"#metrics": 46}
       2021-10-27T11:34:52.944Z        INFO    loggingexporter/logging_exporter.go:56  MetricsExporter {"#metrics": 70}
   
   **5. I am not sure whether the OpenTelemetry Collector configuration is correct**
   **[root@master131 ~]#  vi  ./otel-collector-config.yaml**
   
       apiVersion: v1
       kind: ConfigMap
       metadata:
         name: otel-collector-conf
         labels:
           app: opentelemetry
           component: otel-collector-conf
         namespace: default
       data:
         otel-collector-config: |
            # 1. Receivers: how metrics are pulled in
           receivers:
             prometheus:
               config:
                 global:
                   scrape_interval: 5s
                   evaluation_interval: 5s
                 scrape_configs:
                    # Scrape the JVM / node exporter on the host
                   - job_name: 'jvm-node-exporter'
                     static_configs:
                       - targets: ['192.168.1.131:9110']
                    # Scrape cAdvisor (k8s node metrics) via the API server proxy
                   - job_name: 'kubernetes-cadvisor'
                     scheme: https
                     tls_config:
                       ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                     bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                     kubernetes_sd_configs:
                     - role: node
                     relabel_configs:
                     - action: labelmap
                       regex: __meta_kubernetes_node_label_(.+)
                     - source_labels: []       # relabel the cluster name 
                       target_label: cluster
                       replacement: k8s-131
                     - target_label: __address__
                       replacement: kubernetes.default.svc:443
                     - source_labels: [__meta_kubernetes_node_name]
                       regex: (.+)
                       target_label: __metrics_path__
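                        # "$$" escapes "$" for the collector's env-var expansion, so Prometheus receives ${1}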
                       replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor
                     - source_labels: [instance]   # relabel the node name 
                       separator: ;
                       regex: (.+)
                       target_label: node
                       replacement: $$1
                       action: replace
                   - job_name: kube-state-metrics
                     kubernetes_sd_configs:
                     - role: endpoints
                     relabel_configs:
                     - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
                       regex: kube-state-metrics
                       replacement: $$1
                       action: keep
                     - action: labelmap
                       regex: __meta_kubernetes_service_label_(.+)
                     - source_labels: []  # relabel the cluster name 
                       target_label: cluster
                       replacement: k8s-131
            # 2. Processors: preprocessing applied before the data is exported
           processors:
             batch:
           #Self-health check
           extensions:
             health_check: {}
             zpages: {}
            # 3. Exporters: where the processed data is sent
           exporters:
             opencensus:
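                # OAP gRPC address; 11800 is the default SkyWalking OAP gRPC port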
               endpoint: "192.168.1.214:11800"
               insecure: true
             logging:
               logLevel: info
           service:
             extensions: [health_check, zpages]
             pipelines:
               metrics:
                 receivers: [prometheus]
                 processors: [batch]
                 exporters: [opencensus,logging]
       
       ---
       
       apiVersion: v1
       kind: Service
       metadata:
         name: otel-collector
         labels:
           app: opentelemetry
           component: otel-collector
         namespace: default
       spec:
         type: NodePort
         ports:
         - name: metrics 
           port: 8888
           targetPort: 8888
           nodePort: 58888
         selector:
           component: otel-collector
       
       ---
       
       apiVersion: apps/v1
       kind: Deployment
       metadata:
         name: otel-collector
         labels:
           app: opentelemetry
           component: otel-collector
         namespace: default
       spec:
         selector:
           matchLabels:
             app: opentelemetry
             component: otel-collector
         minReadySeconds: 5
         progressDeadlineSeconds: 120
         replicas: 1 
         template:
           metadata:
             labels:
               app: opentelemetry
               component: otel-collector
           spec:
             serviceAccountName: prometheus
             containers:
             - command:
                 - "/otelcol"
                 - "--config=/conf/otel-collector-config.yaml"
                 - "--log-level=info"
                 - "--mem-ballast-size-mib=683"
               image: otel/opentelemetry-collector:0.29.0
               name: otel-collector
               resources:
                 limits:
                   cpu: 1
                   memory: 2Gi
                 requests:
                   cpu: 200m
                   memory: 400Mi
               ports:
               - containerPort: 55679 # ZPages endpoint
               - containerPort: 55680 # ZPages endpoint
               - containerPort: 4317  # OpenTelemetry receiver
               - containerPort: 8888  # querying metrics
               volumeMounts:
               - name: otel-collector-config-vol
                 mountPath: /conf
             volumes:
               - configMap:
                   name: otel-collector-conf
                   items:
                     - key: otel-collector-config
                       path: otel-collector-config.yaml
                 name: otel-collector-config-vol
   
   
   ### What you expected to happen
   
   It may be a problem with the OpenTelemetry Collector configuration, but I don't know where the problem is. Asking for help.
   
   ### How to reproduce
   
   The OpenTelemetry Collector configuration file is described above.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [skywalking] wu-sheng commented on issue #8026: k8s service collection error

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8026:
URL: https://github.com/apache/skywalking/issues/8026#issuecomment-953430457


   > Are you willing to submit PR?
   > Yes I am willing to submit a PR!
   
   Are you sure? If so, this issue will be assigned to you, and we will wait for your pull request.





[GitHub] [skywalking] wu-sheng closed issue #8026: k8s service collection error

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #8026:
URL: https://github.com/apache/skywalking/issues/8026


   

