Posted to notifications@apisix.apache.org by GitBox <gi...@apache.org> on 2021/10/12 03:50:48 UTC

[GitHub] [apisix-helm-chart] youngwookim opened a new issue #156: etcd is crashing when upgrading chart to 0.6.0

youngwookim opened a new issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156


   I've been running apisix chart version 0.4.0 and now I am upgrading the chart to 0.6.0.
   
   I got the following error from the etcd pod:
   ```
   etcd 03:39:22.52 
   etcd 03:39:22.52 Welcome to the Bitnami etcd container
   etcd 03:39:22.52 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-etcd
   etcd 03:39:22.52 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-etcd/issues
   etcd 03:39:22.52 
   etcd 03:39:22.52 INFO  ==> ** Starting etcd setup **
   etcd 03:39:22.53 INFO  ==> Validating settings in ETCD_* env vars..
   etcd 03:39:22.53 WARN  ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
   etcd 03:39:22.54 INFO  ==> Initializing etcd
   etcd 03:39:22.54 INFO  ==> Detected data from previous deployments
   etcd 03:39:32.72 INFO  ==> Updating member in existing cluster
   Error: bad member ID arg (strconv.ParseUint: parsing "": invalid syntax), expecting ID in Hex
   
   ```
   
   The etcd pod keeps crashing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@apisix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [apisix-helm-chart] tokers commented on issue #156: etcd is keep crashing when upgrading chart to 0.6.0

Posted by GitBox <gi...@apache.org>.
tokers commented on issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156#issuecomment-941830516


   @youngwookim Strange. Could you try rolling back this upgrade and see whether the etcd cluster recovers from this fault? We may also need to inspect the etcd data to check whether it is intact.





[GitHub] [apisix-helm-chart] youngwookim commented on issue #156: etcd is keep crashing when upgrading chart to 0.6.0

Posted by GitBox <gi...@apache.org>.
youngwookim commented on issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156#issuecomment-941033102


   Thanks for the comment @tokers 
   
   This is the command I ran to upgrade the chart from version 0.4.0 to 0.6.0:
   ```
   $ helm upgrade --install apisix apisix/apisix  --namespace apisix --version 0.6.0 \
       --set apisix.replicaCount=1 \
       --set gateway.type=LoadBalancer \
       --set gateway.loadBalancerIP="......" \
       --set gateway.tls.enabled=true \
       --set dashboard.enabled=true \
       --set ingress-controller.enabled=true \
       --set allow.ipList=""
   ```
   
   After the upgrade, a pod of the StatefulSet 'apisix-etcd' keeps crashing:
   ```
   $ kubectl describe -n apisix pod/apisix-etcd-2
   Name:         apisix-etcd-2
   Namespace:    apisix
   Priority:     0
   Node:         aks-defaultpool-17674265-vmss000002/10.240.0.16
   Start Time:   Tue, 12 Oct 2021 12:31:52 +0900
   Labels:       app.kubernetes.io/instance=apisix
                 app.kubernetes.io/managed-by=Helm
                 app.kubernetes.io/name=etcd
                 controller-revision-hash=apisix-etcd-6579c5cbc8
                 helm.sh/chart=etcd-6.2.6
                 statefulset.kubernetes.io/pod-name=apisix-etcd-2
   Annotations:  cni.projectcalico.org/containerID: 25fa15ee0266d8c17d213f20bc16053477deab49f0e6e9ea82ed4867a14cbf03
                 cni.projectcalico.org/podIP: 10.244.2.133/32
                 cni.projectcalico.org/podIPs: 10.244.2.133/32
   Status:       Running
   IP:           10.244.2.133
   IPs:
     IP:           10.244.2.133
   Controlled By:  StatefulSet/apisix-etcd
   Containers:
     etcd:
       Container ID:   containerd://e5cd7024f5a7158474800be5012ab149e6237086699e4512cc13af3326c1cb12
       Image:          docker.io/bitnami/etcd:3.4.16-debian-10-r14
       Image ID:       docker.io/bitnami/etcd@sha256:ef2d499749c634588f7d281dd70cc1fb2514d57f6d42308c0fb0f2c8ca55bea4
       Ports:          2379/TCP, 2380/TCP
       Host Ports:     0/TCP, 0/TCP
       State:          Waiting
         Reason:       CrashLoopBackOff
       Last State:     Terminated
         Reason:       Error
         Exit Code:    128
         Started:      Tue, 12 Oct 2021 22:41:29 +0900
         Finished:     Tue, 12 Oct 2021 22:41:44 +0900
       Ready:          False
       Restart Count:  118
       Liveness:       exec [/opt/bitnami/scripts/etcd/healthcheck.sh] delay=60s timeout=5s period=30s #success=1 #failure=5
       Readiness:      exec [/opt/bitnami/scripts/etcd/healthcheck.sh] delay=60s timeout=5s period=10s #success=1 #failure=5
       Environment:
         BITNAMI_DEBUG:                     false
         MY_POD_IP:                          (v1:status.podIP)
         MY_POD_NAME:                       apisix-etcd-2 (v1:metadata.name)
         ETCDCTL_API:                       3
         ETCD_ON_K8S:                       yes
         ETCD_START_FROM_SNAPSHOT:          no
         ETCD_DISASTER_RECOVERY:            no
         ETCD_NAME:                         $(MY_POD_NAME)
         ETCD_DATA_DIR:                     /bitnami/etcd/data
         ETCD_LOG_LEVEL:                    info
         ALLOW_NONE_AUTHENTICATION:         yes
         ETCD_ADVERTISE_CLIENT_URLS:        http://$(MY_POD_NAME).apisix-etcd-headless.apisix.svc.cluster.local:2379
         ETCD_LISTEN_CLIENT_URLS:           http://0.0.0.0:2379
         ETCD_INITIAL_ADVERTISE_PEER_URLS:  http://$(MY_POD_NAME).apisix-etcd-headless.apisix.svc.cluster.local:2380
         ETCD_LISTEN_PEER_URLS:             http://0.0.0.0:2380
         ETCD_INITIAL_CLUSTER_TOKEN:        etcd-cluster-k8s
         ETCD_INITIAL_CLUSTER_STATE:        existing
         ETCD_INITIAL_CLUSTER:              apisix-etcd-0=http://apisix-etcd-0.apisix-etcd-headless.apisix.svc.cluster.local:2380,apisix-etcd-1=http://apisix-etcd-1.apisix-etcd-headless.apisix.svc.cluster.local:2380,apisix-etcd-2=http://apisix-etcd-2.apisix-etcd-headless.apisix.svc.cluster.local:2380
         ETCD_CLUSTER_DOMAIN:               apisix-etcd-headless.apisix.svc.cluster.local
       Mounts:
         /bitnami/etcd from data (rw)
         /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rlxf5 (ro)
   Conditions:
     Type              Status
     Initialized       True 
     Ready             False 
     ContainersReady   False 
     PodScheduled      True 
   Volumes:
     data:
       Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
       ClaimName:  data-apisix-etcd-2
       ReadOnly:   false
     kube-api-access-rlxf5:
       Type:                    Projected (a volume that contains injected data from multiple sources)
       TokenExpirationSeconds:  3607
       ConfigMapName:           kube-root-ca.crt
       ConfigMapOptional:       <nil>
       DownwardAPI:             true
   QoS Class:                   BestEffort
   Node-Selectors:              <none>
   Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
   Events:
     Type     Reason   Age                     From     Message
     ----     ------   ----                    ----     -------
     Warning  BackOff  3m45s (x2732 over 10h)  kubelet  Back-off restarting failed container
   
   ```





[GitHub] [apisix-helm-chart] tokers commented on issue #156: etcd is keep crashing when upgrading chart to 0.6.0

Posted by GitBox <gi...@apache.org>.
tokers commented on issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156#issuecomment-940842442


   @youngwookim Could you provide the steps to reproduce? Also, it seems that the etcd data was corrupted: the member ID is invalid.
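
   The `parsing ""` part of the error suggests that the ID argument reaching etcdctl was an empty string, not a malformed hex value. A minimal sketch of that check follows; `is_valid_member_id` is a hypothetical helper, not part of etcd or the Bitnami image, and the sample IDs are made up for illustration.

   ```shell
   # Hypothetical helper: succeeds only when the argument is a non-empty
   # string made up entirely of hexadecimal digits.
   is_valid_member_id() {
       case "$1" in
           ''|*[!0-9a-fA-F]*) return 1 ;;  # empty, or contains a non-hex character
           *) return 0 ;;
       esac
   }

   # Illustrative values only; the empty string mirrors the failing case in the log.
   for id in "8e9e05c52164694d" "" "not-hex"; do
       if is_valid_member_id "$id"; then
           echo "ok: '$id'"
       else
           echo "invalid: '$id'"
       fi
   done
   ```

   An empty ID would be consistent with the log above, though it does not by itself explain how the value was lost.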





[GitHub] [apisix-helm-chart] youngwookim commented on issue #156: etcd is keep crashing when upgrading chart to 0.6.0

Posted by GitBox <gi...@apache.org>.
youngwookim commented on issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156#issuecomment-942305368


   @tokers 
   
   I rolled back to the 0.4.0 chart:
   ```
   $ helm list -n apisix
   NAME  	NAMESPACE	REVISION	UPDATED                             	STATUS  	CHART       	APP VERSION
   apisix	apisix   	2       	2021-10-12 12:26:19.228653 +0900 KST	deployed	apisix-0.6.0	2.10.0     
   [ywkim: ~]$ helm rollback -n apisix apisix
   Rollback was a success! Happy Helming!
   [ywkim: ~]$ helm list -n apisix
   NAME  	NAMESPACE	REVISION	UPDATED                             	STATUS  	CHART       	APP VERSION
   apisix	apisix   	3       	2021-10-13 22:18:23.102311 +0900 KST	deployed	apisix-0.4.0	2.7.0      
   ```
   
   No difference. I got the same message from the pod:
   ```
   etcd 13:23:13.58 
   etcd 13:23:13.58 Welcome to the Bitnami etcd container
   etcd 13:23:13.58 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-etcd
   etcd 13:23:13.58 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-etcd/issues
   etcd 13:23:13.58 
   etcd 13:23:13.58 INFO  ==> ** Starting etcd setup **
   etcd 13:23:13.59 INFO  ==> Validating settings in ETCD_* env vars..
   etcd 13:23:13.60 WARN  ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
   etcd 13:23:13.60 INFO  ==> Initializing etcd
   etcd 13:23:13.60 INFO  ==> Detected data from previous deployments
   etcd 13:23:23.86 INFO  ==> Updating member in existing cluster
   Error: bad member ID arg (strconv.ParseUint: parsing "": invalid syntax), expecting ID in Hex
   ```





[GitHub] [apisix-helm-chart] tokers commented on issue #156: etcd is keep crashing when upgrading chart to 0.6.0

Posted by GitBox <gi...@apache.org>.
tokers commented on issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156#issuecomment-943244587


   OK, I think we can now conclude that this is caused by the etcd data rather than by the Helm chart version.





[GitHub] [apisix-helm-chart] tokers commented on issue #156: etcd is keep crashing when upgrading chart to 0.6.0

Posted by GitBox <gi...@apache.org>.
tokers commented on issue #156:
URL: https://github.com/apache/apisix-helm-chart/issues/156#issuecomment-943245314


   How many instances are in your etcd cluster? Maybe we can back up the data from a healthy node and use it on the bad one.
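
   A rough sketch of the backup step, written as a dry run that only prints the commands. The namespace and pod names come from the output earlier in this thread; that `apisix-etcd-0` is the healthy member, the snapshot path, and `backup_plan` itself are assumptions for illustration. `kubectl cp` also assumes `tar` is available in the container.

   ```shell
   # Dry run: print the commands instead of executing them.
   NAMESPACE="apisix"
   HEALTHY_POD="apisix-etcd-0"        # assumed healthy; verify before running anything
   SNAPSHOT="/tmp/etcd-snapshot.db"   # assumed path inside the container

   backup_plan() {
       # List the members the cluster currently knows about.
       echo "kubectl exec -n $NAMESPACE $HEALTHY_POD -- etcdctl member list -w table"
       # Save a snapshot of the keyspace from the healthy member.
       echo "kubectl exec -n $NAMESPACE $HEALTHY_POD -- etcdctl snapshot save $SNAPSHOT"
       # Copy the snapshot out of the pod for safekeeping.
       echo "kubectl cp $NAMESPACE/$HEALTHY_POD:$SNAPSHOT ./etcd-snapshot.db"
   }

   backup_plan
   ```

   Printing the plan first makes it easy to review the pod name and paths before touching the cluster. How the snapshot is then applied to the bad member depends on the state of its data directory, which this sketch does not cover.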

