Posted to notifications@apisix.apache.org by GitBox <gi...@apache.org> on 2022/12/23 17:30:26 UTC

[GitHub] [apisix] MirtoBusico opened a new issue, #8566: help request: how to recover a corrupted etcd after a crash?

MirtoBusico opened a new issue, #8566:
URL: https://github.com/apache/apisix/issues/8566

   ### Description
   
   Hi all,
   I have a K3S cluster with 3 worker nodes (plus 1 master) and APISIX 2.15.1 installed as LoadBalancer using the Helm chart.
   
   Every node is a KVM virtual machine on the same host.
   
   After a host crash, the three etcd pods never come back online.
   
   Looking at the logs of the first etcd pod (apisix-etcd-0), I see:
   ```
   {"level":"warn","ts":"2022-12-23T17:15:46.357Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_USER=etcd"}
   {"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd"]}
   {"level":"warn","ts":"2022-12-23T17:15:46.357Z","caller":"etcdmain/etcd.go:446","msg":"found invalid file under data directory","filename":"member_id","data-dir":"/bitnami/etcd/data"}
   {"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/bitnami/etcd/data","dir-type":"member"}
   {"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"]}
   {"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["http://0.0.0.0:2379"]}
   {"level":"info","ts":"2022-12-23T17:15:46.358Z","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.4","git-sha":"08407ff76","go-version":"go1.16.15","go-os":"linux","go-arch":"amd64","max-cpu-set":6,"max-cpu-available":6,"member-initialized":true,"name":"apisix-etcd-0","data-dir":"/bitnami/etcd/data","wal-dir":"","wal-dir-dedicated":"","member-dir":"/bitnami/etcd/data/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://apisix-etcd-0.apisix-etcd-headless.apisix.svc.cluster.local:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://apisix-etcd-0.apisix-etcd-headless.apisix.svc.cluster.local:2379","http://apisix-etcd.apisix.svc.cluster.local:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-size-bytes":2147483648,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
   {"level":"info","ts":"2022-12-23T17:15:46.358Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/bitnami/etcd/data/member/snap/db","took":"159.119µs"}
   {"level":"info","ts":"2022-12-23T17:15:46.473Z","caller":"etcdserver/server.go:508","msg":"recovered v2 store from snapshot","snapshot-index":200002,"snapshot-size":"26 kB"}
   {"level":"warn","ts":"2022-12-23T17:15:46.474Z","caller":"snap/db.go:88","msg":"failed to find [SNAPSHOT-INDEX].snap.db","snapshot-index":200002,"snapshot-file-path":"/bitnami/etcd/data/member/snap/0000000000030d42.snap.db","error":"snap: snapshot file doesn't exist"}
   {"level":"panic","ts":"2022-12-23T17:15:46.474Z","caller":"etcdserver/server.go:515","msg":"failed to recover v3 backend from snapshot","error":"failed to find database snapshot file (snap: snapshot file doesn't exist)","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.NewServer\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdserver/server.go:515\ngo.etcd.io/etcd/server/v3/embed.StartEtcd\n\t/go/src/go.etcd.io/etcd/release/etcd/server/embed/etcd.go:245\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcd\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/etcd.go:228\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/etcd.go:123\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/main.go:40\nmain.main\n\t/go/src/go.etcd.io/etcd/release/etcd/server/main.go:32\nruntime.main\n\t/go/gos/go1.16.15/src/runtime/proc.go:225"}
   panic: failed to recover v3 backend from snapshot
   goroutine 1 [running]:
   ```
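
The log above can be read mechanically: the last warning and the panic both point at one missing file, and that file's name is the snapshot index (200002) written as 16 zero-padded hexadecimal digits. Below is a minimal Python sketch for checking that relationship and for pulling the panic entry out of a saved pod log. The function names `snap_db_name` and `first_panic` are hypothetical helpers, not part of etcd or APISIX, and the naming rule is inferred from the log lines shown here.

```python
import json


def snap_db_name(snapshot_index: int) -> str:
    """Build the file name seen in the log: index as 16 zero-padded hex digits."""
    return f"{snapshot_index:016x}.snap.db"


def first_panic(log_text: str):
    """Return the first JSON log entry whose level is 'panic', or None."""
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip plain-text lines such as "goroutine 1 [running]:"
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip wrapped or truncated JSON lines
        if entry.get("level") == "panic":
            return entry
    return None


if __name__ == "__main__":
    # Two entries shortened from the log above, plus one non-JSON line.
    sample = "\n".join([
        '{"level":"warn","msg":"failed to find [SNAPSHOT-INDEX].snap.db","snapshot-index":200002}',
        '{"level":"panic","msg":"failed to recover v3 backend from snapshot"}',
        "goroutine 1 [running]:",
    ])
    print(snap_db_name(200002))
    print(first_panic(sample)["msg"])
```

Saving the output of `kubectl logs` for each etcd pod and running it through `first_panic` gives a quick way to see whether all three members fail for the same reason.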
   
   How can I recover etcd?
   Recreating an empty etcd would also be fine.
   
   
   
   
   
   
   
   
   ### Environment
   
   
   -    APISIX version (run apisix version):
   
   root@apisix-64fffcfb4c-55vhw:/usr/local/apisix# apisix version
   /usr/local/openresty/luajit/bin/luajit ./apisix/cli/apisix.lua version
   2.15.1
   root@apisix-64fffcfb4c-55vhw:/usr/local/apisix#
   
   -    Operating system (run uname -a):
   
   root@apisix-64fffcfb4c-55vhw:/usr/local/apisix# uname -a
   Linux apisix-64fffcfb4c-55vhw 5.15.0-53-generic #59-Ubuntu SMP Mon Oct 17 18:53:30 UTC 2022 x86_64 GNU/Linux
   root@apisix-64fffcfb4c-55vhw:/usr/local/apisix# 
   
   -    OpenResty / Nginx version (run openresty -V or nginx -V):
   -    etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info):
   -    APISIX Dashboard version, if relevant: 2.13.0
   -    Plugin runner version, for issues related to plugin runners:
   -    LuaRocks version, for installation issues (run luarocks --version):
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@apisix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [apisix] shreemaan-abhishek commented on issue #8566: help request: apisix unusable because etcd don't start

Posted by "shreemaan-abhishek (via GitHub)" <gi...@apache.org>.
shreemaan-abhishek commented on issue #8566:
URL: https://github.com/apache/apisix/issues/8566#issuecomment-1708145817

   Isn't this a completely unrelated generic problem? I mean, host crashes can happen anytime anywhere.




[GitHub] [apisix] MirtoBusico commented on issue #8566: help request: apisix unusable because etcd don't start

Posted by "MirtoBusico (via GitHub)" <gi...@apache.org>.
MirtoBusico commented on issue #8566:
URL: https://github.com/apache/apisix/issues/8566#issuecomment-1708915392

   Hi @shreemaan-abhishek 
   The crashes are "normal"; but in this case, the corrupted etcd requires a complete wipeout of the etcd volumes.
   
   BTW, moving the virtual machines' vdisks from a hard drive to an SSD solved the problem, and I have had no more etcd corruption.
   
   Moreover, I'm currently using APISIX 3.x, so I don't know whether the problem still occurs.
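
The "complete wipeout of the etcd volumes" mentioned above could look roughly like the following. This is a hedged Python sketch that only builds and prints `kubectl` commands; it does not run them. The namespace `apisix` and the StatefulSet name `apisix-etcd` are taken from the host names in the log, while the PVC naming pattern `data-<statefulset>-<n>` is an assumption about the chart's volume claim template and should be checked with `kubectl get pvc` first. Deleting the claims destroys all etcd data, so this only fits the case where an empty etcd is acceptable.

```python
import shlex


def wipe_commands(namespace="apisix", statefulset="apisix-etcd",
                  replicas=3, pvc_prefix="data"):
    """Return kubectl argv lists: scale down, delete each PVC, scale back up.

    pvc_prefix is an assumed volume claim template name; verify it before use.
    """
    base = ["kubectl", "-n", namespace]
    cmds = [base + ["scale", "statefulset", statefulset, "--replicas=0"]]
    for i in range(replicas):
        pvc = f"{pvc_prefix}-{statefulset}-{i}"
        cmds.append(base + ["delete", "pvc", pvc])
    cmds.append(base + ["scale", "statefulset", statefulset,
                        f"--replicas={replicas}"])
    return cmds


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in wipe_commands():
        print(shlex.join(cmd))
```

Printing rather than executing is deliberate: the commands are destructive, and the resource names should be confirmed against the actual cluster before anything is run.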
   
   




[GitHub] [apisix] shreemaan-abhishek commented on issue #8566: help request: apisix unusable because etcd don't start

Posted by "shreemaan-abhishek (via GitHub)" <gi...@apache.org>.
shreemaan-abhishek commented on issue #8566:
URL: https://github.com/apache/apisix/issues/8566#issuecomment-1709388353

   I don't think apisix can do anything to avoid data corruption. 🤔 

