
Posted to notifications@apisix.apache.org by GitBox <gi...@apache.org> on 2021/11/24 09:39:49 UTC

[GitHub] [apisix-ingress-controller] han6565 opened a new issue #769: request help: after the controller starts, if the single APISIX pod is deleted, the controller cannot reconnect even after the pod recovers

han6565 opened a new issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769


   ### Issue description
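   I deployed one APISIX instance and one apisix-ingress-controller on Kubernetes. Once both were configured, the controller's health check reported no problems.
   When the APISIX pod is killed, the controller still cannot reconnect to APISIX, even after the pod has been brought back up.
   However, when I afterwards accessed the Admin API manually at the same pod address, it worked without any problem.
   Is this a problem with my configuration, or is the controller expected to never reconnect once no pod has been reachable?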
   `
   2021-11-23T20:51:39+08:00	warn	apisix/cluster.go:452	failed to check health for cluster default: dial tcp 172.24.150.14:9180: connect: connection refused, will retry
   2021-11-23T20:51:39+08:00	warn	ingress/controller.go:660	failed to check health for default cluster: timed out waiting for the condition, give up leader
   2021-11-23T20:51:39+08:00	info	ingress/endpoint.go:83	endpoints controller exited
   2021-11-23T20:51:39+08:00	info	ingress/apisix_tls.go:71	ApisixTls controller exited
   2021-11-23T20:51:39+08:00	error	ingress/ingress.go:63	cache sync failed
   2021-11-23T20:51:39+08:00	info	ingress/ingress.go:64	ingress controller exited
   2021-11-23T20:51:39+08:00	info	ingress/apisix_upstream.go:71	ApisixUpstream controller exited
   2021-11-23T20:51:39+08:00	info	ingress/secret.go:76	secret controller exited
   2021-11-23T20:51:39+08:00	info	ingress/service.go:61	svc controller exited
   2021-11-23T20:51:39+08:00	info	ingress/namespace.go:82	namespace controller exited
   2021-11-23T20:51:39+08:00	info	ingress/pod.go:56	pod controller exited
   2021-11-23T20:51:39+08:00	info	ingress/apisix_route.go:71	ApisixRoute controller exited
   2021-11-23T20:51:39+08:00	info	ingress/apisix_consumer.go:67	ApisixConsumer controller exited
   2021-11-23T20:51:39+08:00	info	ingress/apisix_cluster_config.go:69	ApisixClusterConfig controller exited
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:354	controller now is running as a candidate	{"namespace": "apisix", "pod": "apisix-ingress-controller-7994d7bb49-z5hms"}
   I1123 20:51:39.105127       1 leaderelection.go:243] attempting to acquire leader lease apisix/ingress-apisix-leader...
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:307	LeaderElection	{"message": "apisix-ingress-controller-7994d7bb49-z5hms became leader", "event_type": "Normal"}
   I1123 20:51:39.111962       1 leaderelection.go:253] successfully acquired lease apisix/ingress-apisix-leader
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:387	controller tries to leading ...	{"namespace": "apisix", "pod": "apisix-ingress-controller-7994d7bb49-z5hms"}
   2021-11-23T20:51:39+08:00	error	ingress/controller.go:414	failed to wait the default cluster to be ready: dial tcp 172.24.150.14:9180: connect: connection refused
   E1123 20:51:39.112199       1 leaderelection.go:325] error retrieving resource lock apisix/ingress-apisix-leader: Get https://172.24.144.1:443/apis/coordination.k8s.io/v1/namespaces/apisix/leases/ingress-apisix-leader: context canceled
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:307	LeaderElection	{"message": "apisix-ingress-controller-7994d7bb49-z5hms stopped leading", "event_type": "Normal"}
   I1123 20:51:39.112223       1 leaderelection.go:278] failed to renew lease apisix/ingress-apisix-leader: timed out waiting for the condition
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:354	controller now is running as a candidate	{"namespace": "apisix", "pod": "apisix-ingress-controller-7994d7bb49-z5hms"}
   I1123 20:51:39.112241       1 leaderelection.go:243] attempting to acquire leader lease apisix/ingress-apisix-leader...
   2021-11-23T20:51:39+08:00	info	apisix/cluster.go:156	syncing cache	{"cluster": "default"}
   2021-11-23T20:51:39+08:00	info	apisix/cluster.go:347	syncing schema	{"cluster": "default"}
   2021-11-23T20:51:39+08:00	error	apisix/plugin.go:46	failed to list plugins' names: Get http://172.24.150.14:9180/apisix/admin/plugins/list: context canceled
   2021-11-23T20:51:39+08:00	error	apisix/cluster.go:367	failed to list plugin names in APISIX: Get http://172.24.150.14:9180/apisix/admin/plugins/list: context canceled
   2021-11-23T20:51:39+08:00	warn	apisix/cluster.go:330	failed to sync schema: Get http://172.24.150.14:9180/apisix/admin/plugins/list: context canceled
   2021-11-23T20:51:39+08:00	error	apisix/route.go:119	failed to list routes: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:39+08:00	error	apisix/cluster.go:200	failed to list route in APISIX: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:307	LeaderElection	{"message": "apisix-ingress-controller-7994d7bb49-z5hms became leader", "event_type": "Normal"}
   I1123 20:51:39.118355       1 leaderelection.go:253] successfully acquired lease apisix/ingress-apisix-leader
   2021-11-23T20:51:39+08:00	info	ingress/controller.go:387	controller tries to leading ...	{"namespace": "apisix", "pod": "apisix-ingress-controller-7994d7bb49-z5hms"}
   2021-11-23T20:51:39+08:00	warn	apisix/cluster.go:307	waiting cluster default to ready, it may takes a while
   2021-11-23T20:51:41+08:00	error	apisix/route.go:119	failed to list routes: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:41+08:00	error	apisix/cluster.go:200	failed to list route in APISIX: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   [GIN] 2021/11/23 - 20:51:42 | 200 |      42.863µs |    172.24.248.6 | GET      "/healthz"
   2021-11-23T20:51:43+08:00	error	apisix/route.go:119	failed to list routes: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:43+08:00	error	apisix/cluster.go:200	failed to list route in APISIX: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:45+08:00	error	apisix/route.go:119	failed to list routes: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:45+08:00	error	apisix/cluster.go:200	failed to list route in APISIX: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   [GIN] 2021/11/23 - 20:51:46 | 200 |      41.607µs |    172.24.248.6 | GET      "/healthz"
   2021-11-23T20:51:47+08:00	error	apisix/route.go:119	failed to list routes: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:47+08:00	error	apisix/cluster.go:200	failed to list route in APISIX: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:47+08:00	error	apisix/cluster.go:166	failed to sync cache	{"cost_time": "8.001080415s", "cluster": "default"}
   2021-11-23T20:51:47+08:00	error	ingress/controller.go:414	failed to wait the default cluster to be ready: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:47+08:00	info	ingress/controller.go:354	controller now is running as a candidate	{"namespace": "apisix", "pod": "apisix-ingress-controller-7994d7bb49-z5hms"}
   I1123 20:51:47.113474       1 leaderelection.go:243] attempting to acquire leader lease apisix/ingress-apisix-leader...
   2021-11-23T20:51:47+08:00	info	apisix/cluster.go:347	syncing schema	{"cluster": "default"}
   2021-11-23T20:51:47+08:00	error	apisix/plugin.go:46	failed to list plugins' names: Get http://172.24.150.14:9180/apisix/admin/plugins/list: context canceled
   2021-11-23T20:51:47+08:00	error	apisix/cluster.go:367	failed to list plugin names in APISIX: Get http://172.24.150.14:9180/apisix/admin/plugins/list: context canceled
   2021-11-23T20:51:47+08:00	info	apisix/cluster.go:156	syncing cache	{"cluster": "default"}
   2021-11-23T20:51:47+08:00	error	apisix/route.go:119	failed to list routes: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:47+08:00	error	apisix/cluster.go:200	failed to list route in APISIX: Get http://172.24.150.14:9180/apisix/admin/routes: context canceled
   2021-11-23T20:51:47+08:00	warn	apisix/cluster.go:330	failed to sync schema: Get http://172.24.150.14:9180/apisix/admin/plugins/list: context canceled
   2021-11-23T20:51:47+08:00	info	ingress/controller.go:307	LeaderElection	{"message": "apisix-ingress-controller-7994d7bb49-z5hms became leader", "event_type": "Normal"}
   I1123 20:51:47.119539       1 leaderelection.go:253] successfully acquired lease apisix/ingress-apisix-leader
   2021-11-23T20:51:47+08:00	info	ingress/controller.go:387	controller tries to leading ...	{"namespace": "apisix", "pod": "apisix-ingress-controller-7994d7bb49-z5hms"}
   2021-11-23T20:51:47+08:00	warn	apisix/cluster.go:307	waiting cluster default to ready, it may takes a while
   `
   
   
   ### Environment
   
   * your apisix-ingress-controller version (output of `apisix-ingress-controller version --long`);
   * 1.3.0
   * your Kubernetes cluster version (output of `kubectl version`);
   * Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.14", GitCommit:"89182bdd065fbcaffefec691908a739d161efc03", GitTreeState:"clean", BuildDate:"2020-12-18T12:02:35Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
   * if you run apisix-ingress-controller in Bare-metal environment, also show your OS version (`uname -a`).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@apisix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [apisix-ingress-controller] tao12345666333 closed issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

tao12345666333 closed issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769


   



[GitHub] [apisix-ingress-controller] stone2world commented on issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

stone2world commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-999454690


   ![image](https://user-images.githubusercontent.com/43597843/147076070-06deddf9-5316-495b-8fb0-706a2b062b7c.png)
   ![image](https://user-images.githubusercontent.com/43597843/147076523-cf169826-9246-475f-b46d-d7abc56f12d4.png)
   ![image](https://user-images.githubusercontent.com/43597843/147076609-16ca0702-d559-430c-86b4-507679053d2b.png)
   I get it.



[GitHub] [apisix-ingress-controller] tokers commented on issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

tokers commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-978728487


   I can give you some clues about this issue.
   
   When the controller gets the opportunity to become the new leader, it tries to add the cluster (named `default`) again. Because that cluster already exists (it was added during the controller's previous term), the new one is not added and the controller keeps using the old cluster. In the old cluster, `cacheSyncErr` was cached and is returned directly whenever `HasSynced` is called, so the controller never enters the state in which it watches Kubernetes resources.
   
   A simple solution is to destroy the old cluster when the controller gives up the leader role.



[GitHub] [apisix-ingress-controller] tao12345666333 commented on issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

tao12345666333 commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-977826777


   cc @gxthrj  PTAL



[GitHub] [apisix-ingress-controller] tao12345666333 commented on issue #769: request help: after the controller starts, if the single APISIX pod is deleted, the controller cannot reconnect even after the pod recovers

Posted by GitBox <gi...@apache.org>.

tao12345666333 commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-977825321


   Thanks!



[GitHub] [apisix-ingress-controller] tao12345666333 commented on issue #769: request help: after the controller starts, if the single APISIX pod is deleted, the controller cannot reconnect even after the pod recovers

Posted by GitBox <gi...@apache.org>.

tao12345666333 commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-977749612


   I will try to reproduce.



[GitHub] [apisix-ingress-controller] tao12345666333 commented on issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

tao12345666333 commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-1000762449


   #774 has been merged. It will be released in v1.4 (next week).
   
   If you want to try it now, you can build a Docker image yourself and use that.
   
   I will close this one.



[GitHub] [apisix-ingress-controller] Zhang21 commented on issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

Zhang21 commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-989632996


   I get the same error when the APISIX ingress controller accesses the APISIX Admin API.
   
   ```
   2021-12-09T16:37:00+08:00	info	ingress/controller.go:290	LeaderElection	{"message": "apisix-ingress-controller-769ddc5457-68d2z became leader", "event_type": "Normal"}
   I1209 16:37:00.635044       1 leaderelection.go:253] successfully acquired lease ingress-apisix/ingress-apisix-leader
   2021-12-09T16:37:00+08:00	info	ingress/controller.go:370	controller tries to leading ...	{"namespace": "ingress-apisix", "pod": "apisix-ingress-controller-769ddc5457-68d2z"}
   2021-12-09T16:37:00+08:00	warn	apisix/cluster.go:304	waiting cluster default to ready, it may takes a while
   2021-12-09T16:37:02+08:00	error	apisix/route.go:117	failed to list routes: Get http://apisix-admin.ingress-apisix.svc.cluster.local:9180/apisix/admin/routes: context canceled
   2021-12-09T16:37:02+08:00	error	apisix/cluster.go:197	failed to list route in APISIX: Get http://apisix-admin.ingress-apisix.svc.cluster.local:9180/apisix/admin/routes: context canceled
   [GIN] 2021/12/09 - 16:37:03 | 200 |     162.264µs |    172.16.2.248 | GET      "/healthz"
   ```
   
   <br/>
   
   I can access the APISIX Admin API with curl:
   
   ```bash
   curl http://apisix-admin.ingress-apisix.svc.cluster.local:9180/apisix/admin/routes -H 'X-API-KEY: xxxxx'
   { result xxxxxx }
   ```



[GitHub] [apisix-ingress-controller] chzhuo commented on issue #769: bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix

Posted by GitBox <gi...@apache.org>.

chzhuo commented on issue #769:
URL: https://github.com/apache/apisix-ingress-controller/issues/769#issuecomment-982342929


   I encountered the same situation, and I also found that the leader would not switch,
   because the failed leader gives up and then quickly starts acquiring the lock again.
   The leader has not switched over for the last two days.

