Posted to dev@openwhisk.apache.org by OpenWhisk Team Slack <ra...@apache.org> on 2010/01/01 00:01:01 UTC

[slack-digest] [2022-05-03] #general

2022-05-03 18:30:41 UTC - Brendan Doyle: Has anyone ever faced an issue where some activations can't init after a VM reboot? Here's the sequence of events:

1. VM reboot; action containers are wiped beforehand when the invoker is stopped
2. Invoker is started
3. Healthcheck activations start
4. It seems like the first set of healthcheck activations hits the issue and then the second set succeeds. The init call to the function container times out after one minute with a connection refused response.
5. Real traffic starts being sent, but some activations still fail from this issue for about 2-3 minutes.
6. It resolves on its own after a couple of minutes. It can only be reproduced with a VM reboot, and it always happens on the first run of the invoker after a VM reboot, no matter how long you wait. Restarting the invoker after hitting the issue guarantees it won't happen again. (A simple daemon reboot does not reproduce it either; it has to be a VM restart.)
I don't suspect it's an OpenWhisk bug; it seems like it's something to do with Docker, so I don't expect much help here, but I'm curious if anyone's seen this before. We pull all of our runtime containers to the machine when the invoker is started, prior to accepting traffic, so I know that is not the issue. We're also on the latest version of Docker Engine, which I know isn't technically supported. Still, I'm curious if anyone knows of something related to Docker that gets wiped on reboot that I should look into.
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1651602641850569?thread_ts=1651602641.850569&cid=C3TPCAQG1
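
One way to narrow down the "connection refused" window described above is to poll the action container's port directly and record when connects start succeeding. The following is only a diagnostic sketch, not part of OpenWhisk or Docker: the function name `wait_for_port` is hypothetical, and the container address in the usage note is a placeholder to be replaced with a real container IP on the invoker host.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Poll host:port until a TCP connect succeeds or `timeout` elapses.

    Returns a list of (elapsed_seconds, error_name_or_None) tuples, one per
    attempt, so the length of the failure window can be inspected afterwards.
    The last entry has error None if the port eventually accepted a connect.
    """
    attempts = []
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        try:
            # create_connection raises an OSError subclass on refusal/timeout
            with socket.create_connection((host, port), timeout=interval):
                attempts.append((elapsed, None))
                return attempts
        except OSError as exc:
            attempts.append((elapsed, type(exc).__name__))
        if time.monotonic() - start >= timeout:
            return attempts
        time.sleep(interval)


def became_ready(attempts):
    """True if the final attempt in the log was a successful connect."""
    return bool(attempts) and attempts[-1][1] is None
```

For example, `wait_for_port("172.17.0.2", 8080, timeout=180)` (placeholder IP) run right after the invoker starts a container would show how long connects are refused. Comparing that log between a VM reboot and a plain invoker restart might help tell whether the problem sits in container networking rather than in the runtime itself.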
----