Posted to commits@heron.apache.org by GitBox <gi...@apache.org> on 2020/06/18 17:27:05 UTC
[GitHub] [incubator-heron] windhamwong opened a new issue #3542: Kubernetes additional work required for deploying topology
windhamwong opened a new issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542
Kubernetes version: 1.18.2+k3s1
I am trying to deploy `heron-api-examples.jar` as suggested in the docs. After a lot of troubleshooting, I found that the created pods (`acking-0`, `acking-1`, `acking-2`) need to connect to one another. `acking-0` is responsible for launching the servers for the stream manager (6001/TCP), the tmaster controller (6002/TCP) and the tmaster state (6003/TCP). As k8s requires a `Service` to expose ports so that external clients or other containers can connect, we need additional code in the Heron k8s source to generate these services as well. The following k8s YAML has been tested and works for me.
I will keep testing, as the topology is still not behaving as the demo describes. I will come back and add more later.
```
apiVersion: v1
kind: Service
metadata:
  name: acking-0
spec:
  selector:
    statefulset.kubernetes.io/pod-name: acking-0
  ports:
    - name: stream-manager
      protocol: TCP
      port: 6001
      targetPort: 6001
    - name: tmaster-controller
      protocol: TCP
      port: 6002
      targetPort: 6002
    - name: tmaster-state
      protocol: TCP
      port: 6003
      targetPort: 6003
    - name: heron-shell
      protocol: TCP
      port: 6004
      targetPort: 6004
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-heron] nicknezis commented on issue #3542: Kubernetes additional work required for deploying topology
Posted by GitBox <gi...@apache.org>.
nicknezis commented on issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542#issuecomment-650497830
I've discussed this with @windhamwong and have come up with a proposed approach to handle Heron topologies in Kubernetes. We found that it does work, but there are some edge cases that can cause the topology StatefulSet to fail.
1. TMaster looks for `/.dockerenv` to determine whether it is running in a container. I have found situations in which the pod does not have this file (e.g. Kind and K3s). See the [Tmaster code](https://github.com/apache/incubator-heron/blob/cc815d85305dc0b665a2ccb42113cf7a49b1eb0a/heron/executor/src/python/heron_executor.py#L232).
2. If TMaster does find `/.dockerenv`, it tries to use the `HOST` environment variable. I have found some cases in which the pod does not have this set (e.g. Kind).
3. If both of these checks succeed, the TMaster and Stmgr processes use the pod's IP address. If either fails, the `socket.gethostname()` call returns the pod name, which is not registered in the Kubernetes cluster DNS.
4. To enable the use of the hostname, a headless Service must be registered.
The proposal:
1. Update the Kubernetes Scheduler code to create a matching headless Service for each topology created.
2. Update the Kubernetes Scheduler code to add a custom environment variable to the StatefulSet (e.g. `HERON_HOSTNAME`).
3. Update the TMaster logic that checks for `/.dockerenv` to first check for the `HERON_HOSTNAME` variable instead.
If we make these changes, this issue would be resolved.
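The lookup order proposed above can be sketched in Python, the language of `heron_executor.py`. This is only an illustration of the idea, not the code that was merged: the function name `resolve_heron_hostname` and its parameters are hypothetical, and the fallback chain is assumed from the description in this comment (`HERON_HOSTNAME` first, then `/.dockerenv` plus `HOST`, then the socket hostname).

```python
import os
import socket


def resolve_heron_hostname(environ=None, dockerenv_path="/.dockerenv"):
    """Pick the host name the TMaster and Stmgr processes would advertise.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch, not part of the Heron code base.
    """
    env = os.environ if environ is None else environ

    # Proposed new step: an explicit override set on the StatefulSet.
    override = env.get("HERON_HOSTNAME")
    if override:
        return override

    # Behaviour described in the comment above: a container marker
    # file together with the HOST environment variable.
    if os.path.isfile(dockerenv_path):
        host = env.get("HOST")
        if host:
            return host

    # Last resort: the pod name, which other pods can only resolve
    # when a headless Service is registered for the StatefulSet.
    return socket.gethostname()
```

With this ordering, a cluster that lacks `/.dockerenv` or `HOST` (as reported for Kind and K3s) would still get a usable name as long as the scheduler sets `HERON_HOSTNAME`.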
[GitHub] [incubator-heron] nicknezis commented on issue #3542: Kubernetes additional work required for deploying topology
Posted by GitBox <gi...@apache.org>.
nicknezis commented on issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542#issuecomment-658298595
This issue should now be resolved with the fix in #3550
[GitHub] [incubator-heron] nicknezis commented on issue #3542: Kubernetes additional work required for deploying topology
Posted by GitBox <gi...@apache.org>.
nicknezis commented on issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542#issuecomment-647085583
Is this an issue with newer Kubernetes versions? How can I best reproduce the issue you are seeing?
Kubernetes Services are not needed for exposing ports; they are used for service discovery and for providing a stable IP, as opposed to a pod IP, which changes on pod restart. I believe that, because topologies currently run as StatefulSets in Kubernetes, each pod has a specific name that the other pods in the StatefulSet can reference. It might still be a good idea to create the Service, but I don't think this should be causing the topology pods to fail to connect.
[GitHub] [incubator-heron] windhamwong commented on issue #3542: Kubernetes additional work required for deploying topology
Posted by GitBox <gi...@apache.org>.
windhamwong commented on issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542#issuecomment-646201924
Current situation:
Heron UI shows only the topology layout, but no details of the config, counts or instance logs/info. The bolt seems to be working and is logging:
```
[2020-06-18 17:27:28 +0000] [STDOUT] stdout: Bolt processed 64980000 tuples in 525055 ms
[2020-06-18 17:27:28 +0000] [STDOUT] stdout: Bolt processed 64990000 tuples in 525067 ms
```
I guess the Heron docs for k8s cluster setup and the k8s source require a lot of work to support the latest k8s version.
[GitHub] [incubator-heron] nicknezis closed issue #3542: Kubernetes additional work required for deploying topology
Posted by GitBox <gi...@apache.org>.
nicknezis closed issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542
[GitHub] [incubator-heron] windhamwong commented on issue #3542: Kubernetes additional work required for deploying topology
Posted by GitBox <gi...@apache.org>.
windhamwong commented on issue #3542:
URL: https://github.com/apache/incubator-heron/issues/3542#issuecomment-649713583
I believe this is an issue with newer K8s, as a StatefulSet now requires a Service to work properly. I tried to resolve the target pod without a Service, but it never works, either with just the name or with the full domain, `<pod>.<namespace>.svc.cluster.local`. Please correct me if I am wrong here.
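For reference, Kubernetes gives StatefulSet pods a DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`, where `<service>` is the headless Service that governs the StatefulSet, so a name without the Service segment is not expected to resolve. A minimal Python sketch for checking this from inside a pod follows; the helper names and the Service name `acking` are hypothetical, chosen only for illustration.

```python
import socket


def statefulset_pod_fqdn(pod, service, namespace="default",
                         cluster_domain="cluster.local"):
    """Build the DNS name of a StatefulSet pod behind a headless Service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def can_resolve(hostname):
    """Return True if the name resolves from where this code runs."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


# "acking" is an assumed Service name, not taken from the issue.
name = statefulset_pod_fqdn("acking-0", "acking")
print(name, can_resolve(name))
```

Run from another pod in the same cluster, this should report whether the per-pod record exists; it only does when the headless Service is present and the StatefulSet's `serviceName` points at it.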