Posted to user-zh@flink.apache.org by yidan zhao <hi...@gmail.com> on 2022/07/12 02:40:28 UTC

Flink native K8s: submitting a job per the docs fails to find the cluster

The steps below follow this documentation: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/native_kubernetes/#starting-a-flink-session-on-kubernetes

Version: 1.15

(1) Create the cluster:
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/native_kubernetes/#starting-a-flink-session-on-kubernetes
(2) Submit the job:
./bin/flink run \
    --target kubernetes-session \
    -Dkubernetes.cluster-id=my-first-flink-cluster \
    ./examples/streaming/TopSpeedWindowing.jar

The svc is of type ClusterIP.

Submitting the job in step (2) prints the following:
Executing example with default input data.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
2022-07-12 10:23:23,021 WARN
org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
Please note that Flink client operations(e.g. cancel, list, stop,
savepoint, etc.) won't work from outside the Kubernetes cluster since
'kubernetes.rest-service.exposed.type' has been set to ClusterIP.
2022-07-12 10:23:23,027 INFO
org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
Retrieve flink cluster my-first-flink-cluster successfully, JobManager
Web Interface: http://my-first-flink-cluster-rest.test:8081
2022-07-12 10:23:23,044 WARN
org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
Please note that Flink client operations(e.g. cancel, list, stop,
savepoint, etc.) won't work from outside the Kubernetes cluster since
'kubernetes.rest-service.exposed.type' has been set to ClusterIP.

------------------------------------------------------------
 The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main
method caused an error: Failed to execute job
'CarTopSpeedWindowingExample'.
...
Caused by: org.apache.flink.util.FlinkException: Failed to execute job
'CarTopSpeedWindowingExample'.
...
Caused by: org.apache.flink.runtime.client.JobSubmissionException:
Failed to submit JobGraph.
...
Caused by: org.apache.flink.util.concurrent.FutureUtils$RetryException:
Could not complete the operation. Number of retries has been
exhausted.
...
Caused by: java.util.concurrent.CompletionException:
java.net.UnknownHostException: my-first-flink-cluster-rest.test: Name
or service not known
...
Caused by: java.net.UnknownHostException:
my-first-flink-cluster-rest.test: Name or service not known


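As the log above shows, with --target kubernetes-session
-Dkubernetes.cluster-id=my-first-flink-cluster the client resolves the
submission endpoint to my-first-flink-cluster-rest.test. This looks like the
DNS name generated by K8s, where test is the namespace Flink runs in.

My local machine indeed cannot resolve my-first-flink-cluster-rest.test.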
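
The check described above can be sketched in shell. This is only an illustration: it assumes the REST service host follows the <cluster-id>-rest.<namespace> pattern seen in the log, that getent is available (Linux with glibc), and the helper names rest_host and can_resolve are made up here.

```shell
# Build the host name the client tried to reach: <cluster-id>-rest.<namespace>.
# Pattern inferred from the log line above, not taken from Flink's source.
rest_host() {
  # $1 = kubernetes.cluster-id, $2 = namespace
  printf '%s-rest.%s\n' "$1" "$2"
}

# Succeed if the local resolver can look the name up.
can_resolve() {
  getent hosts "$1" >/dev/null 2>&1
}

host="$(rest_host my-first-flink-cluster test)"
if can_resolve "$host"; then
  echo "$host resolves from this machine"
else
  echo "$host does not resolve here; the client is probably outside the cluster network"
fi
```

If the second branch is taken, the UnknownHostException in the stack trace is the expected outcome of running the client from that machine.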

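Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by Yang Wang <da...@gmail.com>.
Your understanding is correct.

FlinkSessionJob was split out into its own CR mainly because that fits K8s semantics better: each job in a session cluster can then be managed as a K8s resource, and job state changes are reflected in its Status promptly.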
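
To illustrate the point about Status, here is a hedged sketch that reads the job state of a FlinkSessionJob with kubectl. It assumes the operator's CRDs are installed and that the state is published at .status.jobStatus.state; the resource name my-session-job and the function name are hypothetical.

```shell
# Print the job state recorded in the FlinkSessionJob's status.
# Assumption: the operator publishes it at .status.jobStatus.state.
session_job_state() {
  # $1 = FlinkSessionJob name, $2 = namespace
  kubectl -n "$2" get flinksessionjob "$1" \
    -o jsonpath='{.status.jobStatus.state}'
}

# Usage (needs kubectl access to the cluster):
#   session_job_state my-session-job test
```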


Best,
Yang

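On Thu, Jul 14, 2022 at 23:01, yidan zhao <hi...@gmail.com> wrote:

> One more question, this time about flink-k8s-operator.
> I looked through the documentation; it provides two CRDs, FlinkDeployment
> and FlinkSessionJob. Is the following understanding correct?
> (1) For jobs run in application mode, use FlinkDeployment with the job
> section configured. The Flink cluster is created automatically and the job
> runs according to the job configuration.
>          With this approach, cluster creation and job submission are not
>          separate steps; they are one unit.
> (2) A session cluster is also created with FlinkDeployment, just without a
> job section.
> (3) Jobs are then submitted to a session cluster created as in (2) with
> FlinkSessionJob.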

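Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by yidan zhao <hi...@gmail.com>.
One more question, this time about flink-k8s-operator.
I looked through the documentation; it provides two CRDs, FlinkDeployment and FlinkSessionJob. Is the following understanding correct?
(1) For jobs run in application mode, use FlinkDeployment with the job section configured. The Flink cluster is created automatically and the job runs according to the job configuration.
         With this approach, cluster creation and job submission are not separate steps; they are one unit.
(2) A session cluster is also created with FlinkDeployment, just without a job section.
(3) Jobs are then submitted to a session cluster created as in (2) with FlinkSessionJob.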
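
Points (2) and (3) can be sketched as two manifests. This is an illustration under assumptions, not the operator's reference example: the resource names, the flink service account, the resource sizes, and the jar URL are placeholders, and the field names follow the flink.apache.org/v1beta1 API, so check them against the operator version in use.

```shell
# (2) Session cluster: a FlinkDeployment without a job section.
session_cluster_manifest() {
  cat <<'EOF'
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: my-session-cluster        # placeholder name
  namespace: test
spec:
  image: flink:1.15
  flinkVersion: v1_15
  serviceAccount: flink           # assumed to exist with the needed RBAC
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
EOF
}

# (3) Job submitted to that session cluster: a FlinkSessionJob.
session_job_manifest() {
  cat <<'EOF'
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: my-session-job            # placeholder name
  namespace: test
spec:
  deploymentName: my-session-cluster
  job:
    jarURI: https://example.com/jars/TopSpeedWindowing.jar   # placeholder URL
    parallelism: 2
    upgradeMode: stateless
EOF
}

# Usage (needs the operator installed and kubectl access):
#   session_cluster_manifest | kubectl apply -f -
#   session_job_manifest | kubectl apply -f -
```

For case (1), application mode, the same FlinkDeployment would additionally carry a job section.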

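On Tue, Jul 12, 2022 at 17:10, Yang Wang <da...@gmail.com> wrote:
>
> If the machines in your K8s cluster are also configured to use coredns as
> their DNS server, they can resolve the service behind the ClusterIP.
>
> ClusterIP was originally intended for use by Pods that manage jobs, such as
> flink-kubernetes-operator [1].
>
> [1]. https://github.com/apache/flink-kubernetes-operator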

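Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by Yang Wang <da...@gmail.com>.
If the machines in your K8s cluster are also configured to use coredns as their DNS server, they can resolve the service behind the ClusterIP.

ClusterIP was originally intended for use by Pods that manage jobs, such as flink-kubernetes-operator [1].

[1]. https://github.com/apache/flink-kubernetes-operator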
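
One way to check that condition on a node, sketched under assumptions: the cluster DNS Service is named kube-dns in kube-system (common for CoreDNS installs, but not universal), and the node's resolver is configured through /etc/resolv.conf. The function names are illustrative.

```shell
# ClusterIP of the cluster DNS service (assumed name: kube-dns in kube-system).
cluster_dns_ip() {
  kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}'
}

# Succeed if the resolver config file lists the given IP as a nameserver.
uses_nameserver() {
  # $1 = nameserver IP, $2 = resolver config (default /etc/resolv.conf)
  awk -v ip="$1" '$1 == "nameserver" && $2 == ip { found = 1 } END { exit !found }' \
    "${2:-/etc/resolv.conf}"
}

# Usage on a cluster node:
#   uses_nameserver "$(cluster_dns_ip)" && echo "node resolves via cluster DNS"
```

Even with the cluster DNS as nameserver, a short name such as my-first-flink-cluster-rest.test only resolves if the resolver's search list includes the cluster's service domain.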

Best,
Yang

On Tue, Jul 12, 2022 at 13:17, yidan zhao <hi...@gmail.com> wrote:

> Submitting with flink run -m and the ClusterIP works.
> So why can't --target kubernetes-session
> -Dkubernetes.cluster-id=my-first-flink-cluster be a bit smarter, look up
> the ClusterIP of the cluster's svc, and submit to that?

Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by yidan zhao <hi...@gmail.com>.
Submitting with flink run -m and the ClusterIP works.
So why can't --target kubernetes-session
-Dkubernetes.cluster-id=my-first-flink-cluster be a bit smarter, look up
the ClusterIP of the cluster's svc, and submit to that?
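
The workaround mentioned above can be sketched as follows, assuming kubectl access, the test namespace and REST port 8081 from the log, and a client machine that can route to ClusterIP addresses. The function names are illustrative.

```shell
# Look up the ClusterIP of the session cluster's REST service.
rest_cluster_ip() {
  # $1 = kubernetes.cluster-id, $2 = namespace
  kubectl -n "$2" get svc "$1-rest" -o jsonpath='{.spec.clusterIP}'
}

# Print the submit command that targets an explicit JobManager address.
submit_cmd() {
  # $1 = host or IP, $2 = REST port, $3 = job jar
  printf './bin/flink run -m %s:%s %s\n' "$1" "$2" "$3"
}

# Usage:
#   ip="$(rest_cluster_ip my-first-flink-cluster test)"
#   submit_cmd "$ip" 8081 ./examples/streaming/TopSpeedWindowing.jar
```

This bypasses DNS entirely, which is why it works from a machine that can reach the ClusterIP but cannot resolve the service name.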

On Tue, Jul 12, 2022 at 12:50, yidan zhao <hi...@gmail.com> wrote:
>
> If I am on the k8s master node, can I use the ClusterIP directly?
>
>
> Also, I roughly understand NodePort, but I have never really understood how
> LoadBalancer works.

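Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by yidan zhao <hi...@gmail.com>.
If I am on the k8s master node, can I use the ClusterIP directly?


Also, I roughly understand NodePort, but I have never really understood how LoadBalancer works.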

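On Tue, Jul 12, 2022 at 12:48, yidan zhao <hi...@gmail.com> wrote:
>
> I took "inside the k8s cluster" to mean the machines that make up k8s. Does
> the client have to run inside a pod? So running it on a k8s node does not
> work either, right?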
>
> Yang Wang <da...@gmail.com> 于2022年7月12日周二 12:07写道:
> >
> > 日志里面已经说明的比较清楚了,如果用的是ClusterIP的方式,那你的Flink
> > client必须在k8s集群内才能正常提交。例如:起一个Pod,然后再pod里面执行flink run
> > 否则你就需要NodePort或者LoadBalancer的方式了
> >
> > 2022-07-12 10:23:23,021 WARN
> > org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
> > Please note that Flink client operations(e.g. cancel, list, stop,
> > savepoint, etc.) won't work from outside the Kubernetes cluster since
> > 'kubernetes.rest-service.exposed.type' has been set to ClusterIP.
> >
> >
> > Best,
> > Yang
> >
> > yidan zhao <hi...@gmail.com> 于2022年7月12日周二 10:40写道:
> >
> > > 如下步骤参考的文档
> > > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/native_kubernetes/#starting-a-flink-session-on-kubernetes
> > >
> > > 版本:1.15
> > >
> > > (1)创建集群:
> > >
> > > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/native_kubernetes/#starting-a-flink-session-on-kubernetes
> > > (2)提交任务:
> > > ./bin/flink run \
> > >     --target kubernetes-session \
> > >     -Dkubernetes.cluster-id=my-first-flink-cluster \
> > >     ./examples/streaming/TopSpeedWindowing.jar
> > >
> > > svc是ClusterIp类型
> > >
> > > 第二步提交任务环节,显示如下:
> > > Executing example with default input data.
> > > Use --input to specify file input.
> > > Printing result to stdout. Use --output to specify output path.
> > > 2022-07-12 10:23:23,021 WARN
> > > org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
> > > Please note that Flink client operations(e.g. cancel, list, stop,
> > > savepoint, etc.) won't work from outside the Kubernetes cluster since
> > > 'kubernetes.rest-service.exposed.type' has been set to ClusterIP.
> > > 2022-07-12 10:23:23,027 INFO
> > > org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
> > > Retrieve flink cluster my-first-flink-cluster successfully, JobManager
> > > Web Interface: http://my-first-flink-cluster-rest.test:8081
> > > 2022-07-12 10:23:23,044 WARN
> > > org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
> > > Please note that Flink client operations(e.g. cancel, list, stop,
> > > savepoint, etc.) won't work from outside the Kubernetes cluster since
> > > 'kubernetes.rest-service.exposed.type' has been set to ClusterIP.
> > >
> > > ------------------------------------------------------------
> > >  The program finished with the following exception:
> > > org.apache.flink.client.program.ProgramInvocationException: The main
> > > method caused an error: Failed to execute job
> > > 'CarTopSpeedWindowingExample'.
> > > ...
> > > Caused by: org.apache.flink.util.FlinkException: Failed to execute job
> > > 'CarTopSpeedWindowingExample'.
> > > ...
> > > Caused by: org.apache.flink.runtime.client.JobSubmissionException:
> > > Failed to submit JobGraph.
> > > ...
> > > Caused by: org.apache.flink.util.concurrent.FutureUtils$RetryException:
> > > Could not complete the operation. Number of retries has been
> > > exhausted.
> > > ...
> > > Caused by: java.util.concurrent.CompletionException:
> > > java.net.UnknownHostException: my-first-flink-cluster-rest.test: Name
> > > or service not known
> > > ...
> > > Caused by: java.net.UnknownHostException:
> > > my-first-flink-cluster-rest.test: Name or service not known
> > >
> > >
> > > 如上,根据 --target kubernetes-session
> > > -Dkubernetes.cluster-id=my-first-flink-cluster 找到的提交入口为
> > > my-first-flink-cluster-rest.test。这个应该是根据k8s生成的dns,test是flink的namespace。
> > >
> > > 我本地也的确并无法解析 my-first-flink-cluster-rest.test 这个。
> > >

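Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by yidan zhao <hi...@gmail.com>.
I took "inside the k8s cluster" to mean the machines that make up k8s. Does the client have to run inside a pod? So running it on a k8s node does not work either, right?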

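On Tue, Jul 12, 2022 at 12:07, Yang Wang <da...@gmail.com> wrote:
>
> The log already explains this fairly clearly: with ClusterIP, your Flink
> client must run inside the k8s cluster to submit successfully, for example
> by starting a Pod and running flink run inside it.
> Otherwise you need NodePort or LoadBalancer.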

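Re: Flink native K8s: submitting a job per the docs fails to find the cluster

Posted by Yang Wang <da...@gmail.com>.
The log already explains this fairly clearly: with ClusterIP, your Flink
client must run inside the k8s cluster to submit successfully, for example
by starting a Pod and running flink run inside it.
Otherwise you need NodePort or LoadBalancer.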

2022-07-12 10:23:23,021 WARN
org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
Please note that Flink client operations(e.g. cancel, list, stop,
savepoint, etc.) won't work from outside the Kubernetes cluster since
'kubernetes.rest-service.exposed.type' has been set to ClusterIP.
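
As a sketch of the second option, the session could be started with the REST service exposed as NodePort, using the option named in the warning. The snippet only prints the command so it can be reviewed before running; the test namespace is taken from the host name in the log, and the function name is illustrative.

```shell
# Print the session-start command with an explicit REST exposure type.
session_start_cmd() {
  # $1 = kubernetes.cluster-id, $2 = namespace, $3 = NodePort or LoadBalancer
  printf './bin/kubernetes-session.sh -Dkubernetes.cluster-id=%s -Dkubernetes.namespace=%s -Dkubernetes.rest-service.exposed.type=%s\n' \
    "$1" "$2" "$3"
}

session_start_cmd my-first-flink-cluster test NodePort
```

A session started this way should let the original flink run --target kubernetes-session command work from outside the cluster, provided the client can reach a node address on the assigned port.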


Best,
Yang

On Tue, Jul 12, 2022 at 10:40, yidan zhao <hi...@gmail.com> wrote:

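> As the log above shows, with --target kubernetes-session
> -Dkubernetes.cluster-id=my-first-flink-cluster the client resolves the
> submission endpoint to my-first-flink-cluster-rest.test. This looks like
> the DNS name generated by K8s, where test is the namespace Flink runs in.
>
> My local machine indeed cannot resolve my-first-flink-cluster-rest.test.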
>