Posted to user@flink.apache.org by ant burton <ap...@gmail.com> on 2017/10/02 13:39:03 UTC
Savepoints - jobmanager.rpc.address
Hi,
When taking a savepoint on AWS EMR, I get the following error:
[hadoop@ip-10-12-169-172 ~]$ flink savepoint e14a6402b6f1e547c4adf40f43861c27
Retrieving JobManager.
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.configuration.IllegalConfigurationException: Couldn't retrieve client for cluster
    at org.apache.flink.client.CliFrontend.retrieveClient(CliFrontend.java:925)
    at org.apache.flink.client.CliFrontend.getJobManagerGateway(CliFrontend.java:939)
    at org.apache.flink.client.CliFrontend.triggerSavepoint(CliFrontend.java:714)
    at org.apache.flink.client.CliFrontend.savepoint(CliFrontend.java:704)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1096)
    at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
    at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
Caused by: java.lang.RuntimeException: Couldn't retrieve standalone cluster
    at org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:48)
    at org.apache.flink.client.cli.DefaultCLI.retrieveCluster(DefaultCLI.java:74)
    at org.apache.flink.client.cli.DefaultCLI.retrieveCluster(DefaultCLI.java:38)
    at org.apache.flink.client.CliFrontend.retrieveClient(CliFrontend.java:920)
    ... 12 more
Caused by: org.apache.flink.util.ConfigurationException: Config parameter 'Key: 'jobmanager.rpc.address' , default: null (deprecated keys: [])' is missing (hostname/address of JobManager to connect to).
    at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.getJobManagerAddress(HighAvailabilityServicesUtils.java:119)
    at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:76)
    at org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:131)
    at org.apache.flink.client.program.StandaloneClusterClient.<init>(StandaloneClusterClient.java:42)
    at org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:46)
    ... 15 more
My configuration.json is:
[
  {
    "Classification": "flink-conf",
    "Properties": {
      "taskmanager.numberOfTaskSlots": "1",
      "state.backend": "filesystem",
      "state.checkpoints.dir": "s3://flink/checkpoints/",
      "state.backend.fs.checkpointdir": "s3://flink/checkpoints/"
    }
  }
]
Setting the following in configuration.json does not resolve the issue.
jobmanager.rpc.address: localhost or 0.0.0.0 or 127.0.0.1
jobmanager.rpc.port: 6123
Thanks,
Re: Savepoints - jobmanager.rpc.address
Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi,
Since you're running on AWS EMR, I'm assuming you're deploying your Flink job / cluster on YARN?
If so, make sure to also specify the YARN application ID.
You can do that with:
flink savepoint -yid <the YARN application id> <JobID>
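For context, the stack trace goes through StandaloneClusterDescriptor, which suggests the CLI fell back to standalone mode because it was not told which YARN application to talk to. If you don't have the application ID at hand, something like the sketch below may help. It is only an illustration: the helper names extract_flink_app_id and savepoint_cmd are made up for this example, and it assumes the Flink session shows up in the `yarn application -list` output with "Flink" somewhere on its line (name or type), which may differ on your cluster.

```shell
# Hypothetical helper: read `yarn application -list` output on stdin and
# print the ID of the first application whose line mentions "Flink".
# Assumption: the application name or type contains the word "Flink".
extract_flink_app_id() {
  awk '/^application_/ && /Flink/ { print $1; exit }'
}

# Hypothetical helper: print the savepoint command for a given
# YARN application ID ($1) and Flink job ID ($2).
savepoint_cmd() {
  printf 'flink savepoint -yid %s %s\n' "$1" "$2"
}

# Intended use on the EMR master node (shown as comments, not executed here):
#   app_id=$(yarn application -list -appStates RUNNING | extract_flink_app_id)
#   savepoint_cmd "$app_id" e14a6402b6f1e547c4adf40f43861c27
```

If more than one Flink application is running, check the list by hand rather than relying on the first match.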
Cheers,
Gordon