You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by uce <gi...@git.apache.org> on 2015/09/30 17:01:59 UTC

[GitHub] flink pull request: [dashboard] Redirect to leader in non-standalo...

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/1202

    [dashboard] Redirect to leader in non-standalone mode

    - Redirects HTTP requests to the web fronted of the leading job manager if not associated with leading job manager
    - Splits up the start up of the job manager and web frontend (see `JobManager#runJobManager`) to first start the web fronted (binding to the port), then the job manager actor, then the web fronted with the address of the associated job manager.
    - The port of the web frontend to redirect is determined via a newly introduced `JobManagerMessage`
    - The associated job manager is used to determine whether the leader notification addresses the associated job manager or another one.
    - Adds web front end tests via HttpTestClient (moved the generated files to the src folder for this) 
    
    This breaks the leader notification abstraction in the sense that the job manager location is not transparent anymore. The reason for this is that the execution graph structures are not serializable (a proper fix would address this instead of redirecting). I have not integrated this with the TestingCluster.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink redirect

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1202
    
----
commit 5a88d5ed511f67b8b41c44789e13d58cc46c03a5
Author: Ufuk Celebi <uc...@apache.org>
Date:   2015-09-25T09:44:53Z

    [tests] Add HttpTestClient for testing HTTP responses

commit 656d6d610842682ed69b36a4b9109eff7c257817
Author: Ufuk Celebi <uc...@apache.org>
Date:   2015-09-25T09:47:30Z

    Split WebMonitor and LeaderRetrievalService start up

commit a7e8da82a4188e6b2861f15c6f313ac672bab958
Author: Ufuk Celebi <uc...@apache.org>
Date:   2015-09-25T10:31:00Z

    Move generated /web files to src/main/resources

commit 999ddb44990eee9f6a753b6635e7662074db121b
Author: Ufuk Celebi <uc...@apache.org>
Date:   2015-09-27T11:29:21Z

    [dashboard] Redirect to leader in non-standalone mode

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-145786422
  
    Yes, good point. I will do this now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-145598003
  
    The PR looks good. I think we cannot do much about it being a temporary workaround for the limitation that our `ExecutionGraph` is not serializable. Once we have the resources this should be addressed, though.
    
    I've got some minor comments. The main point is the blocking call in the `notifyLeaderAddress` of the `JobManagerArchiverRetriever`. This should be fixed imo.
    
    Moreover, I think I've found a race condition. The `RuntimeMonitorHandler` checks whether the current leader is its own or a remote one. If it's the local JM, then the respective RequestHandler is called. Within the `RequestHandler`, such as `CurrentJobsOverviewHandler` the `JobManagerArchiverRetriever` is called again. If now the leader has changed to be a remote JM, then we'll have a problem. This could be fixed as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-145885023
  
    It's not superficial at all. It's cleaner, because the retriever is independent of the redirect logic then.
    
    I've decided against it though in order to keep the handlers cleaner (there only two different kinds atm though, so it would be OK to add it) and because the associated job manager URL is only known after start() is called, so I would have to update all handlers lazily after that.
    
    This is due to the start up ordering of
    1. start web server (bind to port)
    2. get web server port and start job manager with web server port configured
    3. set job manager address at web server
    
    This ordering allows both components to pick random ports.
    
    I just didn't want to change too much code this at this point, because I genuinely hope that we will make the execution graph serialisable after the release and get back to the old solution. But I'm also happy to change it if you like. Just ping me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41169783
  
    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/JobManagerArchiveRetriever.java ---
    @@ -70,28 +104,68 @@ public ActorGateway getArchiveGateway() {
     		return archiveGateway;
     	}
     
    +	/**
    +	 * Returns the current redirect address or <code>null</code> if the job manager associated with
    +	 * this web monitor is leading. In that case, work with the gateway directly.
    +	 */
    +	public String getRedirectAddress() {
    +		return redirectWebMonitorAddress;
    +	}
     
     	@Override
     	public void notifyLeaderAddress(String leaderAddress, UUID leaderSessionID) {
     		if (leaderAddress != null && !leaderAddress.equals("")) {
     			try {
    -				ActorRef jobManager = AkkaUtils.getActorRef(
    -						leaderAddress,
    -						actorSystem,
    +				ActorRef jobManager = AkkaUtils.getActorRef(leaderAddress, actorSystem,
    --- End diff --
    
    Good catch


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-146737613
  
    Are there any blockers to this PR. Otherwise, I'd like to have it merged rather soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-145868099
  
    OK. I've addressed your comments in the last two commits. Thanks again!
    
    - The associated local leader gateway is only resolved once and getGateway will only return this one (new leader notifications don't change this one and only affect the redirect address)
    - The handlers don't block on the redirect address, but just read the current redirect address.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41272107
  
    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/JobManagerArchiveRetriever.java ---
    @@ -47,51 +60,160 @@
     	private final FiniteDuration timeout;
     	private final WebMonitor webMonitor;
     
    +	/** Pattern to extract the host from an remote Akka URL */
    +	private final Pattern hostFromLeaderAddressPattern = Pattern.compile("^.+@(.+):([0-9]+)/user/.+$");
    +
    +	/** The JobManager Akka URL associated with this JobManager */
    +	private volatile String jobManagerAkkaUrl;
    +
     	/** will be written and read concurrently */
     	private volatile ActorGateway jobManagerGateway;
     	private volatile ActorGateway archiveGateway;
    +	private volatile String redirectWebMonitorAddress;
     
     	public JobManagerArchiveRetriever(
     			WebMonitor webMonitor,
     			ActorSystem actorSystem,
     			FiniteDuration lookupTimeout,
     			FiniteDuration timeout) {
    -		this.webMonitor = webMonitor;
    -		this.actorSystem = actorSystem;
    -		this.lookupTimeout = lookupTimeout;
    -		this.timeout = timeout;
    +
    +		this.webMonitor = checkNotNull(webMonitor);
    +		this.actorSystem = checkNotNull(actorSystem);
    +		this.lookupTimeout = checkNotNull(lookupTimeout);
    +		this.timeout = checkNotNull(timeout);
    +	}
    +
    +	/**
    +	 * Associates this instance with the job manager identified by the given URL.
    +	 *
    +	 * <p>This has to match the URL retrieved by the leader retrieval service. In tests setups you
    +	 * have to make sure to use the correct type of URLs.
    +	 */
    +	public void setJobManagerAkkaUrlAndRetrieveGateway(String jobManagerAkkaUrl) throws Exception {
    --- End diff --
    
    Actually, I meant to get rid of this extra redirect logic here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-146991673
  
    Great ;)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41164028
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -129,6 +129,13 @@ class JobManager(
       var leaderSessionID: Option[UUID] = None
     
       /**
    +   * The port of the web monitor as configured. Make sure that it is actually configured before
    +   * starting the JobManager.
    +   */
    +  val webMonitorPort : Integer = flinkConfiguration.getInteger(
    --- End diff --
    
    Maybe we can add a comment which says that this is gonna be kicked out once we have made the `ExecutionGraph` properly serializable or so...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-145878794
  
    I've added some more comments. Ping me if it's too superficial.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41164549
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -1761,11 +1803,8 @@ object JobManager {
        * @param config The configuration for the runtime monitor.
        * @param leaderRetrievalService Leader retrieval service to get the leading JobManager
        */
    -  def startWebRuntimeMonitor(
    -      config: Configuration,
    -      leaderRetrievalService: LeaderRetrievalService,
    -      actorSystem: ActorSystem)
    -    : WebMonitor = {
    +  def startWebRuntimeMonitor(config: Configuration, leaderRetrievalService: LeaderRetrievalService,
    +    actorSystem: ActorSystem) : WebMonitor = {
    --- End diff --
    
    Line breaks not very scalaesque


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41272295
  
    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/RuntimeMonitorHandler.java ---
    @@ -44,27 +47,40 @@
     public class RuntimeMonitorHandler extends SimpleChannelInboundHandler<Routed> {
     	
     	private static final Charset ENCODING = Charset.forName("UTF-8");
    +
    +	private final JobManagerArchiveRetriever retriever;
     	
     	private final RequestHandler handler;
     	
     	private final String contentType;
     	
    -	public RuntimeMonitorHandler(RequestHandler handler) {
    -		if (handler == null) {
    -			throw new NullPointerException();
    -		}
    -		this.handler = handler;
    +	public RuntimeMonitorHandler(RequestHandler handler, JobManagerArchiveRetriever retriever) {
    +		this.retriever = checkNotNull(retriever);
    +		this.handler = checkNotNull(handler);
     		this.contentType = (handler instanceof RequestHandler.JsonResponse) ? "application/json" : "text/plain";
     	}
     	
     	@Override
     	protected void channelRead0(ChannelHandlerContext ctx, Routed routed) throws Exception {
    +		// Redirect to leader if necessary
    +		String redirectAddress = retriever.getRedirectAddress();
    --- End diff --
    
    Get rid of this `getRedirectAddress` method here and simply ask for the current leader. Then compare the leader address with the local job manager address to decide whether you have to redirect or not.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41169911
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -1773,8 +1812,7 @@ object JobManager {
             .asSubclass(classOf[WebMonitor])
     
           val ctor: Constructor[_ <: WebMonitor] = clazz.getConstructor(classOf[Configuration],
    -        classOf[LeaderRetrievalService],
    -        classOf[ActorSystem])
    +        classOf[LeaderRetrievalService], classOf[ActorSystem])
    --- End diff --
    
    I was going for Kafkaesque... :D Just kidding, will change it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41167117
  
    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/JobManagerArchiveRetriever.java ---
    @@ -70,28 +104,68 @@ public ActorGateway getArchiveGateway() {
     		return archiveGateway;
     	}
     
    +	/**
    +	 * Returns the current redirect address or <code>null</code> if the job manager associated with
    +	 * this web monitor is leading. In that case, work with the gateway directly.
    +	 */
    +	public String getRedirectAddress() {
    +		return redirectWebMonitorAddress;
    +	}
     
     	@Override
     	public void notifyLeaderAddress(String leaderAddress, UUID leaderSessionID) {
     		if (leaderAddress != null && !leaderAddress.equals("")) {
     			try {
    -				ActorRef jobManager = AkkaUtils.getActorRef(
    -						leaderAddress,
    -						actorSystem,
    +				ActorRef jobManager = AkkaUtils.getActorRef(leaderAddress, actorSystem,
    --- End diff --
    
    This is a blocking call which will block the `NodeCache's` thread which calls the `NodeCacheListener`. This should be avoided by using `AkkaUtils.getActorRefFuture` and a `Promise`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41163935
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -129,6 +129,13 @@ class JobManager(
       var leaderSessionID: Option[UUID] = None
     
       /**
    +   * The port of the web monitor as configured. Make sure that it is actually configured before
    +   * starting the JobManager.
    +   */
    +  val webMonitorPort : Integer = flinkConfiguration.getInteger(
    --- End diff --
    
    Why not using Scala's `Int` type here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-144752438
  
    I already opened the PR. Will review it asap.
    
    On Thu, Oct 1, 2015 at 4:38 PM, Robert Metzger <no...@github.com>
    wrote:
    
    > No, I didn't try it.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/flink/pull/1202#issuecomment-144747629>.
    >



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-144747629
  
    No, I didn't try it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41164567
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -1773,8 +1812,7 @@ object JobManager {
             .asSubclass(classOf[WebMonitor])
     
           val ctor: Constructor[_ <: WebMonitor] = clazz.getConstructor(classOf[Configuration],
    -        classOf[LeaderRetrievalService],
    -        classOf[ActorSystem])
    +        classOf[LeaderRetrievalService], classOf[ActorSystem])
    --- End diff --
    
    Line breaks not very scalaesque


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41167164
  
    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/JobManagerArchiveRetriever.java ---
    @@ -70,28 +104,68 @@ public ActorGateway getArchiveGateway() {
     		return archiveGateway;
     	}
     
    +	/**
    +	 * Returns the current redirect address or <code>null</code> if the job manager associated with
    +	 * this web monitor is leading. In that case, work with the gateway directly.
    +	 */
    +	public String getRedirectAddress() {
    +		return redirectWebMonitorAddress;
    +	}
     
     	@Override
     	public void notifyLeaderAddress(String leaderAddress, UUID leaderSessionID) {
     		if (leaderAddress != null && !leaderAddress.equals("")) {
     			try {
    -				ActorRef jobManager = AkkaUtils.getActorRef(
    -						leaderAddress,
    -						actorSystem,
    +				ActorRef jobManager = AkkaUtils.getActorRef(leaderAddress, actorSystem,
     						lookupTimeout);
    +
     				jobManagerGateway = new AkkaActorGateway(jobManager, leaderSessionID);
     
     				Future<Object> archiveFuture = jobManagerGateway.ask(
    -						JobManagerMessages.getRequestArchive(),
    -						timeout);
    +						JobManagerMessages.getRequestArchive(), timeout);
     
     				ActorRef archive = ((JobManagerMessages.ResponseArchive) Await.result(
    -						archiveFuture,
    -						timeout)
    -				).actor();
    -
    +						archiveFuture, timeout)).actor();
     				archiveGateway = new AkkaActorGateway(archive, leaderSessionID);
    -			} catch (Exception e) {
    +
    +				if (jobManagerAkkaUrl == null) {
    +					throw new IllegalStateException("Unspecified Akka URL for the job manager " +
    +							"associated with this web monitor.");
    +				}
    +
    +				boolean isLeader = jobManagerAkkaUrl.equals(leaderAddress);
    +
    +				if (isLeader) {
    +					// Our JobManager is leader and our work is done :)
    +					redirectWebMonitorAddress = null;
    +				}
    +				else {
    +					// We need to redirect to the leader -.-
    +					//
    +					// This is necessary currently, because many execution graph structures are not
    +					// serializable. The proper solution here is to have these serializable and
    +					// transparently work with the leading job manager instead of redirecting.
    +					Future<Object> portFuture = jobManagerGateway
    +							.ask(JobManagerMessages.getRequestWebMonitorPort(), timeout);
    +
    +					JobManagerMessages.ResponseWebMonitorPort portResponse =
    +							(JobManagerMessages.ResponseWebMonitorPort) Await.result(portFuture, timeout);
    --- End diff --
    
    Again a blocking call which is bad. Better to use futures to circumvent this problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-144745094
  
    +1 The change looks good to merge. (maybe we should add the JIRA id when merging)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-145781149
  
    What you can actually do to resolve the issue with the `RequestHandler` is to statically assign them the local JM's address instead of providing a `JobManagerArchiveRetriever`. Since they are only called when the local JM is the leader, they don't have to look up the current leader address.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/1202


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1202#discussion_r41272418
  
    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitor.java ---
    @@ -150,97 +148,97 @@ else if (flinkRoot != null) {
     		FiniteDuration lookupTimeout = AkkaUtils.getTimeout(config);
     
     		retriever = new JobManagerArchiveRetriever(this, actorSystem, lookupTimeout, timeout);
    -		
    +
     		ExecutionGraphHolder currentGraphs = new ExecutionGraphHolder(retriever);
     
     		router = new Router()
     			// config how to interact with this web server
    -			.GET("/config", handler(new DashboardConfigHandler(cfg.getRefreshInterval())))
    +			.GET("/config", handler(new DashboardConfigHandler(cfg.getRefreshInterval()), retriever))
     
     			// the overview - how many task managers, slots, free slots, ...
    -			.GET("/overview", handler(new ClusterOverviewHandler(retriever, DEFAULT_REQUEST_TIMEOUT)))
    +			.GET("/overview", handler(new ClusterOverviewHandler(retriever, DEFAULT_REQUEST_TIMEOUT), retriever))
     
     			// job manager configuration
    -			.GET("/jobmanager/config", handler(new JobManagerConfigHandler(config)))
    +			.GET("/jobmanager/config", handler(new JobManagerConfigHandler(config), retriever))
     
     			// overview over jobs
    -			.GET("/joboverview", handler(new CurrentJobsOverviewHandler(retriever, DEFAULT_REQUEST_TIMEOUT, true, true)))
    -			.GET("/joboverview/running", handler(new CurrentJobsOverviewHandler(retriever, DEFAULT_REQUEST_TIMEOUT, true, false)))
    -			.GET("/joboverview/completed", handler(new CurrentJobsOverviewHandler(retriever, DEFAULT_REQUEST_TIMEOUT, false, true)))
    +			.GET("/joboverview", handler(new CurrentJobsOverviewHandler(retriever, DEFAULT_REQUEST_TIMEOUT, true, true), retriever))
    --- End diff --
    
    Start the `RequestHandler` with the JobManager address of the local JobManager. They will only be called from the `RuntimeMonitorHandler` if the local JM is the leader.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-144746710
  
    I would squash all commits and add the JIRA. Have you tried it out?
    
    I would like to hear @tillrohrmann's take on this "abstraction" breaking behaviour before merging it.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-146991497
  
    Have fixed a possible race condition in `JobManagerRetriever` and rebased on the latest master. If Travis gives green light, I'll merge it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2793] [dashboard] Redirect to leader in...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1202#issuecomment-146991947
  
    Oh I forgot. I still want to test it on Yarn.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---