You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2016/01/05 01:20:31 UTC

Re: Review Request 41454: Added initial draft of executor HTTP API user doc.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41454/#review112636
-----------------------------------------------------------



docs/executor-http-api.md (line 14)
<https://reviews.apache.org/r/41454/#comment173138>

    put 4 spaces so that you get a shaded block around this. see https://github.com/apache/mesos/blob/master/docs/scheduler-http-api.md for reference.



docs/executor-http-api.md (line 30)
<https://reviews.apache.org/r/41454/#comment173139>

    s/Alternatively/Additonally/



docs/executor-http-api.md (line 31)
<https://reviews.apache.org/r/41454/#comment173140>

    "The executor is expected to keep running...". This sentence should be at the top of these two sub points. Also, the executor should keep running only if checkpointing is enabled right?
    
    "These status update messages...". How about:
    "The executor is expected to keep track of status updates not acknowledged by the agent via the ACKNOWLEDGE events."



docs/executor-http-api.md (line 32)
<https://reviews.apache.org/r/41454/#comment173141>

    there should be no network intermediaries between an agent and an executor!
    
    can you just re-use the comment in exec.cpp above `tasks` map instead?



docs/executor-http-api.md (lines 52 - 70)
<https://reviews.apache.org/r/41454/#comment173142>

    looking at just the JSON, it's not very clear that `tasks` and `updates` correspond to unacknowledged tasks and updates. we should probably rename these fields as `unacknowledged_tasks` and `unacknowledged_updates`. what do you think?



docs/executor-http-api.md (line 131)
<https://reviews.apache.org/r/41454/#comment173144>

    Is the retry by executor necessary? The current driver doesn't do retries for example. Are you anticipating a scenario where the executor is connected but messages are lost?



docs/executor-http-api.md (line 171)
<https://reviews.apache.org/r/41454/#comment173145>

    s/message to the executor/message to the scheduler/



docs/executor-http-api.md (line 211)
<https://reviews.apache.org/r/41454/#comment173147>

    s/messages/tasks/



docs/executor-http-api.md (line 247)
<https://reviews.apache.org/r/41454/#comment173148>

    s/the resources/the task resources/



docs/executor-http-api.md (line 278)
<https://reviews.apache.org/r/41454/#comment173149>

    Mention that `data` is raw bytes encoded in Base64?



docs/executor-http-api.md (line 293)
<https://reviews.apache.org/r/41454/#comment173150>

    Also mention that TASK_LOST updates will be sent by the slave for any active tasks.



docs/executor-http-api.md (lines 332 - 333)
<https://reviews.apache.org/r/41454/#comment173192>

    didn't we agree to rename these?



docs/executor-http-api.md (line 335)
<https://reviews.apache.org/r/41454/#comment173193>

    s/all agent/all the agent's/



docs/executor-http-api.md (line 343)
<https://reviews.apache.org/r/41454/#comment173195>

    what's the relation between MESOS_RETRY_INTERVAL_MAX and the duration after which the agent deems recovery process complete (EXECUTOR_REREGISTER_TIMEOUT)? AFAICT, currently the former is set to 15 min and the latter is 2s.



docs/executor-http-api.md (line 350)
<https://reviews.apache.org/r/41454/#comment173194>

    s/then kill/then forcefully kill/
    
    s/in forcefully/in/


- Vinod Kone


On Dec. 17, 2015, 1:07 a.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41454/
> -----------------------------------------------------------
> 
> (Updated Dec. 17, 2015, 1:07 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-4177
>     https://issues.apache.org/jira/browse/MESOS-4177
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is an initial draft for the user doc of Executor HTTP API. Some details might change due to the pending executor library review chain: https://reviews.apache.org/r/41275
> 
> 
> Diffs
> -----
> 
>   docs/executor-http-api.md PRE-CREATION 
>   docs/home.md a01612ec8347eb8a7a9277b829365b6c3a1fe9e3 
> 
> Diff: https://reviews.apache.org/r/41454/diff/
> 
> 
> Testing
> -------
> 
> https://gist.github.com/hatred/d35fdaa667203bf34a78
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Re: Review Request 41454: Added initial draft of executor HTTP API user doc.

Posted by Anand Mazumdar <ma...@gmail.com>.

> On Jan. 5, 2016, 12:20 a.m., Vinod Kone wrote:
> > docs/executor-http-api.md, lines 52-70
> > <https://reviews.apache.org/r/41454/diff/3/?file=1167954#file1167954line52>
> >
> >     looking at just the JSON, it's not very clear that `tasks` and `updates` correspond to unacknowledged tasks and updates. we should probably rename these fields as `unacknowledged_tasks` and `unacknowledged_updates`. what do you think?

sgtm, modified the name of the fields as per our discussion.


> On Jan. 5, 2016, 12:20 a.m., Vinod Kone wrote:
> > docs/executor-http-api.md, line 131
> > <https://reviews.apache.org/r/41454/diff/3/?file=1167954#file1167954line131>
> >
> >     Is the retry by executor necessary? The current driver doesn't do retries for example. Are you anticipating a scenario where the executor is connected but messages are lost?

My bad, Fixed. This got copied over verbatim from the corresponding scheduler docs.


> On Jan. 5, 2016, 12:20 a.m., Vinod Kone wrote:
> > docs/executor-http-api.md, line 343
> > <https://reviews.apache.org/r/41454/diff/3/?file=1167954#file1167954line343>
> >
> >     what's the relation between MESOS_RETRY_INTERVAL_MAX and the duration after which the agent deems recovery process complete (EXECUTOR_REREGISTER_TIMEOUT)? AFAICT, currently the former is set to 15 min and the latter is 2s.

As per our discussion, I modified the name of the variable to `EXECUTOR_REGISTRATION_BACKOFF_MAX`. This variable is equal in value to `EXECUTOR_REREGISTER_TIMEOUT`. 

It essentially denotes that upon agent recovery, the agent would wait for `EXECUTOR_REREGISTER_TIMEOUT` before killing the executor and marking the recovery process as complete. Hence, the executor should retry atleast once within `EXECUTOR_REGISTRATION_BACKOFF_MAX` to avoid being killed.

Also, let me know if this new name looks good to you.


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41454/#review112636
-----------------------------------------------------------


On Jan. 5, 2016, 9:28 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41454/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2016, 9:28 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-4177
>     https://issues.apache.org/jira/browse/MESOS-4177
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is an initial draft for the user doc of Executor HTTP API. Some details might change due to the pending executor library review chain: https://reviews.apache.org/r/41275
> 
> 
> Diffs
> -----
> 
>   docs/executor-http-api.md PRE-CREATION 
>   docs/home.md d929838206817a6c49cc2343b4de82fa085da682 
> 
> Diff: https://reviews.apache.org/r/41454/diff/
> 
> 
> Testing
> -------
> 
> https://gist.github.com/hatred/d35fdaa667203bf34a78
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>