Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2016/05/01 19:23:12 UTC

[jira] [Commented] (HADOOP-13035) Add states INITING and STARTING to YARN Service model to cover in-transition states.

    [ https://issues.apache.org/jira/browse/HADOOP-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265839#comment-15265839 ] 

Steve Loughran commented on HADOOP-13035:
-----------------------------------------

-1, as it


This is a pretty fundamental change. I would have also really liked to have been pinged on this earlier, given my hands are all over the code as it stands. While I acknowledge it isn't perfect, it does include experience on other systems, and I did go through every single YARN service, repeatedly, until things were stable.

This whole discrepancy between state -> starting and service -> live is a recurrent problem. But as you can see from things like web and IPC servers starting in the background, service start() is inherently async; what code really needs to wait on is not the state change completing, but the started service going live, which *may happen at some indeterminate point in the future*.
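
To make the state-vs-live distinction concrete, here is a minimal sketch. It is not Hadoop's Service API: the class {{BackgroundServer}}, its {{awaitLive()}} method and the latch are all hypothetical names made up for illustration. The point is only that start() returns once the state has changed, while a caller that needs a usable service waits on a separate liveness signal.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Hadoop.
class BackgroundServer {
    enum State { INITED, STARTED }

    private volatile State state = State.INITED;
    // Counted down once the background thread is actually serving.
    private final CountDownLatch live = new CountDownLatch(1);

    // Returns as soon as the worker thread has been launched,
    // which may be well before the server is usable.
    void start() {
        state = State.STARTED;
        Thread worker = new Thread(() -> {
            // Stand-in for binding a port, opening an IPC endpoint, etc.
            live.countDown();
        }, "background-server");
        worker.setDaemon(true);
        worker.start();
    }

    State getState() {
        return state;
    }

    // Callers that need a usable service block here, not on the state field.
    // Returns false on timeout or if the waiting thread is interrupted.
    boolean awaitLive(long timeoutMillis) {
        try {
            return live.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In this sketch the state can read STARTED while {{awaitLive()}} is still blocking, which is exactly the gap described above.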

Without digging into this patch in detail, here are the places which have caused the most trouble over time, and which any patch to such a fundamental part of how the YARN services are constructed is going to have to look at:

* subclasses of {{CompositeService}} adding new services in service start, having to push them through their lifecycle far enough to attach them to their parent, then relying on the remainder of the serviceStart lifecycle to walk themselves through.
* things going wrong in composite start and having to unroll the stack.
* things trying to call stop() during start.
* the fact that calling start() on a service which is started *or in the process of starting* is required to be a no-op.
* the issue of when serviceStop() is invoked on a service when stop() is called. Currently: not until you init(). It had better be after initing() now.
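
Two of those points (start() being a no-op while starting, and stop() being called during start) can be sketched with a toy state model. This is an assumption-laden illustration, not the patch and not {{AbstractService}}: the class {{ToyService}} and its fields are hypothetical, and only the STARTING state name is taken from the issue title.

```java
// Hypothetical toy model; not Hadoop code.
class ToyService {
    enum State { NOTINITED, INITED, STARTING, STARTED, STOPPED }

    private State state = State.NOTINITED;
    private final boolean stopDuringStart;
    int serviceStartCalls = 0;

    // stopDuringStart simulates a service that calls stop() from inside its own start.
    ToyService(boolean stopDuringStart) {
        this.stopDuringStart = stopDuringStart;
    }

    synchronized void init() {
        if (state == State.NOTINITED) {
            state = State.INITED;
        }
    }

    synchronized void start() {
        // No-op if already started *or in the process of starting*.
        if (state == State.STARTING || state == State.STARTED) {
            return;
        }
        if (state != State.INITED) {
            throw new IllegalStateException("cannot start from " + state);
        }
        state = State.STARTING;
        serviceStart();
        // stop() may have run during serviceStart(); do not overwrite STOPPED.
        if (state == State.STARTING) {
            state = State.STARTED;
        }
    }

    synchronized void stop() {
        state = State.STOPPED;
    }

    synchronized State getState() {
        return state;
    }

    private void serviceStart() {
        serviceStartCalls++;
        start(); // re-entrant call during start: must be a no-op
        if (stopDuringStart) {
            stop();
        }
    }
}
```

The check after serviceStart() is the part that is easy to get wrong: without it, a stop() issued during start would be silently overwritten by the transition to STARTED.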

Can I also note that the ubiquity of YarnClient means this class gets used a lot downstream. Admittedly, I use it most of all, but you can essentially build YARN-based apps by aggregating their service lifecycles together. Which means there is a risk that things may change. Before a descendant of this patch goes in, someone is going to have to build and test slider's functional test suite against a version of Hadoop with this turned on. I think they'll be able to dodge doing the same in Hive, as Hive 1.2.x still uses a cut-and-paste of the 2.0 service model from before the YARN-117 patch went in; which, if you've ever seen how Spark Thriftserver abuses introspection to subclass (SPARK-8064, SPARK-10793), you'll be grateful for.

I also want to know what happens to YARN-679 and YARN-1564 with this. I propose adding them first, as that will expand the codebase and, as much of this is code which I can migrate slider to, will make it easier for slider to adapt to a change this fundamental.

Accordingly, I'll tag this as a depends-on there, rebase those two patches onto trunk and await reviews.

> Add states INITING and STARTING to YARN Service model to cover in-transition states.
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13035
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>         Attachments: 0001-HADOOP-13035.patch, 0002-HADOOP-13035.patch, 0003-HADOOP-13035.patch
>
>
> As per the discussion in YARN-3971, we should be setting the service state to STARTED only after serviceStart() completes.
> Currently {{AbstractService#start()}} does the following:
> {noformat} 
>      if (stateModel.enterState(STATE.STARTED) != STATE.STARTED) {
>         try {
>           startTime = System.currentTimeMillis();
>           serviceStart();
> ..
>  }
> {noformat}
> enterState sets the service state to the proposed state, so {{service.getServiceState()}} called within {{serviceStart()}} will return STARTED.
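
The ordering problem in the description can be reproduced in a few lines. This is a hedged sketch with hypothetical names ({{StateOrderDemo}}, {{seenInServiceStart}}), not the Hadoop source: it records what state a service observes from inside its own serviceStart(), once with the state set to STARTED up front and once with an intermediate STARTING state as the issue title proposes.

```java
// Hypothetical illustration of the ordering described in the issue; not Hadoop code.
class StateOrderDemo {
    enum State { INITED, STARTING, STARTED }

    private State state = State.INITED;
    private final boolean useStartingState;
    // The state observed from inside serviceStart().
    State seenInServiceStart;

    StateOrderDemo(boolean useStartingState) {
        this.useStartingState = useStartingState;
    }

    void start() {
        // false: state becomes STARTED before serviceStart() runs (the reported behaviour).
        // true: state is STARTING while serviceStart() runs, then STARTED.
        state = useStartingState ? State.STARTING : State.STARTED;
        serviceStart();
        state = State.STARTED;
    }

    State getState() {
        return state;
    }

    private void serviceStart() {
        seenInServiceStart = state;
    }
}
```

With {{useStartingState}} set to false the service sees STARTED from inside serviceStart(); with it set to true it sees STARTING and only reaches STARTED afterwards.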


