Posted to dev@slider.apache.org by anu sudarsan <an...@gmail.com> on 2016/02/08 18:37:37 UTC

Re: Slider 0.81.1 issue with YARN labels

Hi,

I see https://issues.apache.org/jira/browse/SLIDER-1051 is still open. Any
updates on when this will be fixed?

On Thu, Jan 7, 2016 at 9:49 AM, anu sudarsan <an...@gmail.com> wrote:

> Thanks Steve for fixing it.  I will give 0.91 a try at some point.
>
> As I mentioned, the issue is not there in 0.80.0 but only in 0.81.1. Did
> you mean cherry-picking the fix to the 0.81 branch?
>
> If we upgrade to 0.91, do you expect instabilities just in AA placement or
> other placement policies in general? Also upgrading to 0.91 will need a
> Hadoop upgrade (to 2.7?) too, I assume? If so, I would suggest backporting
> the fix to 0.81 branch.
>
> -Anu
>
> On Thu, Jan 7, 2016 at 8:17 AM, Steve Loughran <st...@hortonworks.com>
> wrote:
>
>>
>> Don't bother trying that. I've replicated it locally, added tests, and
>> fixed it.
>>
>> It'll be fixed in 0.91, that is, the successor to the 0.90.2 that will be
>> out today.
>>
>> One thing to consider is: do we backport this to the 0.80 branch? It's a
>> one-off change, and with the changes for AA placement still going to take
>> an iteration or so to stabilise, probably better to cherry pick it in
>> rather than say "do the big upgrade"
>>
>> What do people think?
>>
>>
>> > On 7 Jan 2016, at 11:28, Steve Loughran <st...@hortonworks.com> wrote:
>> >
>> > quick question
>> >
>> > what happens if you delete the directory under that cluster in
>> > ${user.home}/.sliders/clusters/${appname}/history (where user.home is your
>> > homedir and appname is the name of the slider application)?
>> >
>> > If it works then, what happens when you stop the application and
>> > restart it?
>> >
>> >
>> >> On 6 Jan 2016, at 15:36, anu sudarsan <an...@gmail.com> wrote:
>> >>
>> >> Hi
>> >>
>> >> I tried with HDP 2.3 and am still getting the same error. Any ideas what
>> >> might be causing this? As I said, the same appConfig and resources.json
>> >> configurations and cluster work with Slider 0.80.0.
>> >>
>> >> Relevant parameters from the resources.json
>> >>
>> >> {
>> >> "schema": "http://example.org/specification/v2.0.0",
>> >> "metadata": {
>> >> },
>> >> "global": {
>> >>   "yarn.vcores": "1"
>> >> },
>> >> "components": {
>> >>   "slider-appmaster": {
>> >>   },
>> >>   "COORDINATOR": {
>> >>     "yarn.role.priority": "1",
>> >>     "yarn.component.instances": "1",
>> >>     "yarn.memory": "256",
>> >>     "yarn.label.expression": "coord"
>> >>   }
>> >> }
>> >> }
>> >>
>> >>
>> >> On Tue, Jan 5, 2016 at 2:30 PM, anu sudarsan <an...@gmail.com> wrote:
>> >>
>> >>> Hi
>> >>>
>> >>> I am trying to use Slider 0.81.1 with the *yarn.label.expression* feature
>> >>> and am getting the following error. This is from slider-agent.log for
>> >>> slider-appmaster.
>> >>>
>> >>> ERROR appmaster.SliderAppMaster - Exception in AmExecutor-006:
>> >>> org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: relax
>> >>> location flag doesn't match container priority:
>> >>> OutstandingRequest{roleId=1, node=null, hostname='null', hasLocation=true,
>> >>> requestedTimeMillis=1452019991722, mayEscalate=false, escalated=true,
>> >>> escalationTimeoutMillis=1452020021722,
>> >>> issuedRequest=Capability[<memory:1500, vCores:1>]Priority[1073741825];
>> >>> relaxLocality=true; nodeLabels=coord; }
>> >>> org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: relax
>> >>> location flag doesn't match container priority:
>> >>> OutstandingRequest{roleId=1, node=null, hostname='null', hasLocation=true,
>> >>> requestedTimeMillis=1452019991722, mayEscalate=false, escalated=true,
>> >>> escalationTimeoutMillis=1452020021722,
>> >>> issuedRequest=Capability[<memory:1500, vCores:1>]Priority[1073741825];
>> >>> relaxLocality=true; nodeLabels=coord; }
>> >>>     at org.apache.slider.server.appmaster.state.OutstandingRequest.validate(OutstandingRequest.java:406)
>> >>>     at org.apache.slider.server.appmaster.state.OutstandingRequest.buildContainerRequest(OutstandingRequest.java:232)
>> >>>     at org.apache.slider.server.appmaster.state.RoleHistory.requestInstanceOnNode(RoleHistory.java:598)
>> >>>     at org.apache.slider.server.appmaster.state.RoleHistory.requestNode(RoleHistory.java:613)
>> >>>     at org.apache.slider.server.appmaster.state.AppState.createContainerRequest(AppState.java:1232)
>> >>>     at org.apache.slider.server.appmaster.state.AppState.buildContainerResourceAndRequest(AppState.java:1213)
>> >>>     at org.apache.slider.server.appmaster.state.AppState.reviewOneRole(AppState.java:1938)
>> >>>     at org.apache.slider.server.appmaster.state.AppState.reviewRequestAndReleaseNodes(AppState.java:1812)
>> >>>     at org.apache.slider.server.appmaster.SliderAppMaster.executeNodeReview(SliderAppMaster.java:1804)
>> >>>     at org.apache.slider.server.appmaster.SliderAppMaster.handleReviewAndFlexApplicationSize(SliderAppMaster.java:1790)
>> >>>     at org.apache.slider.server.appmaster.actions.ReviewAndFlexApplicationSize.execute(ReviewAndFlexApplicationSize.java:41)
>> >>>     at org.apache.slider.server.appmaster.actions.QueueExecutor.run(QueueExecutor.java:73)
>> >>>
>> >>> The same configuration works fine when I remove "*yarn.label.expression*"
>> >>> from the resources.json.
>> >>>
>> >>> Does Slider 0.81.1 require Hadoop 2.7? I am using HDP 2.2, and thus have
>> >>> Hadoop 2.6. I have no issues deploying the application using YARN labels
>> >>> when using Slider 0.80.0; the problem is only with Slider 0.81.1.
>> >>>
>> >>> Thanks
>> >>> -Anu
>> >>>
>> >
>> >
>>
>>
>
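
A note on the numbers in the error quoted above: the failing request reports roleId=1 together with Priority[1073741825], and 1073741825 is exactly 2^30 + 1. One possible reading, which is only an inference from the arithmetic and is not confirmed anywhere in this thread, is that the priority packs the role id into the low bits and uses bit 30 as a flag. The sketch below just performs that decoding. The class, constant, and method names are hypothetical and are not Slider's API.

```java
public class PriorityDecode {
    // Hypothetical name: bit 30 treated as a flag bit. This is an assumption
    // drawn from the value in the log, not taken from Slider's source.
    static final int FLAG_BIT = 1 << 30; // 1073741824

    // Returns the priority with the assumed flag bit cleared.
    static int lowBits(int priority) {
        return priority & ~FLAG_BIT;
    }

    // Returns true if the assumed flag bit is set in the priority.
    static boolean flagSet(int priority) {
        return (priority & FLAG_BIT) != 0;
    }

    public static void main(String[] args) {
        int priority = 1073741825; // value from Priority[...] in the log
        System.out.println("flag bit set: " + flagSet(priority)); // true
        System.out.println("low bits: " + lowBits(priority));     // 1
    }
}
```

Under this assumption the low bits come out as 1, which matches roleId=1 in the same log line. What the flag bit means, and how it relates to relaxLocality=true, would have to be checked against the Slider source and SLIDER-1051.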