Posted to dev@myriad.apache.org by "Matthew J. Loppatto" <ml...@keywcorp.com> on 2016/08/16 12:48:47 UTC

Resource manager error

Hi,

I'm setting up Myriad 0.2.0 on my Mesos cluster following this guide:
https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers

And I get the following error in the resource manager executor log in mesos
after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:

chown: cannot access '/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442': No such file or directory
env: /bin/yarn: No such file or directory

It appears the 'mesos' directory doesn't exist under /sys/fs/cgroup/cpu.
Any ideas what the issue could be?
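
One way to confirm that observation on an agent host is sketched below in Python. This is only a diagnostic sketch: the helper name `missing_cgroup_dirs` and its parameters are made up for illustration and are not part of Myriad or Mesos. The only path taken from the log above is /sys/fs/cgroup/cpu/mesos.

```python
import os
import tempfile


def missing_cgroup_dirs(cgroup_root, subsystems=("cpu",), hierarchy="mesos"):
    """Return the expected <root>/<subsystem>/<hierarchy> paths that do not exist."""
    expected = [os.path.join(cgroup_root, s, hierarchy) for s in subsystems]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    # On a real host the root would be "/sys/fs/cgroup"; a temporary
    # directory keeps this demo portable and side-effect free.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "cpu"))  # subsystem present, no 'mesos' dir
        print(missing_cgroup_dirs(root))
```

Calling `missing_cgroup_dirs("/sys/fs/cgroup")` on the machine that produced the chown error would list the directory the executor could not find.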

 

This is my yarn-site.xml:

<configuration>
<!-- Site-specific YARN configuration properties -->
   <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle,myriad_executor</value>
       <!-- If using MapR distro, please use the following value:
       <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value> -->
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
       <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
   </property>
   <property>
       <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
       <value>2000</value>
   </property>
   <property>
       <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
       <value>10000</value>
   </property>
   <property>
       <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
       <value>1000</value>
   </property>
<!-- Needed for Fine Grain Scaling -->
   <property>
       <name>yarn.scheduler.minimum-allocation-vcores</name>
       <value>0</value>
   </property>
   <property>
       <name>yarn.scheduler.minimum-allocation-mb</name>
       <value>0</value>
   </property>
<!-- Site specific YARN configuration properties -->
   <property>
       <name>yarn.nodemanager.resource.cpu-vcores</name>
       <value>${nodemanager.resource.cpu-vcores}</value>
   </property>
   <property>
       <name>yarn.nodemanager.resource.memory-mb</name>
       <value>${nodemanager.resource.memory-mb}</value>
   </property>
<!--These options enable dynamic port assignment by mesos -->
   <property>
       <name>yarn.nodemanager.address</name>
       <value>${myriad.yarn.nodemanager.address}</value>
   </property>
   <property>
       <name>yarn.nodemanager.webapp.address</name>
       <value>${myriad.yarn.nodemanager.webapp.address}</value>
   </property>
   <property>
       <name>yarn.nodemanager.webapp.https.address</name>
       <value>${myriad.yarn.nodemanager.webapp.address}</value>
   </property>
   <property>
       <name>yarn.nodemanager.localizer.address</name>
       <value>${myriad.yarn.nodemanager.localizer.address}</value>
   </property>
<!-- Configure Myriad Scheduler here -->
   <property>
       <name>yarn.resourcemanager.scheduler.class</name>
       <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
       <description>One can configure other schedulers as well from the following
       list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
       org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
   </property>
<!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
   <property>
       <name>yarn.nodemanager.pmem-check-enabled</name>
       <value>false</value>
   </property>
   <property>
       <name>yarn.nodemanager.vmem-check-enabled</name>
       <value>false</value>
   </property>
</configuration>
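
When comparing a file like this against the install guide, it can help to load it into a dictionary and check that every aux-service named in yarn.nodemanager.aux-services has a matching .class property. The Python sketch below does that with the standard library only. The function names `load_yarn_properties` and `aux_services` are illustrative, and `SAMPLE` is a trimmed copy of the properties above rather than the full file.

```python
import xml.etree.ElementTree as ET


def load_yarn_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def aux_services(props):
    """Split the comma-separated yarn.nodemanager.aux-services value."""
    raw = props.get("yarn.nodemanager.aux-services", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


# Trimmed copy of the aux-service properties from the yarn-site.xml above.
SAMPLE = """
<configuration>
   <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle,myriad_executor</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
       <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
   </property>
</configuration>
"""

if __name__ == "__main__":
    props = load_yarn_properties(SAMPLE)
    for service in aux_services(props):
        key = "yarn.nodemanager.aux-services.%s.class" % service
        print(service, "->", props.get(key, "<no class configured>"))
```

XML comments are skipped by ElementTree, so the commented-out MapR value does not affect the result.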

 

 

My myriad-config-default.yml:

 

mesosMaster: zk://myip:2181/mesos
checkpoint: false
frameworkFailoverTimeout: 43200000
frameworkName: MyriadAlpha
frameworkRole:
frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set, otherwise defaults to the user
                        # running the resource manager.
frameworkSuperUser: root  # To be deprecated, currently permissions need set by a superuser due to Mesos-1790.  Must be
                          # root or have passwordless sudo. Required if nodeManagerURI set, ignored otherwise.
nativeLibrary: /usr/local/lib/libmesos.so
zkServers: myip:2181
zkTimeout: 20000
restApiPort: 8192
servedConfigPath: dist/config.tgz
servedBinaryPath: dist/binary.tgz
profiles:
  zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
    cpu: 0
    mem: 0
  small:
    cpu: 2
    mem: 2048
  medium:
    cpu: 4
    mem: 4096
  large:
    cpu: 10
    mem: 12288
nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
  medium: 1 # <profile_name : instances>
rebalancer: false
haEnabled: false
nodemanager:
  jvmMaxMemoryMB: 1024
  cpus: 0.2
  cgroups: false
executor:
  jvmMaxMemoryMB: 256
  path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
  #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
  #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
  #configUri: http://127.0.0.1/api/arifacts/config.tgz
  #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
yarnEnvironment:
  YARN_HOME: /opt/hadoop-2.7.2
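
The comment on nmInstances says at least one NodeManager must start with a non-zero profile. The Python sketch below checks that rule against a dictionary holding the same values as the profiles and nmInstances sections above. It is not how Myriad itself validates its configuration: the function names are illustrative, and treating "non-zero" as "both cpu and mem are positive" is an assumption.

```python
def is_non_zero(profile):
    """Treat a profile as non-zero when both cpu and mem are positive (an assumption)."""
    return profile.get("cpu", 0) > 0 and profile.get("mem", 0) > 0


def check_nm_instances(config):
    """Return a list of problems found in the profiles / nmInstances sections."""
    profiles = config.get("profiles") or {}
    instances = config.get("nmInstances") or {}
    problems = []
    for name in instances:
        if name not in profiles:
            problems.append("nmInstances refers to undefined profile '%s'" % name)
    if not any(
        count > 0 and name in profiles and is_non_zero(profiles[name])
        for name, count in instances.items()
    ):
        problems.append("no NodeManager is started with a non-zero profile")
    return problems


# Values copied from the profiles and nmInstances sections of the config above.
CONFIG = {
    "profiles": {
        "zero": {"cpu": 0, "mem": 0},
        "small": {"cpu": 2, "mem": 2048},
        "medium": {"cpu": 4, "mem": 4096},
        "large": {"cpu": 10, "mem": 12288},
    },
    "nmInstances": {"medium": 1},
}

if __name__ == "__main__":
    print(check_nm_instances(CONFIG))  # prints [] because 'medium' is defined and non-zero
```

By this check the profiles and nmInstances sections posted here are consistent, which fits the replies in this thread pointing at a cgroups code path and the framework role rather than at the profiles.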

 

 

Thanks!

Matt


Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
JIRA ticket # 239


Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
Hi Guys,

Sorry, I just checked my email. Setting the role to * does indeed cause this
error. Matt -> if you set it to nothing or to whatever role you have your mesos
slaves configured as, that will fix this. My apologies for failing to log
and fix this. Darin -> I will enter a JIRA ticket for this.

To echo Darin's sentiments--thanks a bunch for checking Myriad out! :)

--John
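
For anyone reading this thread later: the stack trace quoted below ends in java.util.Random.nextInt, which throws IllegalArgumentException whenever its bound is zero or negative. A plausible reading, which nobody in this thread confirms, is that the pool of offered ports Myriad was drawing from ended up empty once the role was set to "*". The Python sketch below only illustrates that class of failure and a guarded alternative. `pick_port_unguarded` and `pick_ports` are hypothetical names, and this is not Myriad's RangeResource code.

```python
import random


def expand(ranges):
    """Flatten inclusive (begin, end) port ranges into a list of ports."""
    return [port for begin, end in ranges for port in range(begin, end + 1)]


def pick_port_unguarded(ranges):
    """Pick one random port with no emptiness check."""
    pool = expand(ranges)
    # With an empty pool this calls randrange(0), which raises ValueError,
    # much as Random.nextInt(0) raises IllegalArgumentException in Java.
    return pool[random.randrange(len(pool))]


def pick_ports(ranges, count):
    """Pick `count` distinct random ports, failing with a clear message if too few exist."""
    pool = expand(ranges)
    if len(pool) < count:
        raise ValueError("need %d ports but only %d were offered" % (count, len(pool)))
    return random.sample(pool, count)


if __name__ == "__main__":
    print(pick_ports([(31000, 31005)], 3))
    try:
        pick_port_unguarded([])
    except ValueError as err:
        print("unguarded pick failed:", err)
```

The guarded version turns an opaque error from the random number generator into a message that says what was actually missing.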

On Wed, Aug 17, 2016 at 1:25 PM, Darin Johnson <db...@gmail.com>
wrote:

> Hey Matt,
>
> Looking through the code, I think setting myriadFrameworkRole to "*" might
> be the problem.  Can you try commenting out that line in your config?  I'll
> double check this in a little while too.  If that works I'll submit a patch
> that checks that.
>
> Sorry - Myriad is still a pretty young project!  Thanks for checking it out
> though!
>
> Darin
>
> On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
> mloppatto@keywcorp.com> wrote:
>
> > Hey Darin,
> >
> > Pulling from master got rid of the errors I was seeing, however I'm
> > running into a new issue.  After starting the resource manager, I see
> > this in the logs:
> >
> > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> >
> > My Mesos cluster has the default "*" role so I tried setting
> > frameworkRole: "*" in myriad-config-default.yml, restarted the resource
> > manager and got this error:
> >
> > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > java.lang.IllegalArgumentException: n must be positive
> >     at java.util.Random.nextInt(Random.java:300)
> >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > Does Myriad require its own role in Mesos?
> >
> > Thanks,
> > Matt
> >
> >
> > -----Original Message-----
> > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > Sent: Tuesday, August 16, 2016 6:18 PM
> > To: Dev
> > Subject: Re: Resource manager error
> >
> > Hey Matthew, my coworker found the same issue recently. I fixed it in my
> > last pull request, so you can pull from master if you'd like.
> >
> > Alternatively, you could comment out the appendCgroups line in
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > <https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler>
> > and rebuild.
> >
> > Sorry that this missed my QA; unfortunately I'm always using cgroups and
> > didn't test that.  We may do a 0.2.1 release but I can't say when.
> >
> > Darin
> >
> > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" <ml...@keywcorp.com>
> > wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_
> > > confluence_display_MYRIAD_&d=CwIFaQ&c=31nHN1tvZeuWBT6LwDN4Ngk1qezfsYHy
> > > olgGeY2ZhlU&r=D2bc6ANY3sIFSxaGDnPt52V5PqDlabKIPtzNhAIfJCs&m=ibxhOZQSsK
> > > tyVi5iruY8ImkW7bQ8zOrHcuDTLL7GBwA&s=LaQad9p3ZI3Rt5cTn3kHAb58BuSD5etwIm
> > > WZHzFz6Sk&e=
> > > Installing+for+Developers
> > >
> > >
> > >
> > > And I get the following error in the resource manager executor log in
> > > mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn
> > resourcemanager`:
> > >
> > >
> > >
> > > chown: cannot access ‘/sys/fs/cgroup/cpu/mesos/
> f5d6c530-c13d-4b1d-bc30-
> > f298affb6442’:
> > > No such file or directory
> > >
> > > env: /bin/yarn: No such file or directory
> > >
> > > ory
> > >
> > >
> > >
> > > It appears the ‘mesos’ directory doesn’t exist under
> /sys/fs/cgroup/cpu.
> > > Any ideas what the issue could be?
> > >
> > >
> > >
> > > This is my yarn-site.xml:
> > >
> > >
> > >
> > > <configuration>
> > >
> > > <!-- Site-specific YARN configuration properties -->
> > >
> > >    <property>
> > >
> > >        <name>yarn.nodemanager.aux-services</name>
> > >
> > >        <value>mapreduce_shuffle,myriad_executor</value>
> > >
> > >        <!-- If using MapR distro, please use the following value:
> > >
> > >
> > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value>
> > > -->
> > >
> > >    </property>
> > >
> > >    <property>
> > >
> > >
> > > <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > >
> > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > >
> > >    </property>
> > >
> > >    <property>
> > >
> > >
> > > <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > >
> > >
> > > <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > >
> > >    </property>
> > >
> > >    <property>
> > >
> > >        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > >
> > >        <value>2000</value>
> > >
> > >    </property>
> > >
> > >    <property>
> > >
> > >        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > >
> > >        <value>10000</value>
> > >
> > >    </property>
> > >
> > >    <property>
> > >
> > >
> > > <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > >
> > >        <value>1000</value>
> > >
> > >    </property>
> > >
> > > <!-- Needed for Fine Grain Scaling -->
> > >
> > >    <property>
> > >
> > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > >
> > >        <value>0</value>
> > >
> > >    </property>
> > >
> > >    <property>
> > >
> > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > >
> > >        <value>0</value>
> > >
> > >    </property>
> > >
> > > <!-- Site specific YARN configuration properties -->
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > >
> > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > >
> > > </property>
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > >
> > >    <value>${nodemanager.resource.memory-mb}</value>
> > >
> > > </property>
> > >
> > > <!--These options enable dynamic port assignment by mesos -->
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.address</name>
> > >
> > >    <value>${myriad.yarn.nodemanager.address}</value>
> > >
> > > </property>
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.webapp.address</name>
> > >
> > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > >
> > > </property>
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.webapp.https.address</name>
> > >
> > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > >
> > > </property>
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.localizer.address</name>
> > >
> > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > >
> > > </property>
> > >
> > > <!-- Configure Myriad Scheduler here -->
> > >
> > > <property>
> > >
> > >    <name>yarn.resourcemanager.scheduler.class</name>
> > >
> > >    <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > >
> > >    <description>One can configure other scehdulers as well from
> > > following
> > > list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > >
> > > </property>
> > >
> > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > >
> > >    <value>false</value>
> > >
> > > </property>
> > >
> > > <property>
> > >
> > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > >
> > >    <value>false</value>
> > >
> > > </property>
> > >
> > > </configuration>
> > >
> > >
> > >
> > >
> > >
> > > My myriad-config-default.yml:
> > >
> > >
> > >
> > > mesosMaster: zk://myip:2181/mesos
> > >
> > > checkpoint: false
> > >
> > > frameworkFailoverTimeout: 43200000
> > >
> > > frameworkName: MyriadAlpha
> > >
> > > frameworkRole:
> > >
> > > frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set,
> > >                          # otherwise defaults to the user running the resource manager.
> > >
> > > frameworkSuperUser: root  # To be deprecated, currently permissions need to be set by a
> > >                          # superuser due to Mesos-1790.  Must be root or have passwordless
> > >                          # sudo. Required if nodeManagerURI set, ignored otherwise.
> > >
> > > nativeLibrary: /usr/local/lib/libmesos.so
> > >
> > > zkServers: myip:2181
> > >
> > > zkTimeout: 20000
> > >
> > > restApiPort: 8192
> > >
> > > servedConfigPath: dist/config.tgz
> > >
> > > servedBinaryPath: dist/binary.tgz
> > >
> > > profiles:
> > >
> > > zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
> > >
> > >    cpu: 0
> > >
> > >    mem: 0
> > >
> > > small:
> > >
> > >    cpu: 2
> > >
> > >    mem: 2048
> > >
> > > medium:
> > >
> > >    cpu: 4
> > >
> > >    mem: 4096
> > >
> > > large:
> > >
> > >    cpu: 10
> > >
> > >    mem: 12288
> > >
> > > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
> > >
> > > medium: 1 # <profile_name : instances>
> > >
> > > rebalancer: false
> > >
> > > haEnabled: false
> > >
> > > nodemanager:
> > >
> > > jvmMaxMemoryMB: 1024
> > >
> > > cpus: 0.2
> > >
> > > cgroups: false
> > >
> > > executor:
> > >
> > > jvmMaxMemoryMB: 256
> > >
> > > path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > >
> > > #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
> > >
> > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > >
> > > #configUri: http://127.0.0.1/api/arifacts/config.tgz
> > >
> > > #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > >
> > > yarnEnvironment:
> > >
> > > YARN_HOME: /opt/hadoop-2.7.2
> > >
> > >
> > >
> > >
> > >
> > > Thanks!
> > >
> > > Matt
> > >
> >
>

Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
Okay, "*" or '*' works fine, which is slightly less horrible, so I can
update the myriad-config-default.yml accordingly to put quotes around * for
frameworkRole.

--John
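
For reference, a minimal sketch of the quoted setting described above, assuming the same top-level frameworkRole key that appears in the myriad-config-default.yml quoted in this thread. It is an illustration of the quoting only, not a tested configuration.

```yaml
frameworkRole: "*"     # double quotes: the value is the one-character string *
# frameworkRole: '*'   # single quotes work the same way for this value
```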

On Wed, Aug 17, 2016 at 3:44 PM, John Yost <ho...@gmail.com> wrote:

> The * is causing an error in the YAML parsing (see YAML special-character
> issues at http://bit.ly/2b507G3). I "fixed" this by setting
> frameworkRole = /* and then updating MyriadConfiguration.getFrameworkRole()
> to strip off the /. Blech. We can do this or put in a constant like
> ANY_ROLE or something like that. Since this is a YAML thing, I guess it's
> okay to do /*. Again, just kinda yucky. Darin -> what do you think?
>
> --John
>
> On Wed, Aug 17, 2016 at 3:25 PM, Matthew J. Loppatto <
> mloppatto@keywcorp.com> wrote:
>
>> Ah that worked!  I'll let you know if I run into any more issues but it
>> looks like it's good now.  Thanks for the help!
>>
>> Matt
>>
>> -----Original Message-----
>> From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
>> Sent: Wednesday, August 17, 2016 3:18 PM
>> To: Dev
>> Subject: Re: Resource manager error
>>
>> Take a look at your myriad configuration under yarnEnvironment.  You can
>> set JAVA_HOME there, which should solve the issue. See below.
>> yarnEnvironment:
>> YARN_HOME: /usr/local/hadoop
>> #HADOOP_CONF_DIR=config
>> #HADOOP_TMP_DIR=$MESOS_SANDBOX
>> #YARN_HOME: hadoop-2.7.0 #this should be relative if nodeManagerUri is set
>> #JAVA_HOME: /usr/lib/jvm/java-default #System dependent, but sometimes necessary
>> #JAVA_HOME: jre1.7.0_76 # Path to JRE distribution, relative to sandbox directory
>> #JAVA_LIBRARY_PATH: /opt/mycompany/lib
>>
>> On Wed, Aug 17, 2016 at 3:13 PM, Matthew J. Loppatto <
>> mloppatto@keywcorp.com
>> > wrote:
>>
>> > I'm running the resource manager as the root user.  Checking a few of
>> > my nodes, JAVA_HOME is set on all of them for the root env.  Am I ok
>> > to be using openjdk1.7 or do I have to use Oracle jdk?
>> >
>> > Matt
>> >
>> > -----Original Message-----
>> > From: John Yost [mailto:hokiegeek2@gmail.com]
>> > Sent: Wednesday, August 17, 2016 3:01 PM
>> > To: dev@myriad.incubator.apache.org
>> > Subject: Re: Resource manager error
>> >
>> > Progress is nice! What user are you running myriad as? root? yarn? If
>> > it is the former and you are running via sudo, I've seen this type of
>> > error. If so, sudo to the root user and then launch. Otherwise, please
>> > run `env` if you are on a Linux box and confirm you see JAVA_HOME for
>> > the user you are launching myriad as.
>> >
>> > --John
>> >
>> > On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <
>> > mloppatto@keywcorp.com
>> > > wrote:
>> >
>> > > Hey John,
>> > >
>> > > I set up a role for myriad, restarted mesos-master, and now I'm
>> > > seeing RMs starting on the Mesos UI, but they fail with the message
>> > > "lost with exit status: 256".  The executor log says "Error: JAVA_HOME
>> > > is not set and could not be found."  $JAVA_HOME is set on all my slaves
>> > > as far as I'm aware.  Running `java -version` confirms openjdk 1.7.0_111.
>> > > Looks like it's close to a working state.  Am I missing something?
>> > >
>> > > Thanks!
>> > > Matt
>> > >
>> > > -----Original Message-----
>> > > From: John Yost [mailto:hokiegeek2@gmail.com]
>> > > Sent: Wednesday, August 17, 2016 2:38 PM
>> > > To: dev@myriad.incubator.apache.org
>> > > Subject: Re: Resource manager error
>> > >
>> > > Please uncomment frameworkRole and then add the name of whatever
>> > > Mesos role you have configured that is not *. Note: at the risk of
>> > > telling you something you already know, you define roles in
>> > > /etc/mesos-master/roles.
>> > >
>> > > In the meantime, I opened up a JIRA ticket and I'm going to fix this
>> > > ASAP, starting now! :)
>> > >
>> > > --John
>> > >
>> > > On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <
>> > > mloppatto@keywcorp.com
>> > > > wrote:
>> > >
>> > > > Hey Darin,
>> > > >
>> > > > Commenting out myriadFrameworkRole got rid of the log message
>> > > > about the missing role, but I'm still seeing the "n must be
>> > > > positive" exception.
>> > > >
>> > > > The only other thing of interest I see in the log is
>> > > > WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found
>> > > > on the classpath.  Not sure if that is causing any issue though.
>> > > >
>> > > > Matt
>> > > >
>> > > > -----Original Message-----
>> > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
>> > > > Sent: Wednesday, August 17, 2016 1:26 PM
>> > > > To: Dev
>> > > > Subject: Re: Resource manager error
>> > > >
>> > > > Hey Matt,
>> > > >
>> > > > Looking through the code, I think setting myriadFrameworkRole to "*"
>> > > > might be the problem.  Can you try commenting out that line in
>> > > > your config?  I'll double check this in a little while too.  If
>> > > > that works I'll submit a patch that checks that.
>> > > >
>> > > > Sorry - Myriad is still a pretty young project!  Thanks for
>> > > > checking it out though!
>> > > >
>> > > > Darin
>> > > >
>> > > > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
>> > > > mloppatto@keywcorp.com> wrote:
>> > > >
>> > > > > Hey Darin,
>> > > > >
>> > > > > Pulling from master got rid of the errors I was seeing, however
>> > > > > I'm running into a new issue.  After starting the resource
>> > > > > manager, I see this in the logs:
>> > > > >
>> > > > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
>> > > > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
>> > > > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
>> > > > >
>> > > > > My Mesos cluster has the default "*" role so I tried setting
>> > > > > frameworkRole: "*" in myriad-config-default.yml, restarted the
>> > > > > resource manager and got this error:
>> > > > >
>> > > > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
>> > > > > java.lang.IllegalArgumentException: n must be positive
>> > > > >     at java.util.Random.nextInt(Random.java:300)
>> > > > >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
>> > > > >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
>> > > > >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
>> > > > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
>> > > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
>> > > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
>> > > > >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>> > > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> > > > >     at java.lang.Thread.run(Thread.java:745)
>> > > > >
>> > > > > Does Myriad require its own role in Mesos?
>> > > > >
>> > > > > Thanks,
>> > > > > Matt
>> > > > >
>> > > > >
>> > > > > -----Original Message-----
>> > > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
>> > > > > Sent: Tuesday, August 16, 2016 6:18 PM
>> > > > > To: Dev
>> > > > > Subject: Re: Resource manager error
>> > > > >
>> > > > > Hey Matthew, my coworker found the same issue recently. I fixed it
>> > > > > in my last pull request, if you'd like to pull from master.
>> > > > >
>> > > > > Alternatively, you could comment out the appendCgroups line in
>> > > > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
>> > > > > <https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler>
>> > > > > and rebuild.
>> > > > >
>> > > > > Sorry that missed my QA; unfortunately I'm always using cgroups
>> > > > > and didn't test that.  We may do a 0.2.1 release but I can't say
>> > > > > when.
>> > > > >
>> > > > > Darin
>> > > > >
>> > > > > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto"
>> > > > > <ml...@keywcorp.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
>> > > > > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > And I get the following error in the resource manager executor
>> > > > > > log in mesos after starting it with
>> > > > > > `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > chown: cannot access
>> > > > > > ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
>> > > > > > No such file or directory
>> > > > > >
>> > > > > > env: /bin/yarn: No such file or directory
>> > > > > >
>> > > > > > ory
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > It appears the ‘mesos’ directory doesn’t exist under
>> > > > > > /sys/fs/cgroup/cpu.  Any ideas what the issue could be?
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > This is my yarn-site.xml:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > <configuration>
>> > > > > >
>> > > > > > <!-- Site-specific YARN configuration properties -->
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >        <name>yarn.nodemanager.aux-services</name>
>> > > > > >
>> > > > > >        <value>mapreduce_shuffle,myriad_executor</value>
>> > > > > >
>> > > > > >        <!-- If using MapR distro, please use the following value:
>> > > > > >
>> > > > > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value> -->
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >
>> > > > > >        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>> > > > > >
>> > > > > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >
>> > > > > >        <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
>> > > > > >
>> > > > > >        <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >
>> > > > > > <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
>> > > > > >
>> > > > > >        <value>2000</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >
>> > > > > > <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
>> > > > > >
>> > > > > >        <value>10000</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >
>> > > > > >        <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
>> > > > > >
>> > > > > >        <value>1000</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > > <!-- Needed for Fine Grain Scaling -->
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
>> > > > > >
>> > > > > >        <value>0</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > >    <property>
>> > > > > >
>> > > > > >        <name>yarn.scheduler.minimum-allocation-mb</name>
>> > > > > >
>> > > > > >        <value>0</value>
>> > > > > >
>> > > > > >    </property>
>> > > > > >
>> > > > > > <!-- Site specific YARN configuration properties -->
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
>> > > > > >
>> > > > > >    <value>${nodemanager.resource.cpu-vcores}</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.resource.memory-mb</name>
>> > > > > >
>> > > > > >    <value>${nodemanager.resource.memory-mb}</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <!--These options enable dynamic port assignment by mesos -->
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.address</name>
>> > > > > >
>> > > > > >    <value>${myriad.yarn.nodemanager.address}</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.webapp.address</name>
>> > > > > >
>> > > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.webapp.https.address</name>
>> > > > > >
>> > > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.localizer.address</name>
>> > > > > >
>> > > > > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <!-- Configure Myriad Scheduler here -->
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.resourcemanager.scheduler.class</name>
>> > > > > >
>> > > > > >
>> > > > > >    <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
>> > > > > >
>> > > > > >    <description>One can configure other schedulers as well from the
>> > > > > > following list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
>> > > > > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.pmem-check-enabled</name>
>> > > > > >
>> > > > > >    <value>false</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > <property>
>> > > > > >
>> > > > > >    <name>yarn.nodemanager.vmem-check-enabled</name>
>> > > > > >
>> > > > > >    <value>false</value>
>> > > > > >
>> > > > > > </property>
>> > > > > >
>> > > > > > </configuration>
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > My myriad-config-default.yml:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > mesosMaster: zk://myip:2181/mesos
>> > > > > >
>> > > > > > checkpoint: false
>> > > > > >
>> > > > > > frameworkFailoverTimeout: 43200000
>> > > > > >
>> > > > > > frameworkName: MyriadAlpha
>> > > > > >
>> > > > > > frameworkRole:
>> > > > > >
>> > > > > > frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set,
>> > > > > >                          # otherwise defaults to the user running the resource manager.
>> > > > > >
>> > > > > > frameworkSuperUser: root  # To be deprecated, currently permissions need to be set by a
>> > > > > >                          # superuser due to Mesos-1790.  Must be root or have passwordless
>> > > > > >                          # sudo. Required if nodeManagerURI set, ignored otherwise.
>> > > > > >
>> > > > > > nativeLibrary: /usr/local/lib/libmesos.so
>> > > > > >
>> > > > > > zkServers: myip:2181
>> > > > > >
>> > > > > > zkTimeout: 20000
>> > > > > >
>> > > > > > restApiPort: 8192
>> > > > > >
>> > > > > > servedConfigPath: dist/config.tgz
>> > > > > >
>> > > > > > servedBinaryPath: dist/binary.tgz
>> > > > > >
>> > > > > > profiles:
>> > > > > >
>> > > > > > zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
>> > > > > >
>> > > > > >    cpu: 0
>> > > > > >
>> > > > > >    mem: 0
>> > > > > >
>> > > > > > small:
>> > > > > >
>> > > > > >    cpu: 2
>> > > > > >
>> > > > > >    mem: 2048
>> > > > > >
>> > > > > > medium:
>> > > > > >
>> > > > > >    cpu: 4
>> > > > > >
>> > > > > >    mem: 4096
>> > > > > >
>> > > > > > large:
>> > > > > >
>> > > > > >    cpu: 10
>> > > > > >
>> > > > > >    mem: 12288
>> > > > > >
>> > > > > > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
>> > > > > >
>> > > > > > medium: 1 # <profile_name : instances>
>> > > > > >
>> > > > > > rebalancer: false
>> > > > > >
>> > > > > > haEnabled: false
>> > > > > >
>> > > > > > nodemanager:
>> > > > > >
>> > > > > > jvmMaxMemoryMB: 1024
>> > > > > >
>> > > > > > cpus: 0.2
>> > > > > >
>> > > > > > cgroups: false
>> > > > > >
>> > > > > > executor:
>> > > > > >
>> > > > > > jvmMaxMemoryMB: 256
>> > > > > >
>> > > > > > path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
>> > > > > >
>> > > > > > #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
>> > > > > >
>> > > > > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
>> > > > > >
>> > > > > > #configUri: http://127.0.0.1/api/arifacts/config.tgz
>> > > > > >
>> > > > > > #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
>> > > > > >
>> > > > > > yarnEnvironment:
>> > > > > >
>> > > > > > YARN_HOME: /opt/hadoop-2.7.2
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Thanks!
>> > > > > >
>> > > > > > Matt
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
The * is causing an error in the YAML parsing (see YAML special-character
issues at http://bit.ly/2b507G3). I "fixed" this by setting frameworkRole =
/* and then updating MyriadConfiguration.getFrameworkRole() to strip off
the /. Blech. We can do this or put in a constant like ANY_ROLE or
something like that. Since this is a YAML thing, I guess it's okay to do
/*. Again, just kinda yucky. Darin -> what do you think?

--John
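
Some background on why a bare asterisk is special in YAML: `&name` defines an anchor and `*name` refers back to it, so a value that begins with `*` is read as an alias rather than as text. The fragment below is only an illustrative sketch, not Myriad configuration; the keys `defaults` and `medium` and the anchor name `size` are made up for the example. Its last line shows the workaround described in this message, where the leading `/` keeps the value a plain string that getFrameworkRole() would then strip.

```yaml
defaults: &size        # '&size' anchors this mapping (hypothetical key and anchor name)
  cpu: 4
  mem: 4096
medium: *size          # '*size' is an alias that reuses the anchored mapping
# frameworkRole: *     # a lone '*' is an alias with no name, so it is not a valid value
frameworkRole: /*      # workaround from this message: parsed as the plain string /*
```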

On Wed, Aug 17, 2016 at 3:25 PM, Matthew J. Loppatto <mloppatto@keywcorp.com
> wrote:

> Ah that worked!  I'll let you know if I run into any more issues but it
> looks like it's good now.  Thanks for the help!
>
> Matt
>
> -----Original Message-----
> From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> Sent: Wednesday, August 17, 2016 3:18 PM
> To: Dev
> Subject: Re: Resource manager error
>
> Take a look at your myriad configuration under yarnEnvironment.  You can
> set JAVA_HOME there, which should solve the issue. See below.
> yarnEnvironment:
> YARN_HOME: /usr/local/hadoop
> #HADOOP_CONF_DIR=config
> #HADOOP_TMP_DIR=$MESOS_SANDBOX
> #YARN_HOME: hadoop-2.7.0 #this should be relative if nodeManagerUri is set
> #JAVA_HOME: /usr/lib/jvm/java-default #System dependent, but sometimes necessary
> #JAVA_HOME: jre1.7.0_76 # Path to JRE distribution, relative to sandbox directory
> #JAVA_LIBRARY_PATH: /opt/mycompany/lib
>
> On Wed, Aug 17, 2016 at 3:13 PM, Matthew J. Loppatto <
> mloppatto@keywcorp.com
> > wrote:
>
> > I'm running the resource manager as the root user.  Checking a few of
> > my nodes, JAVA_HOME is set on all of them for the root env.  Am I ok
> > to be using openjdk1.7 or do I have to use Oracle jdk?
> >
> > Matt
> >
> > -----Original Message-----
> > From: John Yost [mailto:hokiegeek2@gmail.com]
> > Sent: Wednesday, August 17, 2016 3:01 PM
> > To: dev@myriad.incubator.apache.org
> > Subject: Re: Resource manager error
> >
> > Progress is nice! What user are you running myriad as? root? yarn? If
> > it is the former and you are running via sudo, I've seen this type of
> > error. If so, sudo to the root user and then launch. Otherwise, please
> > run `env` if you are on a Linux box and confirm you see JAVA_HOME for
> > the user you are launching myriad as.
> >
> > --John
> >
> > On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <
> > mloppatto@keywcorp.com
> > > wrote:
> >
> > > Hey John,
> > >
> > > I set up a role for myriad, restarted mesos-master, and now I'm
> > > seeing RMs starting on the Mesos UI, but they fail with the message
> > > "lost with exit status: 256".  The executor log says "Error: JAVA_HOME
> > > is not set and could not be found."  $JAVA_HOME is set on all my slaves
> > > as far as I'm aware.  Running `java -version` confirms openjdk 1.7.0_111.
> > > Looks like it's close to a working state.  Am I missing something?
> > >
> > > Thanks!
> > > Matt
> > >
> > > -----Original Message-----
> > > From: John Yost [mailto:hokiegeek2@gmail.com]
> > > Sent: Wednesday, August 17, 2016 2:38 PM
> > > To: dev@myriad.incubator.apache.org
> > > Subject: Re: Resource manager error
> > >
> > > Please uncomment frameworkRole and then add the name of whatever
> > > Mesos role you have configured that is not *. Note: at the risk of
> > > telling you something you already know, you define roles in
> > > /etc/mesos-master/roles.
> > >
> > > In the meantime, I opened up a JIRA ticket and I'm going to fix this
> > > ASAP, starting now! :)
> > >
> > > --John
> > >
> > > On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <
> > > mloppatto@keywcorp.com
> > > > wrote:
> > >
> > > > Hey Darin,
> > > >
> > > > > Commenting out myriadFrameworkRole got rid of the log message
> > > > > about the missing role, but I'm still seeing the "n must be
> > > > > positive" exception.
> > > > >
> > > > > The only other thing of interest I see in the log is
> > > > > WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found
> > > > > on the classpath.  Not sure if that is causing any issue though.
> > > >
> > > > Matt
> > > >
> > > > -----Original Message-----
> > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > > Sent: Wednesday, August 17, 2016 1:26 PM
> > > > To: Dev
> > > > Subject: Re: Resource manager error
> > > >
> > > > Hey Matt,
> > > >
> > > > Looking through the code, I think setting myriadFrameworkRole to "*"
> > > > might be the problem.  Can you try commenting out that line in
> > > > your config?  I'll double check this in a little while too.  If
> > > > that works I'll submit a patch that checks that.
> > > >
> > > > Sorry - Myriad is still a pretty young project!  Thanks for
> > > > checking it out though!
> > > >
> > > > Darin
> > > >
> > > > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
> > > > mloppatto@keywcorp.com> wrote:
> > > >
> > > > > Hey Darin,
> > > > >
> > > > > Pulling from master got rid of the errors I was seeing, however
> > > > > I'm running into a new issue.  After starting the resource
> > > > > manager, I see this in the logs:
> > > > >
> > > > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1
> > > > > NM(s) with profile medium
> > > > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.
> > > > MyriadOperations:
> > > > > Adding 1 NM instances to cluster
> > > > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.
> > > > event.handlers.ErrorEventHandler:
> > > > > Role '' is not present in the master's --roles
> > > > >
> > > > > My Mesos cluster has the default "*" role so I tried setting
> > > > > frameworkRole: "*" in myriad-config-default.yml, restarted the
> > > > > resource manager and got this error:
> > > > >
> > > > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.
> > > > event.handlers.ResourceOffersEventHandler:
> > > > > Exception thrown while trying to create a task for nm
> > > > > java.lang.IllegalArgumentException: n must be positive
> > > > >     at java.util.Random.nextInt(Random.java:300)
> > > > >     at org.apache.myriad.scheduler.resource.RangeResource.
> > > > > getRandomValues(RangeResource.java:128)
> > > > >     at org.apache.myriad.scheduler.resource.RangeResource.
> > > > > consumeResource(RangeResource.java:99)
> > > > >     at org.apache.myriad.scheduler.resource.
> ResourceOfferContainer.
> > > > > consumePorts(ResourceOfferContainer.java:171)
> > > > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(
> > > > > NMTaskFactory.java:45)
> > > > >     at org.apache.myriad.scheduler.event.handlers.
> > > > > ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.
> > > java:119)
> > > > >     at org.apache.myriad.scheduler.event.handlers.
> > > > > ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.
> > java:49)
> > > > >     at com.lmax.disruptor.BatchEventProcessor.run(
> > > > > BatchEventProcessor.java:128)
> > > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > > ThreadPoolExecutor.java:1145)
> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > > ThreadPoolExecutor.java:615)
> > > > >     at java.lang.Thread.run(Thread.java:745)
> > > > >
> > > > > Does Myriad require its own role in Mesos?
> > > > >
> > > > > Thanks,
> > > > > Matt
> > > > >
> > > > >
> > > > > -----Original Message-----
> > > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > > > Sent: Tuesday, August 16, 2016 6:18 PM
> > > > > To: Dev
> > > > > Subject: Re: Resource manager error
> > > > >
> > > > > Hey Mathew, my coworker found the same issue recently, I fixed
> > > > > it on my last pull request, if you'd like to pull from master.
> > > > >
> > > > > > Alternatively, you could comment out the appendCgroups line in
> > > > > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > > > > > (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler)
> > > > > > and rebuild.
> > > > >
> > > > > > Sorry that missed my QA; unfortunately, I'm always using cgroups
> > > > > > and didn't test that.  We may do a 0.2.1 release, but I can't say
> > > > > > when.
> > > > >
> > > > > Darin
> > > > >
> > > > > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto"
> > > > > <ml...@keywcorp.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > >
> > > > > >
> > > > > > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > > > > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > > > > >
> > > > > >
> > > > > >
> > > > > > > And I get the following error in the resource manager executor
> > > > > > > log in mesos after starting it with
> > > > > > > `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > > > > >
> > > > > >
> > > > > >
> > > > > > > chown: cannot access
> > > > > > > ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> > > > > > > No such file or directory
> > > > > >
> > > > > > env: /bin/yarn: No such file or directory
> > > > > >
> > > > > > ory
> > > > > >
> > > > > >
> > > > > >
> > > > > > > It appears the ‘mesos’ directory doesn’t exist under
> > > > > > > /sys/fs/cgroup/cpu.
> > > > > > Any ideas what the issue could be?
> > > > > >
> > > > > >
> > > > > >
> > > > > > This is my yarn-site.xml:
> > > > > >
> > > > > >
> > > > > >
> > > > > > <configuration>
> > > > > >
> > > > > > <!-- Site-specific YARN configuration properties -->
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >        <name>yarn.nodemanager.aux-services</name>
> > > > > >
> > > > > >        <value>mapreduce_shuffle,myriad_executor</value>
> > > > > >
> > > > > >        <!-- If using MapR distro, please use the following value:
> > > > > >
> > > > > >
> > > > > > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value> -->
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >
> > > > > > >        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > > > > >
> > > > > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >
> > > > > > >        <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > > > > > >
> > > > > > >        <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >
> > > > > > <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > > > > >
> > > > > >        <value>2000</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >
> > > > > > <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > > > > >
> > > > > >        <value>10000</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >
> > > > > > >        <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > > > > >
> > > > > >        <value>1000</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > > <!-- Needed for Fine Grain Scaling -->
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > > > > >
> > > > > >        <value>0</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > >    <property>
> > > > > >
> > > > > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > > > > >
> > > > > >        <value>0</value>
> > > > > >
> > > > > >    </property>
> > > > > >
> > > > > > <!-- Site specific YARN configuration properties -->
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > > > > >
> > > > > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > > > > >
> > > > > >    <value>${nodemanager.resource.memory-mb}</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <!--These options enable dynamic port assignment by mesos -->
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.address</name>
> > > > > >
> > > > > >    <value>${myriad.yarn.nodemanager.address}</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.webapp.address</name>
> > > > > >
> > > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.webapp.https.address</name>
> > > > > >
> > > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.localizer.address</name>
> > > > > >
> > > > > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <!-- Configure Myriad Scheduler here -->
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.resourcemanager.scheduler.class</name>
> > > > > >
> > > > > >
> > > > > > >    <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > > > > > >
> > > > > > >    <description>One can configure other schedulers as well from the following list:
> > > > > > > org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > > > > > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > > > > >
> > > > > >    <value>false</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > <property>
> > > > > >
> > > > > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > > > > >
> > > > > >    <value>false</value>
> > > > > >
> > > > > > </property>
> > > > > >
> > > > > > </configuration>
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > My myriad-config-default.yml:
> > > > > >
> > > > > >
> > > > > >
> > > > > > mesosMaster: zk://myip:2181/mesos
> > > > > >
> > > > > > checkpoint: false
> > > > > >
> > > > > > frameworkFailoverTimeout: 43200000
> > > > > >
> > > > > > frameworkName: MyriadAlpha
> > > > > >
> > > > > > frameworkRole:
> > > > > >
> > > > > > > frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set,
> > > > > > >                          # otherwise defaults to the user running the resource manager.
> > > > > > >
> > > > > > > frameworkSuperUser: root  # To be deprecated, currently permissions need set by a superuser
> > > > > > >                          # due to Mesos-1790.  Must be root or have passwordless sudo.
> > > > > > >                          # Required if nodeManagerURI set, ignored otherwise.
> > > > > >
> > > > > > nativeLibrary: /usr/local/lib/libmesos.so
> > > > > >
> > > > > > zkServers: myip:2181
> > > > > >
> > > > > > zkTimeout: 20000
> > > > > >
> > > > > > restApiPort: 8192
> > > > > >
> > > > > > servedConfigPath: dist/config.tgz
> > > > > >
> > > > > > servedBinaryPath: dist/binary.tgz
> > > > > >
> > > > > > profiles:
> > > > > >
> > > > > > zero:  # NMs launched with this profile dynamically obtain
> > > > > > cpu/mem from Mesos
> > > > > >
> > > > > >    cpu: 0
> > > > > >
> > > > > >    mem: 0
> > > > > >
> > > > > > small:
> > > > > >
> > > > > >    cpu: 2
> > > > > >
> > > > > >    mem: 2048
> > > > > >
> > > > > > medium:
> > > > > >
> > > > > >    cpu: 4
> > > > > >
> > > > > >    mem: 4096
> > > > > >
> > > > > > large:
> > > > > >
> > > > > >    cpu: 10
> > > > > >
> > > > > >    mem: 12288
> > > > > >
> > > > > > nmInstances: # NMs to start with. Requires at least 1 NM with
> > > > > > a non-zero profile.
> > > > > >
> > > > > > medium: 1 # <profile_name : instances>
> > > > > >
> > > > > > rebalancer: false
> > > > > >
> > > > > > haEnabled: false
> > > > > >
> > > > > > nodemanager:
> > > > > >
> > > > > > jvmMaxMemoryMB: 1024
> > > > > >
> > > > > > cpus: 0.2
> > > > > >
> > > > > > cgroups: false
> > > > > >
> > > > > > executor:
> > > > > >
> > > > > > jvmMaxMemoryMB: 256
> > > > > >
> > > > > > > path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > > > > >
> > > > > > #The following should be used for a remotely distributed URI,
> > > > > > hdfs assumed but other URI types valid.
> > > > > >
> > > > > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > > > > >
> > > > > > > #configUri: http://127.0.0.1/api/artifacts/config.tgz
> > > > > > >
> > > > > > > #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > > > > >
> > > > > > yarnEnvironment:
> > > > > >
> > > > > > YARN_HOME: /opt/hadoop-2.7.2
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Matt
> > > > > >
> > > > >
> > > >
> > >
> >
>

RE: Resource manager error

Posted by "Matthew J. Loppatto" <ml...@keywcorp.com>.
Ah, that worked!  I'll let you know if I run into any more issues, but it looks like it's good now.  Thanks for the help!

Matt

-----Original Message-----
From: Darin Johnson [mailto:dbjohnson1978@gmail.com] 
Sent: Wednesday, August 17, 2016 3:18 PM
To: Dev
Subject: Re: Resource manager error

Take a look at your myriad configuration under yarnEnvironment.  You can set JAVA_HOME there, which should solve the issue. See below.
yarnEnvironment:
  YARN_HOME: /usr/local/hadoop
  #HADOOP_CONF_DIR=config
  #HADOOP_TMP_DIR=$MESOS_SANDBOX
  #YARN_HOME: hadoop-2.7.0 #this should be relative if nodeManagerUri is set
  #JAVA_HOME: /usr/lib/jvm/java-default #System dependent, but sometimes necessary
  #JAVA_HOME: jre1.7.0_76 # Path to JRE distribution, relative to sandbox directory
  #JAVA_LIBRARY_PATH: /opt/mycompany/lib
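
As a minimal sketch of that suggestion, the block below combines the YARN_HOME from the myriad-config-default.yml posted earlier in this thread with an uncommented JAVA_HOME. The JAVA_HOME path is an assumption copied from the sample comment above; it is system dependent and would need to point at the JDK actually installed on each Mesos agent.

```yaml
yarnEnvironment:
  YARN_HOME: /opt/hadoop-2.7.2             # value from the config posted earlier in this thread
  JAVA_HOME: /usr/lib/jvm/java-default     # assumed path; replace with the local JDK location
```

Setting it here probably matters because the executor may not inherit the login environment of the user who started the resource manager, which would explain JAVA_HOME showing up in `env` but not in the executor log.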

On Wed, Aug 17, 2016 at 3:13 PM, Matthew J. Loppatto <mloppatto@keywcorp.com
> wrote:

> I'm running the resource manager as the root user.  Checking a few of 
> my nodes, JAVA_HOME is set on all of them for the root env.  Am I ok 
> to be using openjdk1.7 or do I have to use Oracle jdk?
>
> Matt
>
> -----Original Message-----
> From: John Yost [mailto:hokiegeek2@gmail.com]
> Sent: Wednesday, August 17, 2016 3:01 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Resource manager error
>
> Progress is nice! What user are you running myriad as? root? yarn? If 
> it is the former and you are running via sudo, I've seen this type of error.
> If so, sudo to the root user and then launch. Otherwise, please type 
> in env if you are on linux box and confirm you see JAVA_HOME for the 
> user you are launching myriad as.
>
> --John
>
> On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto < 
> mloppatto@keywcorp.com
> > wrote:
>
> > Hey John,
> >
> > I set up a role for myriad, restarted mesos-master, and now I'm
> > seeing RMs starting on the Mesos UI, but they fail with the message
> > "lost with exit status: 256".  The executor log says "Error:
> > JAVA_HOME is not set and could not be found."  $JAVA_HOME is set on
> > all my slaves as far as I'm aware.  Running `java -version` confirms
> > openjdk 1.7.0_111.  Looks like it's close to a working state.  Am I
> > missing something?
> >
> > Thanks!
> > Matt
> >
> > -----Original Message-----
> > From: John Yost [mailto:hokiegeek2@gmail.com]
> > Sent: Wednesday, August 17, 2016 2:38 PM
> > To: dev@myriad.incubator.apache.org
> > Subject: Re: Resource manager error
> >
> > Please uncomment frameworkRole and then add the name of whatever
> > Mesos role you have configured that is not *. Note: at the risk of
> > telling you something you already know, you define roles in
> > /etc/mesos-master/roles.
> >
> > In the meantime, I opened up a JIRA ticket and gonna fix this ASAP 
> > starting now! :)
> >
> > --John
> >
> > On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto < 
> > mloppatto@keywcorp.com
> > > wrote:
> >
> > > Hey Darin,
> > >
> > > Commenting out myriadFrameworkRole got rid of the log message
> > > about the missing role, but I'm still seeing the "n must be
> > > positive" exception.
> > >
> > > The only other thing of interest I see in the log is WARN
> > > fair.AllocationFileLoaderService: fair-scheduler.xml not found on
> > > the classpath.  Not sure if that is causing any issue though.
> > >
> > > Matt
> > >
> > > -----Original Message-----
> > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > Sent: Wednesday, August 17, 2016 1:26 PM
> > > To: Dev
> > > Subject: Re: Resource manager error
> > >
> > > Hey Matt,
> > >
> > > Looking through the code, I think setting myriadFrameworkRole to "*"
> > > might be the problem.  Can you try commenting out that line in 
> > > your config?  I'll double check this in a little while too.  If 
> > > that works I'll submit a patch that checks that.
> > >
> > > Sorry - Myriad is still a pretty young project!  Thanks for 
> > > checking it out though!
> > >
> > > Darin
> > >
> > > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto < 
> > > mloppatto@keywcorp.com> wrote:
> > >
> > > > Hey Darin,
> > > >
> > > > Pulling from master got rid of the errors I was seeing, however 
> > > > I'm running into a new issue.  After starting the resource 
> > > > manager, I see this in the logs:
> > > >
> > > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> > > >
> > > > My Mesos cluster has the default "*" role so I tried setting
> > > > frameworkRole: "*" in myriad-config-default.yml, restarted the 
> > > > resource manager and got this error:
> > > >
> > > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler:
> > > > Exception thrown while trying to create a task for nm
> > > > java.lang.IllegalArgumentException: n must be positive
> > > >     at java.util.Random.nextInt(Random.java:300)
> > > >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> > > >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> > > >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> > > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> > > >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >     at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > Does Myriad require its own role in Mesos?
> > > >
> > > > Thanks,
> > > > Matt
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > > Sent: Tuesday, August 16, 2016 6:18 PM
> > > > To: Dev
> > > > Subject: Re: Resource manager error
> > > >
> > > > Hey Mathew, my coworker found the same issue recently, I fixed 
> > > > it on my last pull request, if you'd like to pull from master.
> > > >
> > > > Alternatively, you could comment out the appendCgroups line in
> > > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > > > (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler)
> > > > and rebuild.
> > > >
> > > > Sorry that missed my QA; unfortunately, I'm always using cgroups
> > > > and didn't test that.  We may do a 0.2.1 release, but I can't say when.
> > > >
> > > > Darin
> > > >
> > > > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto"
> > > > <ml...@keywcorp.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > >
> > > > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > > > >
> > > > >
> > > > >
> > > > > And I get the following error in the resource manager executor
> > > > > log in mesos after starting it with
> > > > > `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > > > >
> > > > >
> > > > >
> > > > > chown: cannot access
> > > > > ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-
> > > > f298affb6442’:
> > > > > No such file or directory
> > > > >
> > > > > env: /bin/yarn: No such file or directory
> > > > >
> > > > > ory
> > > > >
> > > > >
> > > > >
> > > > > It appears the ‘mesos’ directory doesn’t exist under
> > > /sys/fs/cgroup/cpu.
> > > > > Any ideas what the issue could be?
> > > > >
> > > > >
> > > > >
> > > > > This is my yarn-site.xml:
> > > > >
> > > > >
> > > > >
> > > > > <configuration>
> > > > >
> > > > > <!-- Site-specific YARN configuration properties -->
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.nodemanager.aux-services</name>
> > > > >
> > > > >        <value>mapreduce_shuffle,myriad_executor</value>
> > > > >
> > > > >        <!-- If using MapR distro, please use the following value:
> > > > >
> > > > >
> > > > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</
> > > > > va
> > > > > lu
> > > > > e>
> > > > > -->
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >
> > > > > <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</n
> > > > > am
> > > > > e>
> > > > >
> > > > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >
> > > > > <name>yarn.nodemanager.aux-services.myriad_executor.class</nam
> > > > > e>
> > > > >
> > > > >
> > > > > <value>org.apache.myriad.executor.MyriadExecutorAuxService</va
> > > > > lu
> > > > > e>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >        
> > > > > <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > > > >
> > > > >        <value>2000</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >        
> > > > > <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > > > >
> > > > >        <value>10000</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >
> > > > > <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</na
> > > > > me
> > > > > >
> > > > >
> > > > >        <value>1000</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > > <!-- Needed for Fine Grain Scaling -->
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > > > >
> > > > >        <value>0</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > > > >
> > > > >        <value>0</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > > <!-- Site specific YARN configuration properties -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > > > >
> > > > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > > > >
> > > > >    <value>${nodemanager.resource.memory-mb}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <!--These options enable dynamic port assignment by mesos -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.webapp.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.webapp.https.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.localizer.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <!-- Configure Myriad Scheduler here -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.resourcemanager.scheduler.class</name>
> > > > >
> > > > >
> > > > > <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</v
> > > > > al
> > > > > ue
> > > > > >
> > > > >
> > > > >    <description>One can configure other scehdulers as well 
> > > > > from following
> > > > > list: 
> > > > > org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > > > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</descript
> > > > > io
> > > > > n>
> > > > >
> > > > > </property>
> > > > >
> > > > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > > > >
> > > > >    <value>false</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > > > >
> > > > >    <value>false</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > </configuration>
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > My myriad-config-default.yml:
> > > > >
> > > > >
> > > > >
> > > > > mesosMaster: zk://myip:2181/mesos
> > > > >
> > > > > checkpoint: false
> > > > >
> > > > > frameworkFailoverTimeout: 43200000
> > > > >
> > > > > frameworkName: MyriadAlpha
> > > > >
> > > > > frameworkRole:
> > > > >
> > > > > frameworkUser: root     # User the Node Manager runs as, required
> if
> > > > > nodeManagerURI set, otherwise defaults to the user
> > > > >
> > > > >                          # running the resource manager.
> > > > >
> > > > > frameworkSuperUser: root  # To be depricated, currently 
> > > > > permissions need set by a superuser due to Mesos-1790.  Must 
> > > > > be
> > > > >
> > > > >                          # root or have passwordless sudo.
> > > > > Required if nodeManagerURI set, ignored otherwise.
> > > > >
> > > > > nativeLibrary: /usr/local/lib/libmesos.so
> > > > >
> > > > > zkServers: myip:2181
> > > > >
> > > > > zkTimeout: 20000
> > > > >
> > > > > restApiPort: 8192
> > > > >
> > > > > servedConfigPath: dist/config.tgz
> > > > >
> > > > > servedBinaryPath: dist/binary.tgz
> > > > >
> > > > > profiles:
> > > > >
> > > > > zero:  # NMs launched with this profile dynamically obtain 
> > > > > cpu/mem from Mesos
> > > > >
> > > > >    cpu: 0
> > > > >
> > > > >    mem: 0
> > > > >
> > > > > small:
> > > > >
> > > > >    cpu: 2
> > > > >
> > > > >    mem: 2048
> > > > >
> > > > > medium:
> > > > >
> > > > >    cpu: 4
> > > > >
> > > > >    mem: 4096
> > > > >
> > > > > large:
> > > > >
> > > > >    cpu: 10
> > > > >
> > > > >    mem: 12288
> > > > >
> > > > > nmInstances: # NMs to start with. Requires at least 1 NM with 
> > > > > a non-zero profile.
> > > > >
> > > > > medium: 1 # <profile_name : instances>
> > > > >
> > > > > rebalancer: false
> > > > >
> > > > > haEnabled: false
> > > > >
> > > > > nodemanager:
> > > > >
> > > > > jvmMaxMemoryMB: 1024
> > > > >
> > > > > cpus: 0.2
> > > > >
> > > > > cgroups: false
> > > > >
> > > > > executor:
> > > > >
> > > > > jvmMaxMemoryMB: 256
> > > > >
> > > > > path:
> > > > > file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0
> > > > > .j
> > > > > ar
> > > > >
> > > > > #The following should be used for a remotely distributed URI, 
> > > > > hdfs assumed but other URI types valid.
> > > > >
> > > > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > > > >
> > > > > #configUri:
> > > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__127.0.0.1_
> > > > > ap
> > > > > i_
> > > > > ar
> > > > > if
> > > > > acts_config.tgz&d=CwIFaQ&c=31nHN1tvZeuWBT6LwDN4Ngk1qezfsYHyolg
> > > > > Ge
> > > > > Y2
> > > > > Zh
> > > > > lU
> > > > > &r=D2bc6ANY3sIFSxaGDnPt52V5PqDlabKIPtzNhAIfJCs&m=ibxhOZQSsKtyV
> > > > > i5
> > > > > ir
> > > > > uY
> > > > > 8I
> > > > > mkW7bQ8zOrHcuDTLL7GBwA&s=IpOqhUOtwJsdorbAOeoY7GgHalMJ1s9EUjuRU
> > > > > fR
> > > > > sm
> > > > > ew
> > > > > &e
> > > > > =
> > > > >
> > > > > #jvmUri:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__downloads
> > > > > .m
> > > > > yc
> > > > > om
> > > > > pa
> > > > > ny.com_java_jre-2D7u76-2Dlinux-2Dx64.tar.gz&d=CwIFaQ&c=31nHN1t
> > > > > vZ
> > > > > eu
> > > > > WB
> > > > > T6
> > > > > LwDN4Ngk1qezfsYHyolgGeY2ZhlU&r=D2bc6ANY3sIFSxaGDnPt52V5PqDlabK
> > > > > IP
> > > > > tz
> > > > > Nh
> > > > > AI
> > > > > fJCs&m=ibxhOZQSsKtyVi5iruY8ImkW7bQ8zOrHcuDTLL7GBwA&s=jPB2677RH
> > > > > 3k
> > > > > 3C
> > > > > Ls
> > > > > gl
> > > > > 4Zj3tGawuCLVB1a2WXBUOWEelU&e=
> > > > >
> > > > > yarnEnvironment:
> > > > >
> > > > > YARN_HOME: /opt/hadoop-2.7.2
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Matt
> > > > >
> > > >
> > >
> >
>

Re: Resource manager error

Posted by Darin Johnson <db...@gmail.com>.
Take a look at your Myriad configuration under yarnEnvironment.  You can
set JAVA_HOME there, which should solve the issue.  See below.

yarnEnvironment:
  YARN_HOME: /usr/local/hadoop
  #HADOOP_CONF_DIR=config
  #HADOOP_TMP_DIR=$MESOS_SANDBOX
  #YARN_HOME: hadoop-2.7.0 #this should be relative if nodeManagerUri is set
  #JAVA_HOME: /usr/lib/jvm/java-default #System dependent, but sometimes necessary
  #JAVA_HOME: jre1.7.0_76 # Path to JRE distribution, relative to sandbox directory
  #JAVA_LIBRARY_PATH: /opt/mycompany/lib
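
A minimal sketch of how that block might look for the setup described in
this thread, assuming the JDK lives at /usr/lib/jvm/java-default on every
agent. That path is only a placeholder borrowed from the commented sample
above, not something confirmed for this cluster; YARN_HOME is the value
from the original post.

```yaml
yarnEnvironment:
  YARN_HOME: /opt/hadoop-2.7.2          # value from the original post
  JAVA_HOME: /usr/lib/jvm/java-default  # assumption: replace with the real JDK path on your agents
```

Running `readlink -f "$(which java)"` on an agent shows where the java
binary actually lives; JAVA_HOME is normally the directory that contains
that bin/ directory.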

On Wed, Aug 17, 2016 at 3:13 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:

> I'm running the resource manager as the root user.  Checking a few of my
> nodes, JAVA_HOME is set on all of them for the root env.  Am I OK using
> OpenJDK 1.7, or do I have to use the Oracle JDK?
>
> Matt
>
> -----Original Message-----
> From: John Yost [mailto:hokiegeek2@gmail.com]
> Sent: Wednesday, August 17, 2016 3:01 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Resource manager error
>
> Progress is nice! What user are you running Myriad as? root? yarn? If it
> is the former and you are running via sudo, I've seen this type of error.
> If so, sudo to the root user and then launch. Otherwise, if you are on a
> Linux box, please run `env` and confirm that JAVA_HOME is set for the user
> you are launching Myriad as.
>
> --John
>
> On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:
>
> > Hey John,
> >
> > I set up a role for myriad, restarted mesos-master, and now I'm seeing
> > RMs starting on the Mesos UI, but they fail with the message "lost
> > with exit status: 256".  The executor log says "Error: JAVA_HOME is
> > not set and could not be found."  $JAVA_HOME is set on all my slaves
> > as far as I'm aware.  Running `java -version` confirms openjdk
> > 1.7.0_111.  Looks like it's close to a working state.  Am I missing
> > something?
> >
> > Thanks!
> > Matt
> >
> > -----Original Message-----
> > From: John Yost [mailto:hokiegeek2@gmail.com]
> > Sent: Wednesday, August 17, 2016 2:38 PM
> > To: dev@myriad.incubator.apache.org
> > Subject: Re: Resource manager error
> >
> > Please uncomment frameworkRole and then add the name of whatever Mesos
> > role you have configured that is not *. Note: at the risk of telling
> > you something you already know, you define roles in
> > /etc/mesos-master/roles.
> >
> > In the meantime, I opened up a JIRA ticket and gonna fix this ASAP
> > starting now! :)
> >
> > --John
> >
> > On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:
> >
> > > Hey Darin,
> > >
> > > Commenting out myriadFrameworkRole got rid of the log message about
> > > the missing role, but I'm still seeing the "n must be positive"
> > > exception.
> > >
> > > The only other thing of interest I see in the log is
> > > WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found
> > > on the classpath.  Not sure if that is causing any issue though.
> > >
> > > Matt
> > >
> > > -----Original Message-----
> > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > Sent: Wednesday, August 17, 2016 1:26 PM
> > > To: Dev
> > > Subject: Re: Resource manager error
> > >
> > > Hey Matt,
> > >
> > > Looking through the code, I think setting myriadFrameworkRole to "*"
> > > might be the problem.  Can you try commenting out that line in your
> > > config?  I'll double check this in a little while too.  If that
> > > works I'll submit a patch that checks that.
> > >
> > > Sorry - Myriad is still a pretty young project!  Thanks for checking
> > > it out though!
> > >
> > > Darin
> > >
> > > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
> > > mloppatto@keywcorp.com> wrote:
> > >
> > > > Hey Darin,
> > > >
> > > > Pulling from master got rid of the errors I was seeing, however
> > > > I'm running into a new issue.  After starting the resource
> > > > manager, I see this in the logs:
> > > >
> > > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> > > >
> > > > My Mesos cluster has the default "*" role so I tried setting
> > > > frameworkRole: "*" in myriad-config-default.yml, restarted the
> > > > resource manager and got this error:
> > > >
> > > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > > > java.lang.IllegalArgumentException: n must be positive
> > > >     at java.util.Random.nextInt(Random.java:300)
> > > >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> > > >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> > > >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> > > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> > > >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >     at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > Does Myriad require its own role in Mesos?
> > > >
> > > > Thanks,
> > > > Matt
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > > Sent: Tuesday, August 16, 2016 6:18 PM
> > > > To: Dev
> > > > Subject: Re: Resource manager error
> > > >
> > > > Hey Matthew, my coworker found the same issue recently. I fixed it
> > > > in my last pull request, if you'd like to pull from master.
> > > >
> > > > Alternatively, you could comment out the appendCgroups line in
> > > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > > > <https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler>
> > > > and rebuild.
> > > >
> > > > Sorry that missed my QA; unfortunately I'm always using cgroups and
> > > > didn't test that.  We may do a 0.2.1 release, but I can't say when.
> > > >
> > > > Darin

Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
Odd. Please send a screenshot of the executor startup.

--John

On Wed, Aug 17, 2016 at 3:13 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:

> I'm running the resource manager as the root user.  Checking a few of my
> nodes, JAVA_HOME is set on all of them for the root env.  Am I OK using
> OpenJDK 1.7, or do I have to use the Oracle JDK?
>
> Matt
>
> -----Original Message-----
> From: John Yost [mailto:hokiegeek2@gmail.com]
> Sent: Wednesday, August 17, 2016 3:01 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Resource manager error
>
> Progress is nice! What user are you running Myriad as? root? yarn? If it
> is the former and you are running via sudo, I've seen this type of error.
> If so, sudo to the root user and then launch. Otherwise, if you are on a
> Linux box, please run `env` and confirm that JAVA_HOME is set for the user
> you are launching Myriad as.
>
> --John
>
> On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:
>
> > Hey John,
> >
> > I set up a role for myriad, restarted mesos-master, and now I'm seeing
> > RMs starting on the Mesos UI, but they fail with the message "lost
> > with exit status: 256".  The executor log says "Error: JAVA_HOME is
> > not set and could not be found."  $JAVA_HOME is set on all my slaves
> > as far as I'm aware.  Running `java -version` confirms openjdk
> > 1.7.0_111.  Looks like it's close to a working state.  Am I missing
> > something?
> >
> > Thanks!
> > Matt
> >
> > -----Original Message-----
> > From: John Yost [mailto:hokiegeek2@gmail.com]
> > Sent: Wednesday, August 17, 2016 2:38 PM
> > To: dev@myriad.incubator.apache.org
> > Subject: Re: Resource manager error
> >
> > Please uncomment frameworkRole and then add the name of whatever Mesos
> > role you have configured that is not *. Note: at the risk of telling
> > you something you already know, you define roles in
> > /etc/mesos-master/roles.
> >
> > In the meantime, I opened up a JIRA ticket and gonna fix this ASAP
> > starting now! :)
> >
> > --John
> >
> > On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <
> > mloppatto@keywcorp.com
> > > wrote:
> >
> > > Hey Darin,
> > >
> > > Commenting out myriadFrameworkRole got rid of the log message about
> > > the missing role, but I'm still seeing the "n must be positive"
> > > exception.
> > >
> > > The only other thing of interest I see in the log is
> > > WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found on
> > > the classpath.  Not sure if that is causing any issue though.
> > >
> > > Matt
> > >
> > > -----Original Message-----
> > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > Sent: Wednesday, August 17, 2016 1:26 PM
> > > To: Dev
> > > Subject: Re: Resource manager error
> > >
> > > Hey Matt,
> > >
> > > Looking through the code, I think setting myriadFrameworkRole to "*"
> > > might be the problem.  Can you try commenting out that line in your
> > > config?  I'll double check this in a little while too.  If that
> > > works I'll submit a patch that checks that.
> > >
> > > Sorry - Myriad is still a pretty young project!  Thanks for checking
> > > it out though!
> > >
> > > Darin
> > >
> > > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
> > > mloppatto@keywcorp.com> wrote:
> > >
> > > > Hey Darin,
> > > >
> > > > Pulling from master got rid of the errors I was seeing, however
> > > > I'm running into a new issue.  After starting the resource
> > > > manager, I see this in the logs:
> > > >
> > > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> > > >
> > > > My Mesos cluster has the default "*" role so I tried setting
> > > > frameworkRole: "*" in myriad-config-default.yml, restarted the
> > > > resource manager and got this error:
> > > >
> > > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > > > java.lang.IllegalArgumentException: n must be positive
> > > >     at java.util.Random.nextInt(Random.java:300)
> > > >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> > > >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> > > >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> > > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> > > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> > > >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >     at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > Does Myriad require its own role in Mesos?
> > > >
> > > > Thanks,
> > > > Matt
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > > Sent: Tuesday, August 16, 2016 6:18 PM
> > > > To: Dev
> > > > Subject: Re: Resource manager error
> > > >
> > > > Hey Matthew, my coworker found the same issue recently. I fixed it in
> > > > my last pull request, if you'd like to pull from master.
> > > >
> > > > Alternatively, you could comment out the appendCgroups line in
> > > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > > > and rebuild.
> > > >
> > > > Sorry that missed my QA; unfortunately, I'm always using cgroups and
> > > > didn't test that.  We may do a 0.2.1 release but I can't say when.
> > > >
> > > > Darin
> > > >
> > > > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto"
> > > > <ml...@keywcorp.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > >
> > > > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > > > >
> > > > >
> > > > >
> > > > > And I get the following error in the resource manager executor
> > > > > log in mesos after starting it with
> > > > > `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > > > >
> > > > >
> > > > >
> > > > > chown: cannot access
> > > > > ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> > > > > No such file or directory
> > > > >
> > > > > env: /bin/yarn: No such file or directory
> > > > >
> > > > > ory
> > > > >
> > > > >
> > > > >
> > > > > It appears the ‘mesos’ directory doesn’t exist under
> > > > > /sys/fs/cgroup/cpu.  Any ideas what the issue could be?
> > > > >
> > > > >
> > > > >
> > > > > This is my yarn-site.xml:
> > > > >
> > > > >
> > > > >
> > > > > <configuration>
> > > > >
> > > > > <!-- Site-specific YARN configuration properties -->
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.nodemanager.aux-services</name>
> > > > >
> > > > >        <value>mapreduce_shuffle,myriad_executor</value>
> > > > >
> > > > >        <!-- If using MapR distro, please use the following value:
> > > > >
> > > > >
> > > > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value>
> > > > > -->
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >
> > > > > <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > > > >
> > > > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >
> > > > > <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > > > >
> > > > >
> > > > > <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > > > >
> > > > >        <value>2000</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > > > >
> > > > >        <value>10000</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >
> > > > > <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > > > >
> > > > >        <value>1000</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > > <!-- Needed for Fine Grain Scaling -->
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > > > >
> > > > >        <value>0</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > >    <property>
> > > > >
> > > > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > > > >
> > > > >        <value>0</value>
> > > > >
> > > > >    </property>
> > > > >
> > > > > <!-- Site specific YARN configuration properties -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > > > >
> > > > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > > > >
> > > > >    <value>${nodemanager.resource.memory-mb}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <!--These options enable dynamic port assignment by mesos -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.webapp.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.webapp.https.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.localizer.address</name>
> > > > >
> > > > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <!-- Configure Myriad Scheduler here -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.resourcemanager.scheduler.class</name>
> > > > >
> > > > >
> > > > > <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > > > >
> > > > >    <description>One can configure other schedulers as well from the
> > > > > following list:
> > > > > org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > > > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > > > >
> > > > > </property>
> > > > >
> > > > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > > > >
> > > > >    <value>false</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > <property>
> > > > >
> > > > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > > > >
> > > > >    <value>false</value>
> > > > >
> > > > > </property>
> > > > >
> > > > > </configuration>
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > My myriad-config-default.yml:
> > > > >
> > > > >
> > > > >
> > > > > mesosMaster: zk://myip:2181/mesos
> > > > >
> > > > > checkpoint: false
> > > > >
> > > > > frameworkFailoverTimeout: 43200000
> > > > >
> > > > > frameworkName: MyriadAlpha
> > > > >
> > > > > frameworkRole:
> > > > >
> > > > > frameworkUser: root     # User the Node Manager runs as, required if
> > > > >                          # nodeManagerURI set, otherwise defaults to
> > > > >                          # the user running the resource manager.
> > > > >
> > > > > frameworkSuperUser: root  # To be deprecated, currently permissions
> > > > >                          # need to be set by a superuser due to
> > > > >                          # Mesos-1790.  Must be root or have
> > > > >                          # passwordless sudo.  Required if
> > > > >                          # nodeManagerURI set, ignored otherwise.
> > > > >
> > > > > nativeLibrary: /usr/local/lib/libmesos.so
> > > > >
> > > > > zkServers: myip:2181
> > > > >
> > > > > zkTimeout: 20000
> > > > >
> > > > > restApiPort: 8192
> > > > >
> > > > > servedConfigPath: dist/config.tgz
> > > > >
> > > > > servedBinaryPath: dist/binary.tgz
> > > > >
> > > > > profiles:
> > > > >
> > > > > zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
> > > > >
> > > > >    cpu: 0
> > > > >
> > > > >    mem: 0
> > > > >
> > > > > small:
> > > > >
> > > > >    cpu: 2
> > > > >
> > > > >    mem: 2048
> > > > >
> > > > > medium:
> > > > >
> > > > >    cpu: 4
> > > > >
> > > > >    mem: 4096
> > > > >
> > > > > large:
> > > > >
> > > > >    cpu: 10
> > > > >
> > > > >    mem: 12288
> > > > >
> > > > > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
> > > > >
> > > > > medium: 1 # <profile_name : instances>
> > > > >
> > > > > rebalancer: false
> > > > >
> > > > > haEnabled: false
> > > > >
> > > > > nodemanager:
> > > > >
> > > > > jvmMaxMemoryMB: 1024
> > > > >
> > > > > cpus: 0.2
> > > > >
> > > > > cgroups: false
> > > > >
> > > > > executor:
> > > > >
> > > > > jvmMaxMemoryMB: 256
> > > > >
> > > > > path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > > > >
> > > > > #The following should be used for a remotely distributed URI, hdfs
> > > > > #assumed but other URI types valid.
> > > > >
> > > > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > > > >
> > > > > #configUri: http://127.0.0.1/api/artifacts/config.tgz
> > > > >
> > > > > #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > > > >
> > > > > yarnEnvironment:
> > > > >
> > > > > YARN_HOME: /opt/hadoop-2.7.2
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Matt
> > > > >
> > > >
> > >
> >
>
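
On the "n must be positive" exception quoted above: the stack trace shows java.util.Random.nextInt being called from RangeResource.getRandomValues while ports are consumed, and nextInt(n) throws IllegalArgumentException whenever n is not positive. One plausible reading, which this thread does not confirm, is that the offer carried no usable port values for the configured role, so there was nothing to pick from. The Python sketch below is only an analogy of that failure mode; pick_random_ports and its arguments are hypothetical names, not Myriad code.

```python
import random


def pick_random_ports(port_ranges, count, rng=None):
    """Pick `count` distinct ports at random from inclusive (begin, end) ranges.

    Hypothetical helper for illustration; it is not Myriad's RangeResource.
    """
    if rng is None:
        rng = random.Random()
    # Flatten the inclusive ranges into a pool of candidate ports.
    pool = [p for begin, end in port_ranges for p in range(begin, end + 1)]
    if len(pool) < count:
        # Guard up front instead of letting the random call fail on an empty pool.
        raise ValueError(
            "need %d ports but the offer only has %d" % (count, len(pool))
        )
    chosen = []
    for _ in range(count):
        # randrange(n) requires n > 0, much like Java's Random.nextInt(n).
        chosen.append(pool.pop(rng.randrange(len(pool))))
    return chosen
```

With a guard like this, an offer without ports (for example pick_random_ports([], 4)) fails with a readable message instead of a low-level error from the random call.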

RE: Resource manager error

Posted by "Matthew J. Loppatto" <ml...@keywcorp.com>.
I'm running the resource manager as the root user.  Checking a few of my nodes, JAVA_HOME is set on all of them for the root env.  Am I OK to be using OpenJDK 1.7, or do I have to use the Oracle JDK?

Matt

-----Original Message-----
From: John Yost [mailto:hokiegeek2@gmail.com] 
Sent: Wednesday, August 17, 2016 3:01 PM
To: dev@myriad.incubator.apache.org
Subject: Re: Resource manager error

Progress is nice! What user are you running Myriad as? root? yarn? If it is the former and you are running via sudo, I've seen this type of error. If so, sudo to the root user and then launch. Otherwise, please run `env` if you are on a Linux box and confirm you see JAVA_HOME for the user you are launching Myriad as.

--John
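
One way to compare what different launch paths actually see is a small script run the same way the resource manager is launched, for example once from a root login shell and once through sudo. This is a minimal sketch, assuming Python is available on the node; java_env_report is a hypothetical helper name and is not part of Myriad or Hadoop.

```python
import os
import shutil


def java_env_report(environ=None):
    """Summarise how Java is visible to the current process.

    Hypothetical diagnostic helper; it only reads the environment and PATH.
    """
    environ = os.environ if environ is None else environ
    java_home = environ.get("JAVA_HOME")
    # A java binary on PATH does not imply that JAVA_HOME is exported.
    java_on_path = shutil.which("java", path=environ.get("PATH"))
    return {
        "JAVA_HOME": java_home,
        "JAVA_HOME/bin/java exists": bool(
            java_home and os.path.isfile(os.path.join(java_home, "bin", "java"))
        ),
        "java on PATH": java_on_path,
    }


if __name__ == "__main__":
    for key, value in java_env_report().items():
        print("%s: %s" % (key, value))
```

If the two runs differ, that would point at the environment being reset on the way in (sudo is often configured to do this), which would be consistent with `java -version` working while JAVA_HOME is reported as unset.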

On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <mloppatto@keywcorp.com
> wrote:

> Hey John,
>
> I set up a role for myriad, restarted mesos-master, and now I'm seeing
> RMs starting on the Mesos UI, but they fail with the message "lost
> with exit status: 256".  The executor log says "Error: JAVA_HOME is not
> set and could not be found."  $JAVA_HOME is set on all my slaves as far
> as I'm aware.  Running `java -version` confirms openjdk 1.7.0_111.
> Looks like it's close to a working state.  Am I missing something?
>
> Thanks!
> Matt
>
> -----Original Message-----
> From: John Yost [mailto:hokiegeek2@gmail.com]
> Sent: Wednesday, August 17, 2016 2:38 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Resource manager error
>
> Please uncomment frameworkRole and then add the name of whatever Mesos 
> role you have configured that is not *. Note: at the risk of telling 
> you something you already know, you define roles in /etc/mesos-master/roles.
>
> In the meantime, I opened up a JIRA ticket and am going to fix this ASAP,
> starting now! :)
>
> --John
>
> On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto < 
> mloppatto@keywcorp.com
> > wrote:
>
> > Hey Darin,
> >
> > Commenting out myriadFrameworkRole got rid of the log message about
> > the missing role, but I'm still seeing the "n must be positive"
> > exception.
> >
> > The only other thing of interest I see in the log is
> > WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found on
> > the classpath.  Not sure if that is causing any issue though.
> >
> > Matt
> >
> > -----Original Message-----
> > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > Sent: Wednesday, August 17, 2016 1:26 PM
> > To: Dev
> > Subject: Re: Resource manager error
> >
> > Hey Matt,
> >
> > Looking through the code, I think setting myriadFrameworkRole to "*"
> > might be the problem.  Can you try commenting out that line in your 
> > config?  I'll double check this in a little while too.  If that 
> > works I'll submit a patch that checks that.
> >
> > Sorry - Myriad is still a pretty young project!  Thanks for checking 
> > it out though!
> >
> > Darin
> >
> > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto < 
> > mloppatto@keywcorp.com> wrote:
> >
> > > Hey Darin,
> > >
> > > Pulling from master got rid of the errors I was seeing, however 
> > > I'm running into a new issue.  After starting the resource 
> > > manager, I see this in the logs:
> > >
> > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> > >
> > > My Mesos cluster has the default "*" role so I tried setting
> > > frameworkRole: "*" in myriad-config-default.yml, restarted the 
> > > resource manager and got this error:
> > >
> > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > > java.lang.IllegalArgumentException: n must be positive
> > >     at java.util.Random.nextInt(Random.java:300)
> > >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> > >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> > >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> > >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >     at java.lang.Thread.run(Thread.java:745)
> > >
> > > Does Myriad require its own role in Mesos?
> > >
> > > Thanks,
> > > Matt
> > >
> > >
> > > -----Original Message-----
> > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > Sent: Tuesday, August 16, 2016 6:18 PM
> > > To: Dev
> > > Subject: Re: Resource manager error
> > >
> > > Hey Matthew, my coworker found the same issue recently. I fixed it in
> > > my last pull request, if you'd like to pull from master.
> > >
> > > Alternatively, you could comment out the appendCgroups line in
> > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > > and rebuild.
> > >
> > > Sorry that missed my QA; unfortunately, I'm always using cgroups and
> > > didn't test that.  We may do a 0.2.1 release but I can't say when.
> > >
> > > Darin
> > >
> > > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto"
> > > <ml...@keywcorp.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > >
> > > >
> > > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > > >
> > > >
> > > >
> > > > And I get the following error in the resource manager executor
> > > > log in mesos after starting it with
> > > > `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > > >
> > > >
> > > >
> > > > chown: cannot access
> > > > ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> > > > No such file or directory
> > > >
> > > > env: /bin/yarn: No such file or directory
> > > >
> > > > ory
> > > >
> > > >
> > > >
> > > > It appears the ‘mesos’ directory doesn’t exist under
> > > > /sys/fs/cgroup/cpu.  Any ideas what the issue could be?
> > > >
> > > >
> > > >
> > > > This is my yarn-site.xml:
> > > >
> > > >
> > > >
> > > > <configuration>
> > > >
> > > > <!-- Site-specific YARN configuration properties -->
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.nodemanager.aux-services</name>
> > > >
> > > >        <value>mapreduce_shuffle,myriad_executor</value>
> > > >
> > > >        <!-- If using MapR distro, please use the following value:
> > > >
> > > >
> > > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value>
> > > > -->
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >
> > > > <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > > >
> > > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >
> > > > <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > > >
> > > >
> > > > <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > > >
> > > >        <value>2000</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > > >
> > > >        <value>10000</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >
> > > > <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > > >
> > > >        <value>1000</value>
> > > >
> > > >    </property>
> > > >
> > > > <!-- Needed for Fine Grain Scaling -->
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > > >
> > > >        <value>0</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > > >
> > > >        <value>0</value>
> > > >
> > > >    </property>
> > > >
> > > > <!-- Site specific YARN configuration properties -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > > >
> > > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > > >
> > > >    <value>${nodemanager.resource.memory-mb}</value>
> > > >
> > > > </property>
> > > >
> > > > <!--These options enable dynamic port assignment by mesos -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.webapp.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.webapp.https.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.localizer.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <!-- Configure Myriad Scheduler here -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.resourcemanager.scheduler.class</name>
> > > >
> > > >
> > > > <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > > >
> > > >    <description>One can configure other schedulers as well from the following
> > > > list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > > >
> > > > </property>
> > > >
> > > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > > >
> > > >    <value>false</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > > >
> > > >    <value>false</value>
> > > >
> > > > </property>
> > > >
> > > > </configuration>
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > My myriad-config-default.yml:
> > > >
> > > >
> > > >
> > > > mesosMaster: zk://myip:2181/mesos
> > > >
> > > > checkpoint: false
> > > >
> > > > frameworkFailoverTimeout: 43200000
> > > >
> > > > frameworkName: MyriadAlpha
> > > >
> > > > frameworkRole:
> > > >
> > > > frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set, otherwise defaults to the user
> > > >
> > > >                          # running the resource manager.
> > > >
> > > > frameworkSuperUser: root  # To be deprecated; currently permissions need to be set by a superuser due to Mesos-1790.  Must be
> > > >
> > > >                          # root or have passwordless sudo. Required if nodeManagerURI set, ignored otherwise.
> > > >
> > > > nativeLibrary: /usr/local/lib/libmesos.so
> > > >
> > > > zkServers: myip:2181
> > > >
> > > > zkTimeout: 20000
> > > >
> > > > restApiPort: 8192
> > > >
> > > > servedConfigPath: dist/config.tgz
> > > >
> > > > servedBinaryPath: dist/binary.tgz
> > > >
> > > > profiles:
> > > >
> > > > zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
> > > >
> > > >    cpu: 0
> > > >
> > > >    mem: 0
> > > >
> > > > small:
> > > >
> > > >    cpu: 2
> > > >
> > > >    mem: 2048
> > > >
> > > > medium:
> > > >
> > > >    cpu: 4
> > > >
> > > >    mem: 4096
> > > >
> > > > large:
> > > >
> > > >    cpu: 10
> > > >
> > > >    mem: 12288
> > > >
> > > > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
> > > >
> > > > medium: 1 # <profile_name : instances>
> > > >
> > > > rebalancer: false
> > > >
> > > > haEnabled: false
> > > >
> > > > nodemanager:
> > > >
> > > > jvmMaxMemoryMB: 1024
> > > >
> > > > cpus: 0.2
> > > >
> > > > cgroups: false
> > > >
> > > > executor:
> > > >
> > > > jvmMaxMemoryMB: 256
> > > >
> > > > path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > > >
> > > > #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
> > > >
> > > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > > >
> > > > #configUri: http://127.0.0.1/api/artifacts/config.tgz
> > > >
> > > > #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > > >
> > > > yarnEnvironment:
> > > >
> > > > YARN_HOME: /opt/hadoop-2.7.2
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > Matt
> > > >
> > >
> >
>

Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
Progress is nice! What user are you running Myriad as: root or yarn? If it
is root and you are launching via sudo, I've seen this type of error. In
that case, sudo to the root user first and then launch from that shell.
Otherwise, run `env` on the Linux box and confirm that JAVA_HOME is set for
the user you are launching Myriad as.

--John
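
The check John describes can be sketched as follows, assuming a POSIX shell. The helper name report_java_home is invented for this illustration and is not part of Myriad or Hadoop; it only reports what the current environment contains. Keep in mind that sudo, in its common default configuration, resets the environment, so a JAVA_HOME exported in your own shell may not be visible to a command started with a plain `sudo`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Myriad or Hadoop): report whether
# JAVA_HOME is visible in the environment of the current shell.
report_java_home() {
    if [ -n "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME=${JAVA_HOME}"
    else
        echo "JAVA_HOME is not set"
    fi
}

report_java_home

# To compare with what root sees, similar checks can be run by hand:
#   sudo env | grep JAVA_HOME      # environment of a plain sudo command
#   sudo -i env | grep JAVA_HOME   # environment of a root login shell
```

If the plain sudo check prints nothing while the login-shell check prints a path, launching from a root shell (as suggested above) is the likely fix; whether that matches this particular cluster is an assumption to verify.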

On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:

> Hey John,
>
> I set up a role for myriad, restarted mesos-master, and now I'm seeing RMs
> starting on the Mesos UI, but they fail with the message "lost with exit
> status: 256".  The executor log says "Error: JAVA_HOME is not set and could
> not be found."  $JAVA_HOME is set on all my slaves as far as I'm aware.
> Running `java -version` confirms openjdk 1.7.0_111.  Looks like it's close
> to a working state.  Am I missing something?
>
> Thanks!
> Matt
>
> -----Original Message-----
> From: John Yost [mailto:hokiegeek2@gmail.com]
> Sent: Wednesday, August 17, 2016 2:38 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Resource manager error
>
> Please uncomment frameworkRole and then add the name of whatever Mesos
> role you have configured that is not *. Note: at the risk of telling you
> something you already know, you define roles in /etc/mesos-master/roles.
>
> In the meantime, I opened up a JIRA ticket and gonna fix this ASAP
> starting now! :)
>
> --John
>
> On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:
>
> > Hey Darin,
> >
> > Commenting out myriadFrameworkRole got rid of the log message about
> > the missing role, but I'm still seeing the "n must be positive" exception.
> >
> > The only other thing of interest I see in the log is
> > WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found on
> > the classpath.  Not sure if that is causing any issue though.
> >
> > Matt
> >
> > -----Original Message-----
> > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > Sent: Wednesday, August 17, 2016 1:26 PM
> > To: Dev
> > Subject: Re: Resource manager error
> >
> > Hey Matt,
> >
> > Looking through the code, I think setting myriadFrameworkRole to "*"
> > might be the problem.  Can you try commenting out that line in your
> > config?  I'll double check this in a little while too.  If that works
> > I'll submit a patch that checks that.
> >
> > Sorry - Myriad is still a pretty young project!  Thanks for checking
> > it out though!
> >
> > Darin
> >
> > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:
> >
> > > Hey Darin,
> > >
> > > Pulling from master got rid of the errors I was seeing, however I'm
> > > running into a new issue.  After starting the resource manager, I
> > > see this in the logs:
> > >
> > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> > >
> > > My Mesos cluster has the default "*" role so I tried setting
> > > frameworkRole: "*" in myriad-config-default.yml, restarted the
> > > resource manager and got this error:
> > >
> > > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > > java.lang.IllegalArgumentException: n must be positive
> > >     at java.util.Random.nextInt(Random.java:300)
> > >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> > >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> > >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> > >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> > >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> > >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >     at java.lang.Thread.run(Thread.java:745)
> > >
> > > Does Myriad require its own role in Mesos?
> > >
> > > Thanks,
> > > Matt
> > >
> > >
> > > -----Original Message-----
> > > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > > Sent: Tuesday, August 16, 2016 6:18 PM
> > > To: Dev
> > > Subject: Re: Resource manager error
> > >
> > > Hey Mathew, my coworker found the same issue recently, I fixed it on
> > > my last pull request, if you'd like to pull from master.
> > >
> > > Alternatively, you could comment out the appendCgroups line in
> > > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > > (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler)
> > > and rebuild.
> > >
> > > Sorry that missed my QA; unfortunately I'm always using cgroups and
> > > didn't test that.  We may do a 0.2.1 release but I can't say when.
> > >
> > > Darin
> > >
> > > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" <ml...@keywcorp.com> wrote:
> > >
> > > > Hi,
> > > >
> > > >
> > > >
> > > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > > >
> > > >
> > > >
> > > > And I get the following error in the resource manager executor log
> > > > in mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > > >
> > > >
> > > >
> > > > chown: cannot access
> > > > ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> > > > No such file or directory
> > > >
> > > > env: /bin/yarn: No such file or directory
> > > >
> > > > ory
> > > >
> > > >
> > > >
> > > > It appears the ‘mesos’ directory doesn’t exist under /sys/fs/cgroup/cpu.
> > > > Any ideas what the issue could be?
> > > >
> > > >
> > > >
> > > > This is my yarn-site.xml:
> > > >
> > > >
> > > >
> > > > <configuration>
> > > >
> > > > <!-- Site-specific YARN configuration properties -->
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.nodemanager.aux-services</name>
> > > >
> > > >        <value>mapreduce_shuffle,myriad_executor</value>
> > > >
> > > >        <!-- If using MapR distro, please use the following value:
> > > >
> > > >
> > > > <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value>
> > > > -->
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >
> > > > <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > > >
> > > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >
> > > > <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > > >
> > > >
> > > > <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > > >
> > > >        <value>2000</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > > >
> > > >        <value>10000</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >
> > > > <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > > >
> > > >        <value>1000</value>
> > > >
> > > >    </property>
> > > >
> > > > <!-- Needed for Fine Grain Scaling -->
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > > >
> > > >        <value>0</value>
> > > >
> > > >    </property>
> > > >
> > > >    <property>
> > > >
> > > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > > >
> > > >        <value>0</value>
> > > >
> > > >    </property>
> > > >
> > > > <!-- Site specific YARN configuration properties -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > > >
> > > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > > >
> > > >    <value>${nodemanager.resource.memory-mb}</value>
> > > >
> > > > </property>
> > > >
> > > > <!--These options enable dynamic port assignment by mesos -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.webapp.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.webapp.https.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.localizer.address</name>
> > > >
> > > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > > >
> > > > </property>
> > > >
> > > > <!-- Configure Myriad Scheduler here -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.resourcemanager.scheduler.class</name>
> > > >
> > > >
> > > > <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > > >
> > > >    <description>One can configure other schedulers as well from the following
> > > > list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > > > org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > > >
> > > > </property>
> > > >
> > > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > > >
> > > >    <value>false</value>
> > > >
> > > > </property>
> > > >
> > > > <property>
> > > >
> > > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > > >
> > > >    <value>false</value>
> > > >
> > > > </property>
> > > >
> > > > </configuration>
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > My myriad-config-default.yml:
> > > >
> > > >
> > > >
> > > > mesosMaster: zk://myip:2181/mesos
> > > >
> > > > checkpoint: false
> > > >
> > > > frameworkFailoverTimeout: 43200000
> > > >
> > > > frameworkName: MyriadAlpha
> > > >
> > > > frameworkRole:
> > > >
> > > > frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set, otherwise defaults to the user
> > > >
> > > >                          # running the resource manager.
> > > >
> > > > frameworkSuperUser: root  # To be deprecated; currently permissions need to be set by a superuser due to Mesos-1790.  Must be
> > > >
> > > >                          # root or have passwordless sudo. Required if nodeManagerURI set, ignored otherwise.
> > > >
> > > > nativeLibrary: /usr/local/lib/libmesos.so
> > > >
> > > > zkServers: myip:2181
> > > >
> > > > zkTimeout: 20000
> > > >
> > > > restApiPort: 8192
> > > >
> > > > servedConfigPath: dist/config.tgz
> > > >
> > > > servedBinaryPath: dist/binary.tgz
> > > >
> > > > profiles:
> > > >
> > > > zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
> > > >
> > > >    cpu: 0
> > > >
> > > >    mem: 0
> > > >
> > > > small:
> > > >
> > > >    cpu: 2
> > > >
> > > >    mem: 2048
> > > >
> > > > medium:
> > > >
> > > >    cpu: 4
> > > >
> > > >    mem: 4096
> > > >
> > > > large:
> > > >
> > > >    cpu: 10
> > > >
> > > >    mem: 12288
> > > >
> > > > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
> > > >
> > > > medium: 1 # <profile_name : instances>
> > > >
> > > > rebalancer: false
> > > >
> > > > haEnabled: false
> > > >
> > > > nodemanager:
> > > >
> > > > jvmMaxMemoryMB: 1024
> > > >
> > > > cpus: 0.2
> > > >
> > > > cgroups: false
> > > >
> > > > executor:
> > > >
> > > > jvmMaxMemoryMB: 256
> > > >
> > > > path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > > >
> > > > #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
> > > >
> > > > #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > > >
> > > > #configUri: http://127.0.0.1/api/artifacts/config.tgz
> > > >
> > > > #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > > >
> > > > yarnEnvironment:
> > > >
> > > > YARN_HOME: /opt/hadoop-2.7.2
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > Matt
> > > >
> > >
> >
>

RE: Resource manager error

Posted by "Matthew J. Loppatto" <ml...@keywcorp.com>.
Hey John,

I set up a role for myriad, restarted mesos-master, and now I'm seeing RMs starting on the Mesos UI, but they fail with the message "lost with exit status: 256".  The executor log says "Error: JAVA_HOME is not set and could not be found."  $JAVA_HOME is set on all my slaves as far as I'm aware.  Running `java -version` confirms openjdk 1.7.0_111.  Looks like it's close to a working state.  Am I missing something?

Thanks!
Matt

-----Original Message-----
From: John Yost [mailto:hokiegeek2@gmail.com] 
Sent: Wednesday, August 17, 2016 2:38 PM
To: dev@myriad.incubator.apache.org
Subject: Re: Resource manager error

Please uncomment frameworkRole and then add the name of whatever Mesos role you have configured that is not *. Note: at the risk of telling you something you already know, you define roles in /etc/mesos-master/roles.

In the meantime, I opened up a JIRA ticket and gonna fix this ASAP starting now! :)

--John
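
The role setup John mentions can be sketched as a small shell helper. This is an illustration under assumptions: the role name myriad is a placeholder, the helper name write_roles is invented here, and the restart command depends on how Mesos was installed. The only details taken from the thread are the /etc/mesos-master/roles path and the frameworkRole setting; the comma-separated format is assumed to mirror the master's --roles flag named in the error log.

```shell
#!/bin/sh
# Hypothetical helper: write a comma-separated list of role names to a file.
# usage: write_roles FILE ROLE [ROLE...]
write_roles() {
    roles_file=$1
    shift
    # "$*" joins the remaining arguments using the first character of IFS.
    (IFS=,; printf '%s\n' "$*") > "$roles_file"
}

# Demonstrate against a scratch file so nothing under /etc is touched.
scratch=$(mktemp)
write_roles "$scratch" myriad
cat "$scratch"
rm -f "$scratch"

# On a master node the real steps would look roughly like (run as root):
#   write_roles /etc/mesos-master/roles myriad
#   service mesos-master restart    # restart command varies by install
# and in myriad-config-default.yml:
#   frameworkRole: myriad
```

Writing to a scratch file first makes it easy to inspect the result before replacing the file the master actually reads.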

On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:

> Hey Darin,
>
> Commenting out myriadFrameworkRole got rid of the log message about 
> the missing role, but I'm still seeing the "n must be positive" exception.
>
> The only other thing of interest I see in the log is WARN fair.AllocationFileLoaderService:
> fair-scheduler.xml not found on the classpath.  Not sure if that is 
> causing any issue though.
>
> Matt
>
> -----Original Message-----
> From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> Sent: Wednesday, August 17, 2016 1:26 PM
> To: Dev
> Subject: Re: Resource manager error
>
> Hey Matt,
>
> Looking through the code, I think setting myriadFrameworkRole to "*" 
> might be the problem.  Can you try commenting out that line in your 
> config?  I'll double check this in a little while too.  If that works 
> I'll submit a patch that checks that.
>
> Sorry - Myriad is still a pretty young project!  Thanks for checking 
> it out though!
>
> Darin
>
> On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <mloppatto@keywcorp.com> wrote:
>
> > Hey Darin,
> >
> > Pulling from master got rid of the errors I was seeing, however I'm 
> > running into a new issue.  After starting the resource manager, I 
> > see this in the logs:
> >
> > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> >
> > My Mesos cluster has the default "*" role so I tried setting
> > frameworkRole: "*" in myriad-config-default.yml, restarted the 
> > resource manager and got this error:
> >
> > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > java.lang.IllegalArgumentException: n must be positive
> >     at java.util.Random.nextInt(Random.java:300)
> >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > Does Myriad require its own role in Mesos?
> >
> > Thanks,
> > Matt
> >
> >
> > -----Original Message-----
> > From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> > Sent: Tuesday, August 16, 2016 6:18 PM
> > To: Dev
> > Subject: Re: Resource manager error
> >
> > Hey Mathew, my coworker found the same issue recently, I fixed it on 
> > my last pull request, if you'd like to pull from master.
> >
> > Alternatively, you could comment out the appendCgroups line in
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler)
> > and rebuild.
> >
> > Sorry, that missed my QA. Unfortunately I'm always using cgroups and
> > didn't test that.  We may do a 0.2.1 release but I can't say when.
> >
> > Darin
> >

Re: Resource manager error

Posted by John Yost <ho...@gmail.com>.
Please uncomment frameworkRole and set it to the name of a Mesos role you
have configured other than *. Note: at the risk of telling you
something you already know, you define roles in /etc/mesos-master/roles.

In the meantime, I've opened a JIRA ticket and am going to fix this ASAP,
starting now! :)

--John
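
For readers following the thread, here is a minimal Java sketch of the kind of role check discussed above. It is not Myriad's code: the class `RoleCheck`, the method `effectiveRole`, and the example role name "myriad" are all hypothetical. The assumption is simply that a missing, blank, or "*" frameworkRole value should be treated as "no specific role configured".

```java
import java.util.Optional;

// Hypothetical helper, not part of Myriad: normalizes a frameworkRole setting.
final class RoleCheck {
    private RoleCheck() {}

    // Returns the configured role, or empty when the value is
    // missing, blank, or the Mesos default role "*".
    static Optional<String> effectiveRole(String configured) {
        if (configured == null) {
            return Optional.empty();
        }
        String trimmed = configured.trim();
        if (trimmed.isEmpty() || trimmed.equals("*")) {
            return Optional.empty();
        }
        return Optional.of(trimmed);
    }

    public static void main(String[] args) {
        // "myriad" is an example role name only.
        System.out.println(effectiveRole(null).isPresent());
        System.out.println(effectiveRole("*").isPresent());
        System.out.println(effectiveRole("myriad").isPresent());
    }
}
```

Returning an Optional keeps the "no role" case explicit, so callers cannot pass an empty string or "*" on as if it were a named role.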

On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <mloppatto@keywcorp.com
> wrote:

> Hey Darin,
>
> Commenting out myriadFrameworkRole got rid of the log message about the
> missing role, but I'm still seeing the "n must be positive" exception.
>
> The only other thing of interest I see in the log is WARN fair.AllocationFileLoaderService:
> fair-scheduler.xml not found on the classpath.  Not sure if that is causing
> any issue though.
>
> Matt
>

RE: Resource manager error

Posted by "Matthew J. Loppatto" <ml...@keywcorp.com>.
Hey Darin,

Commenting out myriadFrameworkRole got rid of the log message about the missing role, but I'm still seeing the "n must be positive" exception.

The only other thing of interest I see in the log is WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found on the classpath.  Not sure if that is causing any issue though.

Matt
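
A note on the exception itself: "n must be positive" is the message `java.util.Random.nextInt(int)` throws when its argument is zero or negative (newer JDKs word it "bound must be positive"). The stack trace quoted below suggests a random pick was attempted over an empty set of port values, but that is an inference from the trace, not something confirmed in this thread. A minimal sketch of the failure and a defensive guard follows; `PortPicker` and the port value 31000 are hypothetical and this is not Myriad's RangeResource code.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical illustration, not Myriad's RangeResource implementation.
final class PortPicker {
    private final Random random;

    PortPicker(Random random) {
        this.random = random;
    }

    // Picks one port at random, or returns empty when no ports were offered.
    // Without the emptiness check, random.nextInt(0) would throw
    // IllegalArgumentException.
    Optional<Long> pick(List<Long> offeredPorts) {
        if (offeredPorts.isEmpty()) {
            return Optional.empty();
        }
        int index = random.nextInt(offeredPorts.size());
        return Optional.of(offeredPorts.get(index));
    }

    public static void main(String[] args) {
        PortPicker picker = new PortPicker(new Random());
        System.out.println(picker.pick(Collections.<Long>emptyList()).isPresent());
        System.out.println(picker.pick(Collections.singletonList(31000L)).isPresent());
        try {
            new Random().nextInt(0); // reproduces the failure mode
        } catch (IllegalArgumentException e) {
            System.out.println("nextInt(0) threw: " + e.getMessage());
        }
    }
}
```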

-----Original Message-----
From: Darin Johnson [mailto:dbjohnson1978@gmail.com] 
Sent: Wednesday, August 17, 2016 1:26 PM
To: Dev
Subject: Re: Resource manager error

Hey Matt,

Looking through the code, I think setting myriadFrameworkRole to "*" might be the problem.  Can you try commenting out that line in your config?  I'll double check this in a little while too.  If that works I'll submit a patch that checks that.

Sorry - Myriad is still a pretty young project!  Thanks for checking it out though!

Darin

On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto < mloppatto@keywcorp.com> wrote:

> Hey Darin,
>
> Pulling from master got rid of the errors I was seeing, however I'm 
> running into a new issue.  After starting the resource manager, I see 
> this in the logs:
>
> 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) 
> with profile medium
> 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations:
> Adding 1 NM instances to cluster
> 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler:
> Role '' is not present in the master's --roles
>
> My Mesos cluster has the default "*" role so I tried setting
> frameworkRole: "*" in myriad-config-default.yml, restarted the 
> resource manager and got this error:
>
> 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler:
> Exception thrown while trying to create a task for nm
> java.lang.IllegalArgumentException: n must be positive
>     at java.util.Random.nextInt(Random.java:300)
>     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
>     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
>     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
>     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
>     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
>     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
>     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> Does Myriad require its own role in Mesos?
>
> Thanks,
> Matt
>
>
> -----Original Message-----
> From: Darin Johnson [mailto:dbjohnson1978@gmail.com]
> Sent: Tuesday, August 16, 2016 6:18 PM
> To: Dev
> Subject: Re: Resource manager error
>
> Hey Matthew, my coworker found the same issue recently. I fixed it in
> my last pull request, if you'd like to pull from master.
>
> Alternatively, you could comment out the appendCgroups line in
> myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler)
> and rebuild.
>
> Sorry that missed my QA; unfortunately I'm always using cgroups and
> didn't test that.  We may do a 0.2.1 release but I can't say when.
>
> Darin
>

Re: Resource manager error

Posted by Darin Johnson <db...@gmail.com>.
Hey Matt,

Looking through the code, I think setting myriadFrameworkRole to "*" might
be the problem.  Can you try commenting out that line in your config?  I'll
double check this in a little while too.  If that works I'll submit a patch
that checks that.
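
For reference, commenting out the role line in myriad-config-default.yml
would look roughly like the excerpt below. This is only a sketch of the
suggestion above, using the keys from the config posted earlier in the
thread; whether it avoids the error is exactly what is being tested here.

```yaml
# myriad-config-default.yml (excerpt; surrounding keys as posted earlier)
frameworkName: MyriadAlpha
# frameworkRole: "*"    # commented out, per the suggestion above
frameworkUser: root
```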

Sorry - Myriad is still a pretty young project!  Thanks for checking it out
though!

Darin

On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
mloppatto@keywcorp.com> wrote:

> Hey Darin,
>
> Pulling from master got rid of the errors I was seeing, however I'm
> running into a new issue.  After starting the resource manager, I see this
> in the logs:
>
> 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s)
> with profile medium
> 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations:
> Adding 1 NM instances to cluster
> 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler:
> Role '' is not present in the master's --roles
>
> My Mesos cluster has the default "*" role so I tried setting
> frameworkRole: "*" in myriad-config-default.yml, restarted the resource
> manager and got this error:
>
> 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler:
> Exception thrown while trying to create a task for nm
> java.lang.IllegalArgumentException: n must be positive
>     at java.util.Random.nextInt(Random.java:300)
>     at org.apache.myriad.scheduler.resource.RangeResource.
> getRandomValues(RangeResource.java:128)
>     at org.apache.myriad.scheduler.resource.RangeResource.
> consumeResource(RangeResource.java:99)
>     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.
> consumePorts(ResourceOfferContainer.java:171)
>     at org.apache.myriad.scheduler.NMTaskFactory.createTask(
> NMTaskFactory.java:45)
>     at org.apache.myriad.scheduler.event.handlers.
> ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
>     at org.apache.myriad.scheduler.event.handlers.
> ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
>     at com.lmax.disruptor.BatchEventProcessor.run(
> BatchEventProcessor.java:128)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> Does Myriad require its own role in Mesos?
>
> Thanks,
> Matt
>
>

RE: Resource manager error

Posted by "Matthew J. Loppatto" <ml...@keywcorp.com>.
Hey Darin,

Pulling from master got rid of the errors I was seeing; however, I'm running into a new issue.  After starting the resource manager, I see this in the logs:

2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles

My Mesos cluster has the default "*" role so I tried setting frameworkRole: "*" in myriad-config-default.yml, restarted the resource manager and got this error:

2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
java.lang.IllegalArgumentException: n must be positive
    at java.util.Random.nextInt(Random.java:300)
    at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
    at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
    at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
    at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
    at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
    at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
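
A note on the trace above: java.util.Random.nextInt(int) throws IllegalArgumentException whenever its bound is zero or negative. One plausible reading is therefore that RangeResource.getRandomValues was asked to choose from an empty set of values, for example an offer carrying no usable port range. That reading is an assumption drawn from the stack trace alone, not from Myriad's source. The sketch below reproduces the failure mode in isolation and shows one way to guard against it; RandomBoundSketch, rejectsBound, and pickRandom are hypothetical names, not Myriad code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; none of these names come from Myriad.
public class RandomBoundSketch {

    // Returns true if Random.nextInt(bound) throws IllegalArgumentException.
    public static boolean rejectsBound(int bound) {
        try {
            new Random().nextInt(bound);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Picks `count` distinct values at random, failing with a descriptive
    // message instead of letting nextInt see a non-positive bound.
    public static List<Long> pickRandom(List<Long> available, int count, Random random) {
        if (count < 0 || count > available.size()) {
            throw new IllegalStateException(
                "requested " + count + " values but only " + available.size() + " available");
        }
        List<Long> pool = new ArrayList<>(available);
        List<Long> picked = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // pool.size() >= 1 here because count <= available.size()
            picked.add(pool.remove(random.nextInt(pool.size())));
        }
        return picked;
    }

    public static void main(String[] args) {
        System.out.println(rejectsBound(0)); // prints "true"
        List<Long> ports = new ArrayList<>();
        ports.add(31000L);
        ports.add(31001L);
        ports.add(31002L);
        System.out.println(pickRandom(ports, 2, new Random()).size()); // prints "2"
    }
}
```

A guard like the one in pickRandom would turn the bare "n must be positive" into a message that says how many values were requested and how many were on offer, which may make a role or resource mismatch easier to spot in the logs.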

Does Myriad require its own role in Mesos?

Thanks,
Matt


-----Original Message-----
From: Darin Johnson [mailto:dbjohnson1978@gmail.com] 
Sent: Tuesday, August 16, 2016 6:18 PM
To: Dev
Subject: Re: Resource manager error

Hey Matthew, my coworker found the same issue recently. I fixed it in my last pull request, if you'd like to pull from master.

Alternatively, you could comment out the appendCgroups line in myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl <https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler> and rebuild.

Sorry that missed my QA; unfortunately I'm always using cgroups and didn't test that.  We may do a 0.2.1 release but I can't say when.

Darin

On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" <ml...@keywcorp.com>
wrote:

> Hi,
>
>
>
> I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_
> confluence_display_MYRIAD_&d=CwIFaQ&c=31nHN1tvZeuWBT6LwDN4Ngk1qezfsYHy
> olgGeY2ZhlU&r=D2bc6ANY3sIFSxaGDnPt52V5PqDlabKIPtzNhAIfJCs&m=ibxhOZQSsK
> tyVi5iruY8ImkW7bQ8zOrHcuDTLL7GBwA&s=LaQad9p3ZI3Rt5cTn3kHAb58BuSD5etwIm
> WZHzFz6Sk&e=
> Installing+for+Developers
>
>
>
> And I get the following error in the resource manager executor log in 
> mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
>
>
>
> chown: cannot access ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> No such file or directory
>
> env: /bin/yarn: No such file or directory
>
> ory
>
>
>
> It appears the ‘mesos’ directory doesn’t exist under /sys/fs/cgroup/cpu.
> Any ideas what the issue could be?
>
>
>
> This is my yarn-site.xml:
>
>
>
> <configuration>
>
> <!-- Site-specific YARN configuration properties -->
>
>    <property>
>
>        <name>yarn.nodemanager.aux-services</name>
>
>        <value>mapreduce_shuffle,myriad_executor</value>
>
>        <!-- If using MapR distro, please use the following value:
>
>                     
> <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value>
> -->
>
>    </property>
>
>    <property>
>
>        
> <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>
>        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>
>    </property>
>
>    <property>
>
>        
> <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
>
>        
> <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
>
>        <value>2000</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
>
>        <value>10000</value>
>
>    </property>
>
>    <property>
>
>        
> <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
>
>        <value>1000</value>
>
>    </property>
>
> <!-- Needed for Fine Grain Scaling -->
>
>    <property>
>
>        <name>yarn.scheduler.minimum-allocation-vcores</name>
>
>        <value>0</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.scheduler.minimum-allocation-mb</name>
>
>        <value>0</value>
>
>    </property>
>
> <!-- Site specific YARN configuration properties -->
>
> <property>
>
>    <name>yarn.nodemanager.resource.cpu-vcores</name>
>
>    <value>${nodemanager.resource.cpu-vcores}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.resource.memory-mb</name>
>
>    <value>${nodemanager.resource.memory-mb}</value>
>
> </property>
>
> <!--These options enable dynamic port assignment by mesos -->
>
> <property>
>
>    <name>yarn.nodemanager.address</name>
>
>    <value>${myriad.yarn.nodemanager.address}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.webapp.address</name>
>
>    <value>${myriad.yarn.nodemanager.webapp.address}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.webapp.https.address</name>
>
>    <value>${myriad.yarn.nodemanager.webapp.address}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.localizer.address</name>
>
>    <value>${myriad.yarn.nodemanager.localizer.address}</value>
>
> </property>
>
> <!-- Configure Myriad Scheduler here -->
>
> <property>
>
>    <name>yarn.resourcemanager.scheduler.class</name>
>
>    <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
>
>    <description>One can configure other scehdulers as well from 
> following
> list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
>
> </property>
>
> <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
>
> <property>
>
>    <name>yarn.nodemanager.pmem-check-enabled</name>
>
>    <value>false</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.vmem-check-enabled</name>
>
>    <value>false</value>
>
> </property>
>
> </configuration>
>
>
>
>
>
> My myriad-config-default.yml:
>
>
>
> mesosMaster: zk://myip:2181/mesos
>
> checkpoint: false
>
> frameworkFailoverTimeout: 43200000
>
> frameworkName: MyriadAlpha
>
> frameworkRole:
>
> frameworkUser: root     # User the Node Manager runs as, required if
> nodeManagerURI set, otherwise defaults to the user
>
>                          # running the resource manager.
>
> frameworkSuperUser: root  # To be depricated, currently permissions 
> need set by a superuser due to Mesos-1790.  Must be
>
>                          # root or have passwordless sudo. Required if 
> nodeManagerURI set, ignored otherwise.
>
> nativeLibrary: /usr/local/lib/libmesos.so
>
> zkServers: myip:2181
>
> zkTimeout: 20000
>
> restApiPort: 8192
>
> servedConfigPath: dist/config.tgz
>
> servedBinaryPath: dist/binary.tgz
>
> profiles:
>
> zero:  # NMs launched with this profile dynamically obtain cpu/mem 
> from Mesos
>
>    cpu: 0
>
>    mem: 0
>
> small:
>
>    cpu: 2
>
>    mem: 2048
>
> medium:
>
>    cpu: 4
>
>    mem: 4096
>
> large:
>
>    cpu: 10
>
>    mem: 12288
>
> nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
>
> medium: 1 # <profile_name : instances>
>
> rebalancer: false
>
> haEnabled: false
>
> nodemanager:
>
> jvmMaxMemoryMB: 1024
>
> cpus: 0.2
>
> cgroups: false
>
> executor:
>
> jvmMaxMemoryMB: 256
>
> path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
>
> #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
>
> #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
>
> #configUri: http://127.0.0.1/api/arifacts/config.tgz
>
> #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
>
> yarnEnvironment:
>
> YARN_HOME: /opt/hadoop-2.7.2
>
>
>
>
>
> Thanks!
>
> Matt
>

Re: Resource manager error

Posted by Darin Johnson <db...@gmail.com>.
Hey Matthew, my coworker hit the same issue recently. I fixed it in my
last pull request, if you'd like to pull from master.

Alternatively, you could comment out the appendCgroups line in
myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
<https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler>
and rebuild.
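
For readers following along, the general shape of such a fix is to emit the
cgroup setup step only when cgroups are enabled in the Myriad config
(`cgroups: false` in the config quoted below). The following is a minimal
Python sketch of that guard, not Myriad's actual Java code: the function
`build_nm_command`, its parameters, and the exact shell commands are
hypothetical and chosen only for illustration.

```python
def build_nm_command(yarn_home, cgroups_enabled, cgroup_dir=None, user="root"):
    """Build a shell command line that launches a NodeManager.

    Hypothetical illustration only: the cgroup setup step is added
    solely when cgroups are enabled, so a host without a
    /sys/fs/cgroup/cpu/mesos hierarchy never runs chown on a missing path.
    """
    parts = []
    if cgroups_enabled:
        if not cgroup_dir:
            raise ValueError("cgroup_dir is required when cgroups are enabled")
        # Only touch the cgroup hierarchy when the feature is switched on.
        parts.append(f"chown {user} {cgroup_dir}")
    # The launch step itself is unconditional.
    parts.append(f"{yarn_home}/bin/yarn nodemanager")
    return " && ".join(parts)


if __name__ == "__main__":
    # With cgroups disabled, no chown step appears in the command.
    print(build_nm_command("/opt/hadoop-2.7.2", cgroups_enabled=False))
```

With the flag off, the sketch returns only the launch command, which is the
behaviour the workaround above approximates by removing the cgroup step
entirely.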

Sorry that slipped past my QA. Unfortunately, I always run with cgroups
enabled and didn't test with them disabled.  We may do a 0.2.1 release, but
I can't say when.

Darin

On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" <ml...@keywcorp.com>
wrote:

> Hi,
>
>
>
> I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
>
>
>
> And I get the following error in the resource manager executor log in
> mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
>
>
>
> chown: cannot access ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> No such file or directory
>
> env: /bin/yarn: No such file or directory
>
> ory
>
>
>
> It appears the ‘mesos’ directory doesn’t exist under /sys/fs/cgroup/cpu.
> Any ideas what the issue could be?
>
>
>
> This is my yarn-site.xml:
>
>
>
> <configuration>
>
> <!-- Site-specific YARN configuration properties -->
>
>    <property>
>
>        <name>yarn.nodemanager.aux-services</name>
>
>        <value>mapreduce_shuffle,myriad_executor</value>
>
>        <!-- If using MapR distro, please use the following value:
>
>                     <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value>
> -->
>
>    </property>
>
>    <property>
>
>        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>
>        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
>
>        <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
>
>        <value>2000</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
>
>        <value>10000</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
>
>        <value>1000</value>
>
>    </property>
>
> <!-- Needed for Fine Grain Scaling -->
>
>    <property>
>
>        <name>yarn.scheduler.minimum-allocation-vcores</name>
>
>        <value>0</value>
>
>    </property>
>
>    <property>
>
>        <name>yarn.scheduler.minimum-allocation-mb</name>
>
>        <value>0</value>
>
>    </property>
>
> <!-- Site specific YARN configuration properties -->
>
> <property>
>
>    <name>yarn.nodemanager.resource.cpu-vcores</name>
>
>    <value>${nodemanager.resource.cpu-vcores}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.resource.memory-mb</name>
>
>    <value>${nodemanager.resource.memory-mb}</value>
>
> </property>
>
> <!--These options enable dynamic port assignment by mesos -->
>
> <property>
>
>    <name>yarn.nodemanager.address</name>
>
>    <value>${myriad.yarn.nodemanager.address}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.webapp.address</name>
>
>    <value>${myriad.yarn.nodemanager.webapp.address}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.webapp.https.address</name>
>
>    <value>${myriad.yarn.nodemanager.webapp.address}</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.localizer.address</name>
>
>    <value>${myriad.yarn.nodemanager.localizer.address}</value>
>
> </property>
>
> <!-- Configure Myriad Scheduler here -->
>
> <property>
>
>    <name>yarn.resourcemanager.scheduler.class</name>
>
>    <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
>
>    <description>Other schedulers can be configured as well, from the
> following list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
>
> </property>
>
> <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
>
> <property>
>
>    <name>yarn.nodemanager.pmem-check-enabled</name>
>
>    <value>false</value>
>
> </property>
>
> <property>
>
>    <name>yarn.nodemanager.vmem-check-enabled</name>
>
>    <value>false</value>
>
> </property>
>
> </configuration>
>
>
>
>
>
> My myriad-config-default.yml:
>
>
>
> mesosMaster: zk://myip:2181/mesos
>
> checkpoint: false
>
> frameworkFailoverTimeout: 43200000
>
> frameworkName: MyriadAlpha
>
> frameworkRole:
>
> frameworkUser: root     # User the Node Manager runs as; required if nodeManagerURI is set, otherwise defaults to the user
>
>                          # running the resource manager.
>
> frameworkSuperUser: root  # To be deprecated; currently permissions need to be set by a superuser due to Mesos-1790. Must be
>
>                          # root or have passwordless sudo. Required if nodeManagerURI is set, ignored otherwise.
>
> nativeLibrary: /usr/local/lib/libmesos.so
>
> zkServers: myip:2181
>
> zkTimeout: 20000
>
> restApiPort: 8192
>
> servedConfigPath: dist/config.tgz
>
> servedBinaryPath: dist/binary.tgz
>
> profiles:
>
> zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
>
>    cpu: 0
>
>    mem: 0
>
> small:
>
>    cpu: 2
>
>    mem: 2048
>
> medium:
>
>    cpu: 4
>
>    mem: 4096
>
> large:
>
>    cpu: 10
>
>    mem: 12288
>
> nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
>
> medium: 1 # <profile_name : instances>
>
> rebalancer: false
>
> haEnabled: false
>
> nodemanager:
>
> jvmMaxMemoryMB: 1024
>
> cpus: 0.2
>
> cgroups: false
>
> executor:
>
> jvmMaxMemoryMB: 256
>
> path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
>
> #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
>
> #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
>
> #configUri: http://127.0.0.1/api/arifacts/config.tgz
>
> #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
>
> yarnEnvironment:
>
> YARN_HOME: /opt/hadoop-2.7.2
>
>
>
>
>
> Thanks!
>
> Matt
>