Posted to dev@samza.apache.org by Malcolm McFarland <mm...@cavulus.com> on 2019/04/01 09:33:23 UTC

Running w/ multiple CPUs/container on YARN

Hey Folks,

I'm having some issues getting multiple cores for containers in YARN.
I seem to have my YARN settings correct, and the RM interface says
that I have 24 vcores available. However, when I set the
cluster-manager.container.cpu.cores Samza setting to anything other
than 1, I get a message about how the container is requesting more
resources than it can allocate. With 1 core, everything is fine. Is
there another Samza option I need to set?

Cheers,
Malcolm


-- 
Malcolm McFarland
Cavulus
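
For context, the vcore request sits in the Samza job properties next to
the container memory request. A minimal sketch of the relevant settings
(values are examples, not taken from this thread):

    # Samza job properties (sketch; values are illustrative)
    job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
    # memory and vcores requested for each Samza container
    cluster-manager.container.memory.mb=2048
    cluster-manager.container.cpu.cores=2

The request only succeeds if it fits under the maximums the RM
advertises, which is where the thread below ends up.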

Re: Running w/ multiple CPUs/container on YARN

Posted by Prateek Maheshwari <pr...@gmail.com>.
Glad you were able to figure it out. FWIW, I had the same interpretation as
you. Let us know if you need anything else.

- Prateek

On Tue, Apr 2, 2019 at 4:55 PM Malcolm McFarland <mm...@cavulus.com>
wrote:

> Found the issue, and thank goodness it was a configuration issue on my
> end: I was setting yarn.scheduler.maximum-allocation-vcores too low and
> artificially constraining the cluster. This is (as the name implies) a
> cluster-wide cap on the vcores any single container may be allocated; I
> had interpreted the description in the v2.6.1 docs (which I was
> initially using because of its inclusion in the hello-samza project) to
> mean that it was a per-container setting.
>
> Thanks again for the help, and for the tip on upgrading to YARN 2.7.6!
>
> Cheers,
> Malcolm
>
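
For reference, a yarn-site.xml arrangement consistent with the fix
described above (values are examples: each NodeManager offers 8 vcores,
and any single container may request up to 4):

    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>8</value>
    </property>
    <!-- cap on any single container request; keep this at or below the
         per-node vcore capacity, and at or above your largest request -->
    <property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>4</value>
    </property>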
> On Tue, Apr 2, 2019 at 1:47 PM Malcolm McFarland <mm...@cavulus.com>
> wrote:
>
> > Interestingly, I just tried setting
> > yarn.scheduler.minimum-allocation-vcores=2 and restarting everything.
> > On startup, the RM now displays a Minimum Allocation of <memory:256,
> > vCores:2>, but my application container still shows "Resource:4096
> > Memory, 1 VCores". The statistics page for the "default" queue shows
> > "Used Resources:<memory:13312, vCores:6>...Num Containers:6", which is
> > accurate (3 tasks + 3 AMs).
> >
> > This seems like a long shot, but is there a chance that I'm reading
> > this incorrectly, and that YARN will show CPU usage when the processes
> > actually start processing -- i.e., is the resource allocation shown
> > on-demand, as opposed to preemptive?
> >
> > Cheers,
> > Malcolm
> >
> >
> > On Tue, Apr 2, 2019 at 12:54 PM Malcolm McFarland
> > <mmcfarland@cavulus.com> wrote:
> >
> >> Hi Prateek,
> >>
> >> I'm not getting an error now, just an unyielding vcore allotment of 1.
> >> I just verified that we're setting
> >>
> >> yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
> >> and
> >> yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
> >> via the configuration tab in the RM UI. (Interestingly, although the
> >> resource-calculator property is discussed on the page to which you
> >> linked, it's not on the yarn-default.xml reference page for v2.7.6.)
> >> Pretty much every configuration option related to max-vcores is set to
> >> a number >1, as is the cluster-manager.container.cpu.cores setting in
> >> Samza. So: no errors, but still just a single core per container.
> >>
> >> Here's a question: in Samza, are the cluster resource allocation
> >> properties pulled from a properties file in the application bundle, or
> >> are they sourced from the properties file that is passed to run-app.sh
> >> (and used to submit the task to YARN)? Are there any other properties
> >> for which this would make a difference?
> >>
> >> Cheers,
> >> Malcolm
> >>
> >>
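
The resource-calculator property is a CapacityScheduler setting and
normally lives in capacity-scheduler.xml rather than yarn-site.xml,
which would explain why it does not appear in the yarn-default.xml
reference. A minimal sketch:

    <!-- capacity-scheduler.xml -->
    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>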
> >> On Tue, Apr 2, 2019 at 9:50 AM Prateek Maheshwari
> >> <prateekmi2@gmail.com> wrote:
> >> >
> >> > And just to double check, you also changed the
> >> > yarn.resourcemanager.scheduler.class to CapacityScheduler?
> >> >
> >> > On Tue, Apr 2, 2019 at 9:49 AM Prateek Maheshwari
> >> > <prateekmi2@gmail.com> wrote:
> >> >
> >> > > Is it still the same message from the AM? The one that says: "Got
> >> > > AM register response. The YARN RM supports container requests with
> >> > > max-mem: 14336, max-cpu: 1"
> >> > >
> >> > > On Tue, Apr 2, 2019 at 12:09 AM Malcolm McFarland
> >> > > <mmcfarland@cavulus.com> wrote:
> >> > >
> >> > >> Hey Prateek,
> >> > >>
> >> > >> The upgrade to Hadoop 2.7.6 went fine; everything seems to be
> >> > >> working, and access to S3 via an access key/secret pair is
> >> > >> working as well. However, my requested tasks are still only
> >> > >> getting allocated 1 core, despite requesting more than that.
> >> > >> Once again, I have a 3-node cluster that should have 24 vcores
> >> > >> available; on the YARN side, I have these options set:
> >> > >>
> >> > >> yarn.nodemanager.resource.cpu-vcores=8
> >> > >> yarn.scheduler.minimum-allocation-vcores=1
> >> > >> yarn.scheduler.maximum-allocation-vcores=4
> >> > >> yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
> >> > >>
> >> > >> And on the Samza side, I'm setting:
> >> > >>
> >> > >> cluster-manager.container.cpu.cores=2
> >> > >>
> >> > >> However, YARN is still telling me that the running task has 1
> >> > >> vcore assigned. Do you have any other suggestions for options to
> >> > >> tweak?
> >> > >>
> >> > >> Cheers,
> >> > >> Malcolm
> >> > >>
> >> > >>
> >> > >> On Mon, Apr 1, 2019 at 5:28 PM Malcolm McFarland
> >> > >> <mmcfarland@cavulus.com> wrote:
> >> > >>
> >> > >> > One more thing -- fwiw, I actually also came across the
> >> > >> > possibility that I would need to use the
> >> > >> > DominantResourceCalculator, but as you point out, this doesn't
> >> > >> > seem to be available in Hadoop 2.6.
> >> > >> >
> >> > >> >
> >> > >> > On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland
> >> > >> > <mmcfarland@cavulus.com> wrote:
> >> > >> >
> >> > >> >> That's quite helpful! I actually initially tried using a
> >> > >> >> version of Hadoop > 2.6.x; when I did, it seemed like the AWS
> >> > >> >> credentials in YARN (fs.s3a.access.key, fs.s3a.secret.key)
> >> > >> >> weren't being accessed, as I received lots of "No AWS
> >> > >> >> Credentials provided by DefaultAWSCredentialsProviderChain"
> >> > >> >> messages. I found a way around this by providing the
> >> > >> >> credentials to the AM directly via
> >> > >> >> yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>,
> >> > >> >> but since this seemed very workaround-ish, I just assumed that
> >> > >> >> I would eventually hit other problems using a version of
> >> > >> >> Hadoop not pinned in the Samza repo. If you're running 2.7.x
> >> > >> >> at LinkedIn, however, I'll give it a shot again.
> >> > >> >>
> >> > >> >> Have you done any AWS credential integration, and if so, did
> >> > >> >> you need to do anything special to get it to work?
> >> > >> >>
> >> > >> >> Cheers,
> >> > >> >> Malcolm
> >> > >> >>
> >> > >> >>
> >> > >> >>
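
The standard home for these credentials is Hadoop's core-site.xml; a
sketch of the usual s3a setup (placeholder values, and it assumes the
matching hadoop-aws jar is on the classpath):

    <!-- core-site.xml -->
    <property>
      <name>fs.s3a.access.key</name>
      <value>ACCESS_KEY_HERE</value>
    </property>
    <property>
      <name>fs.s3a.secret.key</name>
      <value>SECRET_KEY_HERE</value>
    </property>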
> >> > >> >> On Mon, Apr 1, 2019 at 5:20 PM Prateek Maheshwari
> >> > >> >> <prateekmi2@gmail.com> wrote:
> >> > >> >> >
> >> > >> >> > Hi Malcolm,
> >> > >> >> >
> >> > >> >> > I think this is because in YARN 2.6 the FifoScheduler only
> >> > >> >> > accounts for memory for 'maximumAllocation':
> >> > >> >> > https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218
> >> > >> >> >
> >> > >> >> > This has been changed as early as 2.7.0:
> >> > >> >> > https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218
> >> > >> >> >
> >> > >> >> > So upgrading will likely fix this issue. For reference, at
> >> > >> >> > LinkedIn we are running YARN 2.7.2 with the CapacityScheduler
> >> > >> >> > (https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
> >> > >> >> > and DominantResourceCalculator to account for vcore
> >> > >> >> > allocations in scheduling.
> >> > >> >> >
> >> > >> >> > - Prateek
> >> > >> >> >
> >> > >> >> > On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland
> >> > >> >> > <mmcfarland@cavulus.com> wrote:
> >> > >> >> >
> >> > >> >> > > Hi Prateek,
> >> > >> >> > >
> >> > >> >> > > This still seems to be manifesting the same problem. Since
> >> > >> >> > > this seems to be something in the Hadoop codebase, I've
> >> > >> >> > > emailed the hadoop-dev mailing list about it.
> >> > >> >> > >
> >> > >> >> > > Cheers,
> >> > >> >> > > Malcolm
> >> > >> >> > >
> >> > >> >> > > On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari
> >> > >> >> > > <prateekmi2@gmail.com> wrote:
> >> > >> >> > >
> >> > >> >> > > > Hi Malcolm,
> >> > >> >> > > >
> >> > >> >> > > > Yes, the AM is just reporting what the RM specified as
> >> > >> >> > > > the maximum allowed request size.
> >> > >> >> > > >
> >> > >> >> > > > I think 'yarn.scheduler.maximum-allocation-vcores' needs
> >> > >> >> > > > to be less than 'yarn.nodemanager.resource.cpu-vcores',
> >> > >> >> > > > since a container must fit on a single NM. Maybe the RM
> >> > >> >> > > > detected this and decided to default to 1? Can you try
> >> > >> >> > > > setting maximum-allocation-vcores lower?
> >> > >> >> > > >
> >> > >> >> > > > - Prateek
> >> > >> >> > > >
> >> > >> >> > > > On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland
> >> > >> >> > > > <mmcfarland@cavulus.com> wrote:
> >> > >> >> > > >
> >> > >> >> > > > > One other detail: I'm running YARN on ECS in AWS. Has
> >> > >> >> > > > > anybody seen issues with core allocation in this
> >> > >> >> > > > > environment? I'm seeing this in the Samza log:
> >> > >> >> > > > >
> >> > >> >> > > > > "Got AM register response. The YARN RM supports
> >> > >> >> > > > > container requests with max-mem: 14336, max-cpu: 1"
> >> > >> >> > > > >
> >> > >> >> > > > > How does Samza determine this? Looking at the Samza
> >> > >> >> > > > > source on GitHub, it appears to be information that's
> >> > >> >> > > > > passed back to the AM when it starts up.
> >> > >> >> > > > >
> >> > >> >> > > > > Cheers,
> >> > >> >> > > > > Malcolm
> >> > >> >> > > > >
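
Those max-mem/max-cpu figures come from YARN's AM-RM registration
handshake: when the AM registers, the RM's response carries the
scheduler's maximum per-request allocation, and the Samza AM logs it. A
minimal Java sketch of the underlying client API (not Samza's actual
code, and it only works inside a container launched as an AM, since
registration requires the AMRMToken):

    import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class MaxCapabilityCheck {
      public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
        amClient.init(new YarnConfiguration());
        amClient.start();
        // The RM's registration response carries the scheduler's
        // per-request ceiling -- the numbers the Samza AM logs.
        RegisterApplicationMasterResponse resp =
            amClient.registerApplicationMaster("localhost", 0, "");
        Resource max = resp.getMaximumResourceCapability();
        System.out.println("max-mem: " + max.getMemory()
            + ", max-cpu: " + max.getVirtualCores());
        amClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
        amClient.stop();
      }
    }

A max-cpu of 1 here despite 8 vcores per node lines up with the
FifoScheduler behavior Prateek points to elsewhere in this thread.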
> >> > >> >> > > > > On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland
> >> > >> >> > > > > <mm...@cavulus.com> wrote:
> >> > >> >> > > > > >
> >> > >> >> > > > > > Hi Prateek,
> >> > >> >> > > > > >
> >> > >> >> > > > > > Sorry, meant to include these versions with my email;
> >> > >> >> > > > > > I'm running Samza 0.14 and Hadoop 2.6.1. I'm running
> >> > >> >> > > > > > three containers across 3 node managers, each with
> >> > >> >> > > > > > 16GB and 8 vcores. The other two containers are
> >> > >> >> > > > > > requesting 1 vcore each; even with the AMs running,
> >> > >> >> > > > > > that should be 4 for them in total, leaving plenty of
> >> > >> >> > > > > > processing power available.
> >> > >> >> > > > > >
> >> > >> >> > > > > > The error is in the application attempt diagnostics
> >> > >> >> > > > > > field: "The YARN cluster is unable to run your job
> >> > >> >> > > > > > due to unsatisfiable resource requirements. You asked
> >> > >> >> > > > > > for mem: 2048, and cpu: 2." I do not see this error
> >> > >> >> > > > > > with the same memory request, but a cpu count request
> >> > >> >> > > > > > of 1.
> >> > >> >> > > > > >
> >> > >> >> > > > > > Here are the configuration options pertaining to
> >> > >> >> > > > > > resource allocation:
> >> > >> >> > > > > >
> >> > >> >> > > > > > <?xml version="1.0"?>
> >> > >> >> > > > > > <configuration>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.resourcemanager.scheduler.class</name>
> >> > >> >> > > > > >     <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.nodemanager.vmem-check-enabled</name>
> >> > >> >> > > > > >     <value>false</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.nodemanager.vmem-pmem-ratio</name>
> >> > >> >> > > > > >     <value>2.1</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.nodemanager.resource.memory-mb</name>
> >> > >> >> > > > > >     <value>14336</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.scheduler.minimum-allocation-mb</name>
> >> > >> >> > > > > >     <value>256</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.scheduler.maximum-allocation-mb</name>
> >> > >> >> > > > > >     <value>14336</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.scheduler.minimum-allocation-vcores</name>
> >> > >> >> > > > > >     <value>1</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.scheduler.maximum-allocation-vcores</name>
> >> > >> >> > > > > >     <value>16</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.nodemanager.resource.cpu-vcores</name>
> >> > >> >> > > > > >     <value>8</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > >   <property>
> >> > >> >> > > > > >     <name>yarn.resourcemanager.cluster-id</name>
> >> > >> >> > > > > >     <value>processor-cluster</value>
> >> > >> >> > > > > >   </property>
> >> > >> >> > > > > > </configuration>
> >> > >> >> > > > > >
> >> > >> >> > > > > > Cheers,
> >> > >> >> > > > > > Malcolm
> >> > >> >> > > > > >
> >> > >> >> > > > > > On Mon, Apr 1, 2019 at 10:25 AM Prateek Maheshwari
> >> > >> >> > > > > > <prateekmi2@gmail.com> wrote:
> >> > >> >> > > > > > >
> >> > >> >> > > > > > > Hi Malcolm,
> >> > >> >> > > > > > >
> >> > >> >> > > > > > > Just setting that configuration should be
> >> > >> >> > > > > > > sufficient. We haven't seen this issue before. What
> >> > >> >> > > > > > > Samza/YARN versions are you using? Can you also
> >> > >> >> > > > > > > include the logs from where you get the error and
> >> > >> >> > > > > > > your YARN configuration?
> >> > >> >> > > > > > >
> >> > >> >> > > > > > > - Prateek
> >> > >> >> > > > > > >
> >> > >> >> > > > > > > On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland
> >> > >> >> > > > > > > <mmcfarland@cavulus.com> wrote:
> >> > >> >> > > > > > >
> >> > >> >> > > > > > > > Hey Folks,
> >> > >> >> > > > > > > >
> >> > >> >> > > > > > > > I'm having some issues getting multiple cores for
> >> > >> >> > > > > > > > containers in YARN. I seem to have my YARN
> >> > >> >> > > > > > > > settings correct, and the RM interface says that
> >> > >> >> > > > > > > > I have 24 vcores available. However, when I set
> >> > >> >> > > > > > > > the cluster-manager.container.cpu.cores Samza
> >> > >> >> > > > > > > > setting to anything other than 1, I get a message
> >> > >> >> > > > > > > > about how the container is requesting more
> >> > >> >> > > > > > > > resources than it can allocate. With 1 core,
> >> > >> >> > > > > > > > everything is fine. Is there another Samza option
> >> > >> >> > > > > > > > I need to set?
> >> > >> >> > > > > > > >
> >> > >> >> > > > > > > > Cheers,
> >> > >> >> > > > > > > > Malcolm
> >> > >> >> > > > > > > >
> >> > >> >> > > > > > > > --
> >> > >> >> > > > > > > > Malcolm McFarland
> >> > >> >> > > > > > > > Cavulus
> > >> >> > > >
> > >> >> > >
> > >> >> > >
> > >> >> > > --
> > >> >> > > Malcolm McFarland
> > >> >> > > Cavulus
> > >> >> > > 1-800-760-6915
> > >> >> > > mmcfarland@cavulus.com
> > >> >> > >
> > >> >> > >
> > >> >> > > This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus.
> Any
> > >> >> > > unauthorized or improper disclosure, copying, distribution, or
> use
> > >> of
> > >> >> the
> > >> >> > > contents of this message is prohibited. The information
> contained
> > >> in
> > >> >> this
> > >> >> > > message is intended only for the personal and confidential use
> of
> > >> the
> > >> >> > > recipient(s) named above. If you have received this message in
> > >> error,
> > >> >> > > please notify the sender immediately and delete the original
> > >> message.
> > >> >> > >
> > >> >>
> > >> >>
> > >> >>
> > >> >> --
> > >> >> Malcolm McFarland
> > >> >> Cavulus
> > >> >> 1-800-760-6915
> > >> >> mmcfarland@cavulus.com
> > >> >>
> > >> >>
> > >> >> This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
> > >> >> unauthorized or improper disclosure, copying, distribution, or use
> of
> > >> >> the contents of this message is prohibited. The information
> contained
> > >> >> in this message is intended only for the personal and confidential
> use
> > >> >> of the recipient(s) named above. If you have received this message
> in
> > >> >> error, please notify the sender immediately and delete the original
> > >> >> message.
> > >> >>
> > >> >
> > >> >
> > >> > --
> > >> > Malcolm McFarland
> > >> > Cavulus
> > >> > 1-800-760-6915
> > >> > mmcfarland@cavulus.com
> > >> >
> > >> >
> > >> > This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
> > >> > unauthorized or improper disclosure, copying, distribution, or use
> of
> > >> the
> > >> > contents of this message is prohibited. The information contained in
> > >> this
> > >> > message is intended only for the personal and confidential use of
> the
> > >> > recipient(s) named above. If you have received this message in
> error,
> > >> > please notify the sender immediately and delete the original
> message.
> > >> >
> > >>
> > >>
> > >> --
> > >> Malcolm McFarland
> > >> Cavulus
> > >> 1-800-760-6915
> > >> mmcfarland@cavulus.com
> > >>
> > >>
> > >> This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
> > >> unauthorized or improper disclosure, copying, distribution, or use of
> the
> > >> contents of this message is prohibited. The information contained in
> this
> > >> message is intended only for the personal and confidential use of the
> > >> recipient(s) named above. If you have received this message in error,
> > >> please notify the sender immediately and delete the original message.
> > >>
> > >
>
>
>
> --
> Malcolm McFarland
> Cavulus
> 1-800-760-6915
> mmcfarland@cavulus.com
>
>
> This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
> unauthorized or improper disclosure, copying, distribution, or use of
> the contents of this message is prohibited. The information contained
> in this message is intended only for the personal and confidential use
> of the recipient(s) named above. If you have received this message in
> error, please notify the sender immediately and delete the original
> message.
>


-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com


Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
Hi Prateek,

I'm not getting an error now, just an unyielding vcore allotment of 1.
I just verified, via the configuration tab in the RM UI, that we're
setting:

yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

(Interestingly, although the resource-calculator property is discussed
on the page to which you linked, it's not on the yarn-default.xml
reference page for v2.7.6.) Pretty much every configuration option
related to max vcores is set to a number >1, as is the
cluster-manager.container.cpu.cores setting in Samza. So although there
are no errors, I'm still getting just a single core per container.
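
For reference, here is roughly how those two settings are laid out in
my config files. This is just a sketch, assuming the stock Hadoop 2.7.x
layout, where the yarn.scheduler.capacity.* properties conventionally
live in capacity-scheduler.xml rather than yarn-site.xml:

<!-- sketch: yarn-site.xml on the RM -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<!-- sketch: capacity-scheduler.xml -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>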

Here's a question: in Samza, are the cluster resource allocation
properties pulled from a properties file in the application bundle, or
are they sourced from the properties file that is passed to run-app.sh
(and used to submit the task to YARN)? Are there any other properties
for which this would make a difference?
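
For context, the job is submitted with something like the following (a
sketch based on the hello-samza layout; the paths and the properties
file name here are illustrative, not my actual ones):

# paths and file name illustrative (hello-samza-style layout)
deploy/samza/bin/run-app.sh \
  --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
  --config-path=file://$PWD/deploy/samza/config/my-app.properties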

Cheers,
Malcolm


-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com


Re: Running w/ multiple CPUs/container on YARN

Posted by Prateek Maheshwari <pr...@gmail.com>.
And just to double-check: you also changed the
yarn.resourcemanager.scheduler.class to CapacityScheduler?
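
That is, something like this in yarn-site.xml on the RM host (a sketch;
the class name is as given in the Hadoop docs):

<!-- sketch: yarn-site.xml on the RM host -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>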


Re: Running w/ multiple CPUs/container on YARN

Posted by Prateek Maheshwari <pr...@gmail.com>.
Is it still the same message from the AM? The one that says: "Got AM
register response. The YARN RM supports container requests with max-mem:
14336, max-cpu: 1"


Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
Hey Prateek,

The upgrade to Hadoop 2.7.6 went fine; everything seems to be working, and
access to S3 via an access key/secret pair is working as well. However, my
requested tasks are still only getting allocated 1 core, despite requesting
more than that. Once again, I have a 3-node cluster that should have 24
vcores available; on the YARN side, I have these options set:

yarn.nodemanager.resource.cpu-vcores=8
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

And on the Samza side, I'm setting:

cluster-manager.container.cpu.cores=2

However, YARN is still telling me that the running task has 1 vcore
assigned. Do you have any other suggestions for options to tweak?
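
For reference, the per-app allocation is also visible outside the UI
via the RM REST API (a sketch; the host name is illustrative, 8088 is
the default RM webapp port, and the "allocatedVCores" field in the
response is the number to look at):

# host 'rm-host' is illustrative; 8088 is the default RM webapp port
curl -s 'http://rm-host:8088/ws/v1/cluster/apps?states=RUNNING'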

Cheers,
Malcolm




-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com



Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
One more thing -- fwiw, I actually also came across the possibility that I
would need to use the DominantResourceCalculator, but as you point out,
this doesn't seem to be available in Hadoop 2.6.




-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com



Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
That's quite helpful! I actually initially tried using a version of
Hadoop > 2.6.x; when I did, it seemed like the AWS credentials in YARN
(fs.s3a.access.key, fs.s3a.secret.key) weren't being accessed, as I
received lots of "No AWS Credentials
provided by DefaultAWSCredentialsProviderChain" messages. I found a
way around this by providing the credentials to the AM directly via
yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>, but
since this seemed very workaround-ish, I just assumed that I would
eventually hit other problems using a version of Hadoop not pinned in
the Samza repo. If you're running 2.7.x at LinkedIn, however, I'll
give it a shot again.
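
In config form, the workaround looked roughly like this (a sketch; the
task.opts line is an assumption on my part about passing the same system
properties to the task containers, not something I've verified):

# yarn.am.opts passes JVM system properties to the AM
yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>
# assumption: containers may need the same properties via task.opts
task.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>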

Have you done any AWS credential integration, and if so, did you need
to do anything special to get it to work?

Cheers,
Malcolm






--
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com



Re: Running w/ multiple CPUs/container on YARN

Posted by Prateek Maheshwari <pr...@gmail.com>.
Hi Malcolm,

I think this is because in YARN 2.6 the FifoScheduler only accounts for
memory for 'maximumAllocation':
https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

This was changed as of 2.7.0:
https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

So upgrading will likely fix this issue. For reference, at LinkedIn we are
running YARN 2.7.2 with the CapacityScheduler
<https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html>
and DominantResourceCalculator to account for vcore allocations in
scheduling.
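
In yarn-site.xml terms, that switch is roughly (a sketch; the
resource-calculator setting itself goes in capacity-scheduler.xml):

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>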

- Prateek


Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
Hi Prateek,

This still seems to be manifesting with the same problem. Since this seems
to be something in the hadoop codebase, I've emailed the hadoop-dev
mailing list about it.

Cheers,
Malcolm



-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com



Re: Running w/ multiple CPUs/container on YARN

Posted by Prateek Maheshwari <pr...@gmail.com>.
Hi Malcolm,

Yes, the AM is just reporting what the RM specified as the maximum allowed
request size.

I think 'yarn.scheduler.maximum-allocation-vcores' needs to be no greater
than 'yarn.nodemanager.resource.cpu-vcores', since a container must fit on a
single NM. Maybe the RM detected this and decided to default to 1? Can you
try setting maximum-allocation-vcores lower?
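
Concretely, something along these lines in yarn-site.xml (a sketch,
matching the 8 vcores per NM in your config):

<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <!-- at most yarn.nodemanager.resource.cpu-vcores -->
  <value>8</value>
</property>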

- Prateek


Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
One other detail: I'm running YARN on ECS in AWS. Has anybody seen
issues with core allocation in this environment? I'm seeing this in
the samza log:

"Got AM register response. The YARN RM supports container requests
with max-mem: 14336, max-cpu: 1"

How does samza determine this? Looking at the Samza source on Github,
it appears to be information that's passed back to the AM when it
starts up.
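
If it's useful, here's a minimal sketch of where that value comes from on
the AM side -- the RM returns it in the RegisterApplicationMasterResponse.
This is just my reading of the YARN client API, not Samza's actual code,
and it only works when run from inside a launched AM container:

// Illustrative only; requires the AMRMToken of a running AM.
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class MaxCapabilityProbe {
  public static void main(String[] args) throws Exception {
    AMRMClient<AMRMClient.ContainerRequest> amClient = AMRMClient.createAMRMClient();
    amClient.init(new YarnConfiguration());
    amClient.start();
    // The RM reports its maximum allowed request size at registration time.
    RegisterApplicationMasterResponse resp =
        amClient.registerApplicationMaster("", -1, "");
    Resource max = resp.getMaximumResourceCapability();
    System.out.println("max-mem: " + max.getMemory()
        + ", max-cpu: " + max.getVirtualCores());
    amClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    amClient.stop();
  }
}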

Cheers,
Malcolm



-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com



Re: Running w/ multiple CPUs/container on YARN

Posted by Malcolm McFarland <mm...@cavulus.com>.
Hi Prateek,

Sorry, meant to include these versions with my email; I'm running
Samza 0.14 and Hadoop 2.6.1. I'm running three containers across 3
node managers, each with 16GB and 8 vcores. The other two containers
are requesting 1 vcore each; even with the AMs running, that should be
4 vcores for them in total, leaving plenty of processing power available.

The error is in the application attempt diagnostics field: "The YARN
cluster is unable to run your job due to unsatisfiable resource
requirements. You asked for mem: 2048, and cpu: 2." I do not see this
error with the same memory request, but a cpu count request of 1.
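
In Samza config terms, the failing vs. working requests differ only in the
vcore count; a sketch (cluster-manager.container.memory.mb is the memory
knob as I understand it):

# fails on this cluster:
cluster-manager.container.memory.mb=2048
cluster-manager.container.cpu.cores=2
# works:
cluster-manager.container.memory.mb=2048
cluster-manager.container.cpu.cores=1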

Here are the configuration options pertaining to resource allocation:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>16</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>processor-cluster</value>
  </property>
</configuration>

Cheers,
Malcolm




-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarland@cavulus.com



Re: Running w/ multiple CPUs/container on YARN

Posted by Prateek Maheshwari <pr...@gmail.com>.
Hi Malcolm,

Just setting that configuration should be sufficient. We haven't seen this
issue before. What Samza/YARN versions are you using? Can you also include
the logs from where you get the error and your yarn configuration?

- Prateek
