Posted to user@hadoop.apache.org by "rajila2008 ." <ra...@gmail.com> on 2017/07/24 22:16:23 UTC

YARN - level/depth of monitoring info - newbie question

Hi all,

Does YARN provide application-level info?

For example: a MapReduce job persists its output in a NoSQL datastore by
executing an INSERT command. Can YARN report the execution time of that
INSERT when the application itself is not logging it anywhere?

There's some argument at my workplace: the dev team is asking prod-support
to find such info through the YARN logs.

I believe YARN's resource reporting is similar to the Unix "top" command,
but at cluster level. "top" gives system-level info, not how many INSERTs
a job executed. Similarly, YARN will not give application-specific info
such as the number of INSERT operations, the record count, or an array
size within a job; the application needs to log such info as needed.

Could anyone please clarify whether my understanding is correct?

Regards,
Rajila

Re: YARN - level/depth of monitoring info - newbie question

Posted by Naganarasimha Garla <na...@apache.org>.
Hi Rajila,
Sorry for the delayed reply. You can refer to
http://hadoop.apache.org/docs/r3.0.0-alpha4/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Counters

More detailed info is available in the book "Hadoop: The Definitive
Guide, 4th Edition" -> Chapter 9 "MapReduce Features" -> "Counters" ->
"User Defined Java Counter".

Regards,
+ Naga

On Wed, Jul 26, 2017 at 9:27 AM, rajila2008 . <ra...@gmail.com> wrote:

> Thank you, Naga & Sunil.
>
> Naga, I would like to know more about the counters. Are they a
> cluster-wide resource managed at a central location, so that they can be
> tracked/verified later?
>
> Please advise.
>
> Thanks,
> Rajila

Re: YARN - level/depth of monitoring info - newbie question

Posted by "rajila2008 ." <ra...@gmail.com>.
Thank you, Naga & Sunil.

Naga, I would like to know more about the counters. Are they a cluster-wide
resource managed at a central location, so that they can be
tracked/verified later?

Please advise.

Thanks,
Rajila

On Tue, Jul 25, 2017 at 7:01 PM, Naganarasimha Garla <
naganarasimha_gr@apache.org> wrote:

> Hi Rajila,
> One option is to use custom "counters" and add logic to increment them
> whenever you do an insert or run any other custom logic. These counters
> can be read through the MR interfaces and also in the web UI, even after
> the job has finished.
>
> Regards,
> + Naga

Re: YARN - level/depth of monitoring info - newbie question

Posted by Naganarasimha Garla <na...@apache.org>.
Hi Rajila,
One option is to use custom "counters" and add logic to increment them
whenever you do an insert or run any other custom logic. These counters
can be read through the MR interfaces and also in the web UI, even after
the job has finished.
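
The counter idea above can be sketched in plain Java. This is only a
model with no Hadoop dependency, so it runs on its own; the names
InsertCounters, InsertCounter, timedInsert and aggregate are hypothetical
and chosen for illustration. In a real MapReduce task the increments
would instead go through
context.getCounter(InsertCounter.INSERT_COUNT).increment(1), and the
framework would sum the per-task values into job-level totals.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.Map;

public class InsertCounters {
    /** Hypothetical counter names; a real job would define its own enum. */
    public enum InsertCounter { INSERT_COUNT, INSERT_NANOS }

    private final Map<InsertCounter, Long> totals = new EnumMap<>(InsertCounter.class);

    /** Adds amount to the named counter, creating it on first use. */
    public void increment(InsertCounter counter, long amount) {
        totals.merge(counter, amount, Long::sum);
    }

    /** Returns the current value, or 0 if the counter was never incremented. */
    public long get(InsertCounter counter) {
        return totals.getOrDefault(counter, 0L);
    }

    /** Runs one insert and records the attempt and its elapsed time. */
    public void timedInsert(Runnable insert) {
        long start = System.nanoTime();
        try {
            insert.run();
        } finally {
            // Recorded even if the insert throws, so failed attempts are counted too.
            increment(InsertCounter.INSERT_NANOS, System.nanoTime() - start);
            increment(InsertCounter.INSERT_COUNT, 1);
        }
    }

    /** Sums per-task counters into a single job-level view. */
    public static InsertCounters aggregate(Iterable<InsertCounters> tasks) {
        InsertCounters job = new InsertCounters();
        for (InsertCounters task : tasks) {
            for (InsertCounter counter : InsertCounter.values()) {
                job.increment(counter, task.get(counter));
            }
        }
        return job;
    }

    public static void main(String[] args) {
        InsertCounters task1 = new InsertCounters();
        InsertCounters task2 = new InsertCounters();
        Runnable fakeInsert = () -> { };  // stand-in for the real datastore call
        for (int i = 0; i < 3; i++) task1.timedInsert(fakeInsert);
        for (int i = 0; i < 2; i++) task2.timedInsert(fakeInsert);

        InsertCounters job = aggregate(Arrays.asList(task1, task2));
        System.out.println("inserts=" + job.get(InsertCounter.INSERT_COUNT)
                + " nanos=" + job.get(InsertCounter.INSERT_NANOS));
    }
}
```

The point of the sketch is that the application, not YARN, does the
counting and timing; the framework only collects what the job reports.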

Regards,
+ Naga

On Tue, Jul 25, 2017 at 7:12 PM, Sunil Govind <su...@gmail.com>
wrote:

> Hi Rajila,
>
> From the YARN side, you will be able to get detailed information about
> the application, and that application could be MapReduce or anything
> else. But what kind of operation is done inside that MapReduce app is
> specific to that application (here, MapReduce).
>
> YARN can only give you time/memory/CPU usage with respect to the app, or
> at most at node level.
>
> Thanks
> Sunil

Re: YARN - level/depth of monitoring info - newbie question

Posted by Sunil Govind <su...@gmail.com>.
Hi Rajila,

From the YARN side, you will be able to get detailed information about
the application, and that application could be MapReduce or anything
else. But what kind of operation is done inside that MapReduce app is
specific to that application (here, MapReduce).

YARN can only give you time/memory/CPU usage with respect to the app, or
at most at node level.
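
The per-application figures mentioned above can be read from the
ResourceManager REST API. Below is a minimal sketch, assuming Java 11 or
later, a ResourceManager web address of http://rm-host:8088 (a
placeholder) and a made-up application ID; the class and method names are
hypothetical. The JSON returned for an application includes fields such
as elapsedTime, memorySeconds and vcoreSeconds, and nothing about
individual INSERTs.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class YarnAppReport {
    /** Builds the ResourceManager REST URI for a single application. */
    public static URI appUri(String rmBase, String appId) {
        // Drop a trailing slash so the path is not doubled.
        String base = rmBase.endsWith("/")
                ? rmBase.substring(0, rmBase.length() - 1)
                : rmBase;
        return URI.create(base + "/ws/v1/cluster/apps/" + appId);
    }

    /** Fetches the raw JSON report; needs a reachable ResourceManager. */
    public static String fetch(URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder defaults; pass a real RM address and application ID to query a cluster.
        String rmBase = args.length > 0 ? args[0] : "http://rm-host:8088";
        String appId = args.length > 1 ? args[1] : "application_1500000000000_0001";
        URI uri = appUri(rmBase, appId);
        System.out.println("GET " + uri);
        if (args.length > 0) {
            // Only contact the network when an address was given explicitly.
            System.out.println(fetch(uri));
        }
    }
}
```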

Thanks
Sunil

On Tue, Jul 25, 2017 at 3:46 AM rajila2008 . <ra...@gmail.com> wrote:

> Hi all,
>
> Does YARN provide application-level info?
>
> For example: a MapReduce job persists its output in a NoSQL datastore by
> executing an INSERT command. Can YARN report the execution time of that
> INSERT when the application itself is not logging it anywhere?
>
> There's some argument at my workplace: the dev team is asking
> prod-support to find such info through the YARN logs.
>
> I believe YARN's resource reporting is similar to the Unix "top" command,
> but at cluster level. "top" gives system-level info, not how many INSERTs
> a job executed. Similarly, YARN will not give application-specific info
> such as the number of INSERT operations, the record count, or an array
> size within a job; the application needs to log such info as needed.
>
> Could anyone please clarify whether my understanding is correct?
>
> Regards,
> Rajila
>

Re: YARN - level/depth of monitoring info - newbie question

Posted by Sunil Govind <su...@gmail.com>.
Hi Rajila,