Posted to dev@slider.apache.org by Sumit Mohanty <sm...@hortonworks.com> on 2014/05/13 21:58:27 UTC

Exposing log file locations for live and inactive containers

For each container being activated, the Slider agent knows the log
location. If the component instance (within the container) fails to
activate properly or goes down, the user may need access to the log
locations for debugging. *This assumes that the appropriate YARN config is
in place to keep the log files around for a while.*

What do you think about having the agent report back this information
(log locations etc.) and having it published? There needs to be a limit on
how long the information stays available, as the number of failed
containers may grow over time. An obvious limit would be the YARN
configuration for how long the logs are kept on the host.

Should we publish them as:

{container_id: {
    "hostname":"name of the host",
    "agent_log": "folder path to agent log",
    "app_log": "folder path to app log" }}

We can add more diagnostics information as needed.
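
For illustration only, a single published entry might look like the
following (the container id, hostname, and paths are made-up placeholders;
the real values depend on how the YARN log directories are configured on
each host):

{"container_1400000000000_0001_01_000002": {
    "hostname": "node1.example.com",
    "agent_log": "/hadoop/yarn/log/application_1400000000000_0001/container_1400000000000_0001_01_000002/agent",
    "app_log": "/hadoop/yarn/log/application_1400000000000_0001/container_1400000000000_0001_01_000002/app" }}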

Any suggestion on a good location to publish this?

-Sumit


Re: Exposing log file locations for live and inactive containers

Posted by Jon Maron <jm...@hortonworks.com>.
On May 13, 2014, at 5:17 PM, Jon Maron <jm...@hortonworks.com> wrote:

> 
> On May 13, 2014, at 4:43 PM, Sumit Mohanty <sm...@hortonworks.com> wrote:
> 
>> Makes sense to add the component name.
>> 
>> {container_id: {
>>   *"componentname":"name of the component",*
>>   "hostname":"name of the host",
>>   "agent_log": "folder path to agent log",
>>   "app_log": "folder path to app log" }}
>> 
>> These are local paths on the host and hence are transient. We can also add
>> the location in HDFS if that's available.
> 
> So the expectation is that the user will have access to the host box on which the container is running and will log in to retrieve the information?
> 
>> 
>> /ws/v1/slider/diagnostics/container_id seems fine as long as
>> /ws/v1/slider/diagnostics
>> can list all the containers.
> 
> probably should be /ws/v1/slider/diagnostics/appName to retrieve the above structure for the given application’s containers.

Never mind.  Not necessary given the interaction with the specific application’s AM.

> 
>> 
>> Not sure if we need to make it available at root. Might be too busy.
>> 
>> -Sumit
>> 
>> 
>> On Tue, May 13, 2014 at 1:27 PM, Jon Maron <jm...@hortonworks.com> wrote:
>> 
>>> 
>>> On May 13, 2014, at 3:58 PM, Sumit Mohanty <sm...@hortonworks.com>
>>> wrote:
>>> 
>>>> For each container being activated, the Slider agent knows the log
>>>> location. If the component instance (within the container) fails to
>>>> activate properly or goes down, the user may need access to the log
>>>> locations for debugging. *This assumes that the appropriate YARN config is
>>>> in place to keep the log files around for a while.*
>>>> 
>>>> What do you think about having the agent report back this information
>>>> (log locations etc.) and having it published? There needs to be a limit on
>>>> how long the information stays available, as the number of failed
>>>> containers may grow over time. An obvious limit would be the YARN
>>>> configuration for how long the logs are kept on the host.
>>>> 
>>>> Should we publish them as:
>>>> 
>>>> {container_id: {
>>>>  "hostname":"name of the host",
>>>>  "agent_log": "folder path to agent log",
>>>>  "app_log": "folder path to app log" }}
>>> 
>>> - would probably help to add a “component:” or “role:” property to ease
>>> correlation with the given component
>>> - are the paths in HDFS or do they reference an exposed endpoint that
>>> allows for the download of the logs (e.g. http://host:port
>>> /ws/v1/slider/diagnostics/someLogName)?
>>> 
>>>> 
>>>> We can add more diagnostics information as needed.
>>>> 
>>>> Any suggestion on a good location to publish this?
>>> 
>>> Perhaps a root resource that lists the information for the cluster is
>>> where you’d start:  /ws/v1/slider/diagnostics.  Individual entries (with
>>> perhaps more detail) could be retrieved as
>>> /ws/v1/slider/diagnostics/container_id (these hrefs should probably be
>>> available as properties in the root resource)
>>> 
>>>> 
>>>> -Sumit

Re: Exposing log file locations for live and inactive containers

Posted by Jon Maron <jm...@hortonworks.com>.
On May 13, 2014, at 4:43 PM, Sumit Mohanty <sm...@hortonworks.com> wrote:

> Makes sense to add the component name.
> 
> {container_id: {
>    *"componentname":"name of the component",*
>    "hostname":"name of the host",
>    "agent_log": "folder path to agent log",
>    "app_log": "folder path to app log" }}
> 
> These are local paths on the host and hence are transient. We can also add
> the location in HDFS if that's available.

So the expectation is that the user will have access to the host box on which the container is running and will log in to retrieve the information?

> 
> /ws/v1/slider/diagnostics/container_id seems fine as long as
> /ws/v1/slider/diagnostics
> can list all the containers.

probably should be /ws/v1/slider/diagnostics/appName to retrieve the above structure for the given application’s containers.
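
For example (the application name and container id here are just
placeholders), GET /ws/v1/slider/diagnostics/myAppName could return the map
proposed above:

{"container_1400000000000_0001_01_000002": {
    "componentname": "...", "hostname": "...", "agent_log": "...", "app_log": "..." }}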

> 
> Not sure if we need to make it available at root. Might be too busy.
> 
> -Sumit
> 
> 
> On Tue, May 13, 2014 at 1:27 PM, Jon Maron <jm...@hortonworks.com> wrote:
> 
>> 
>> On May 13, 2014, at 3:58 PM, Sumit Mohanty <sm...@hortonworks.com>
>> wrote:
>> 
>>> For each container being activated, the Slider agent knows the log
>>> location. If the component instance (within the container) fails to
>>> activate properly or goes down, the user may need access to the log
>>> locations for debugging. *This assumes that the appropriate YARN config is
>>> in place to keep the log files around for a while.*
>>> 
>>> What do you think about having the agent report back this information
>>> (log locations etc.) and having it published? There needs to be a limit on
>>> how long the information stays available, as the number of failed
>>> containers may grow over time. An obvious limit would be the YARN
>>> configuration for how long the logs are kept on the host.
>>> 
>>> Should we publish them as:
>>> 
>>> {container_id: {
>>>   "hostname":"name of the host",
>>>   "agent_log": "folder path to agent log",
>>>   "app_log": "folder path to app log" }}
>> 
>> - would probably help to add a “component:” or “role:” property to ease
>> correlation with the given component
>> - are the paths in HDFS or do they reference an exposed endpoint that
>> allows for the download of the logs (e.g. http://host:port
>> /ws/v1/slider/diagnostics/someLogName)?
>> 
>>> 
>>> We can add more diagnostics information as needed.
>>> 
>>> Any suggestion on a good location to publish this?
>> 
>> Perhaps a root resource that lists the information for the cluster is
>> where you’d start:  /ws/v1/slider/diagnostics.  Individual entries (with
>> perhaps more detail) could be retrieved as
>> /ws/v1/slider/diagnostics/container_id (these hrefs should probably be
>> available as properties in the root resource)
>> 
>>> 
>>> -Sumit

Re: Exposing log file locations for live and inactive containers

Posted by Sumit Mohanty <sm...@hortonworks.com>.
Makes sense to add the component name.

{container_id: {
    *"componentname":"name of the component",*
    "hostname":"name of the host",
    "agent_log": "folder path to agent log",
    "app_log": "folder path to app log" }}

These are local paths on the host and hence are transient. We can also add
the location in HDFS if that's available.
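
One possible shape for that (the "hdfs_log" field name and the note about
log aggregation are just suggestions, not something the agent reports today):

{container_id: {
    "componentname": "name of the component",
    "hostname": "name of the host",
    "agent_log": "local folder path to agent log",
    "app_log": "local folder path to app log",
    "hdfs_log": "HDFS path to the aggregated container logs, if log aggregation is enabled" }}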

/ws/v1/slider/diagnostics/container_id seems fine as long as
/ws/v1/slider/diagnostics
can list all the containers.

Not sure if we need to make it available at root. Might be too busy.

-Sumit


On Tue, May 13, 2014 at 1:27 PM, Jon Maron <jm...@hortonworks.com> wrote:

>
> On May 13, 2014, at 3:58 PM, Sumit Mohanty <sm...@hortonworks.com>
> wrote:
>
> > For each container being activated, the Slider agent knows the log
> > location. If the component instance (within the container) fails to
> > activate properly or goes down, the user may need access to the log
> > locations for debugging. *This assumes that the appropriate YARN config is
> > in place to keep the log files around for a while.*
> >
> > What do you think about having the agent report back this information
> > (log locations etc.) and having it published? There needs to be a limit on
> > how long the information stays available, as the number of failed
> > containers may grow over time. An obvious limit would be the YARN
> > configuration for how long the logs are kept on the host.
> >
> > Should we publish them as:
> >
> > {container_id: {
> >    "hostname":"name of the host",
> >    "agent_log": "folder path to agent log",
> >    "app_log": "folder path to app log" }}
>
> - would probably help to add a “component:” or “role:” property to ease
> correlation with the given component
> - are the paths in HDFS or do they reference an exposed endpoint that
> allows for the download of the logs (e.g. http://host:port
> /ws/v1/slider/diagnostics/someLogName)?
>
> >
> > We can add more diagnostics information as needed.
> >
> > Any suggestion on a good location to publish this?
>
> Perhaps a root resource that lists the information for the cluster is
> where you’d start:  /ws/v1/slider/diagnostics.  Individual entries (with
> perhaps more detail) could be retrieved as
> /ws/v1/slider/diagnostics/container_id (these hrefs should probably be
> available as properties in the root resource)
>
> >
> > -Sumit

Re: Exposing log file locations for live and inactive containers

Posted by Jon Maron <jm...@hortonworks.com>.
On May 13, 2014, at 3:58 PM, Sumit Mohanty <sm...@hortonworks.com> wrote:

> For each container being activated, the Slider agent knows the log
> location. If the component instance (within the container) fails to
> activate properly or goes down, the user may need access to the log
> locations for debugging. *This assumes that the appropriate YARN config is
> in place to keep the log files around for a while.*
> 
> What do you think about having the agent report back this information
> (log locations etc.) and having it published? There needs to be a limit on
> how long the information stays available, as the number of failed
> containers may grow over time. An obvious limit would be the YARN
> configuration for how long the logs are kept on the host.
> 
> Should we publish them as:
> 
> {container_id: {
>    "hostname":"name of the host",
>    "agent_log": "folder path to agent log",
>    "app_log": "folder path to app log" }}

- would probably help to add a “component:” or “role:” property to ease correlation with the given component
- are the paths in HDFS or do they reference an exposed endpoint that allows for the download of the logs (e.g. http://host:port/ws/v1/slider/diagnostics/someLogName)?

> 
> We can add more diagnostics information as needed.
> 
> Any suggestion on a good location to publish this?

Perhaps a root resource that lists the information for the cluster is where you’d start:  /ws/v1/slider/diagnostics.  Individual entries (with perhaps more detail) could be retrieved as /ws/v1/slider/diagnostics/container_id (these hrefs should probably be available as properties in the root resource)
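
A rough sketch of what that root resource might return (the ids and hrefs
below are placeholders, not a committed format):

{"containers": {
    "container_1400000000000_0001_01_000002": {
        "componentname": "name of the component",
        "hostname": "name of the host",
        "href": "http://host:port/ws/v1/slider/diagnostics/container_1400000000000_0001_01_000002" }}}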

> 
> -Sumit