Posted to dev@slider.apache.org by "David.Serafini" <Da...@target.com> on 2017/03/09 20:45:51 UTC

AM log file retention

I'm testing Slider 0.91.
The log files for the AM container disappear after a day or so (according to the YARN UI), even though the application is still running and the AM container has not been restarted.

Is this a YARN problem or a Slider problem? 
How do I fix it?

thanks,
<dbs>


Re: [EXTERNAL] Re: AM log file retention

Posted by Gour Saha <gs...@hortonworks.com>.
Sorry my bad. It is yarn.nodemanager.log.retain-seconds. Copy paste error. 

2. You see that error because the version of Hadoop you have does not show logs of a running app. 

3. Are there any other apps which are already in the stopped state? If yes, can you run the yarn logs cmd on one of them?

-Gour
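
To check a yarn-site.xml for the singular spelling (retain-second) that came up in this thread, the property names in the file can be compared against the names listed here. The following is a minimal sketch, not part of Slider or YARN: the helper names (read_properties, unknown_log_properties) and the sample XML are hypothetical, and only the property names and values are taken from the thread.

```python
import xml.etree.ElementTree as ET

# Property names as discussed in this thread; the last one uses the
# corrected plural spelling (retain-seconds).
EXPECTED = {
    "yarn.log-aggregation-enable",
    "yarn.log-aggregation.retain-seconds",
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds",
    "yarn.nodemanager.log-aggregation.debug-enabled",
    "yarn.nodemanager.log-aggregation.num-log-files-per-app",
    "yarn.nodemanager.log.retain-seconds",
}

# Hypothetical excerpt of a yarn-site.xml, for illustration only.
SAMPLE = """<configuration>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.log-aggregation.retain-seconds</name><value>2592000</value></property>
  <property><name>yarn.nodemanager.log.retain-second</name><value>604800</value></property>
</configuration>"""


def read_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        props[name] = value
    return props


def unknown_log_properties(props):
    """Return log-related property names that are not in the EXPECTED set."""
    return sorted(n for n in props if "log" in n and n not in EXPECTED)


if __name__ == "__main__":
    for name in unknown_log_properties(read_properties(SAMPLE)):
        print("unrecognized property name:", name)
```

Pointing read_properties at the contents of the real yarn-site.xml instead of SAMPLE would run the same check against the cluster's configuration.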


> On Mar 13, 2017, at 2:27 PM, David.Serafini <Da...@target.com> wrote:
> 
> 1. the AM container has an empty log dir. The timestamp on the dir is the time when the job was launched.
> The app container (on a different node) has some files in the log dir (command*json, errors*txt, output*txt, status_command*, slider-agent*).   Nothing helpful.
> 
> 2. that command throws an error:  See below.  If I add -show_application_log_info, it shows the two containers  (AM and app) and a message: Application State: Running.
> 
> 3. I'll try it later. I don't want to stop the app right now.
> 
> 4. in yarn-site.xml, 
>    yarn.log-aggregation-enable is true
>    yarn.log-aggregation.retain-seconds is 2592000
>    yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is 3600
>    yarn.nodemanager.log-aggregation.debug-enabled is false
>    yarn.nodemanager.log-aggregation.num-log-files-per-app is 30
>    yarn.nodemanager.log.retain-second is 604800   #(is this a typo? Should it be retain-seconds (plural)?)
> 
> 
> The "yarn logs -applicationId ..." error is:
> Unable to parse json from webservice. Error:
> java.lang.Exception: Error parsing JSON object.
> Exception in thread "main" java.io.IOException: javax.ws.rs.WebApplicationException: java.lang.Exception: Error parsing JSON object.
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:439)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1207)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsFromRunningApplication(LogsCLI.java:469)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:979)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:300)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:107)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:327)
> Caused by: javax.ws.rs.WebApplicationException: java.lang.Exception: Error parsing JSON object.
>    at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider.readFrom(JSONObjectProvider.java:93)
>    at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$App.readFrom(JSONObjectProvider.java:65)
>    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553)
>    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
>    at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:428)
>    ... 6 more
> Caused by: java.lang.Exception: Error parsing JSON object.
>    ... 11 more
> Caused by: org.codehaus.jettison.json.JSONException: A JSONObject text must begin with '{' at character 1 of null
>    at org.codehaus.jettison.json.JSONTokener.syntaxError(JSONTokener.java:439)
>    at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:169)
>    at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:266)
>    at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider.readFrom(JSONObjectProvider.java:91)
>    ... 10 more
> 
> 

Re: [EXTERNAL] Re: AM log file retention

Posted by "David.Serafini" <Da...@target.com>.
1. the AM container has an empty log dir. The timestamp on the dir is the time when the job was launched.
The app container (on a different node) has some files in the log dir (command*json, errors*txt, output*txt, status_command*, slider-agent*).   Nothing helpful.

2. that command throws an error:  See below.  If I add -show_application_log_info, it shows the two containers  (AM and app) and a message: Application State: Running.

3. I'll try it later. I don't want to stop the app right now.

4. in yarn-site.xml, 
    yarn.log-aggregation-enable is true
    yarn.log-aggregation.retain-seconds is 2592000
    yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is 3600
    yarn.nodemanager.log-aggregation.debug-enabled is false
    yarn.nodemanager.log-aggregation.num-log-files-per-app is 30
    yarn.nodemanager.log.retain-second is 604800   #(is this a typo? Should it be retain-seconds (plural)?)
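
For reference, the retention values above are given in seconds. A small sketch that converts them to days and hours (the function name describe_seconds is hypothetical; the values are the ones listed above):

```python
def describe_seconds(seconds):
    """Render a number of seconds as days, hours, minutes, and seconds."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    parts = []
    for amount, unit in ((days, "d"), (hours, "h"), (minutes, "m"), (secs, "s")):
        if amount:
            parts.append(f"{amount}{unit}")
    return " ".join(parts) or "0s"


# Values as reported for this cluster in the message above.
settings = {
    "yarn.log-aggregation.retain-seconds": 2592000,
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds": 3600,
    "yarn.nodemanager.log.retain-seconds": 604800,
}

for name, value in settings.items():
    print(f"{name} = {value} s ({describe_seconds(value)})")
```

So the aggregated-log retention is 30 days, the roll-monitoring interval is 1 hour, and the NodeManager log retention is 7 days. All three are longer than or unrelated to the "day or so" after which the AM logs were observed to disappear.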


The "yarn logs -applicationId ..." error is:
Unable to parse json from webservice. Error:
java.lang.Exception: Error parsing JSON object.
Exception in thread "main" java.io.IOException: javax.ws.rs.WebApplicationException: java.lang.Exception: Error parsing JSON object.
	at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:439)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1207)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsFromRunningApplication(LogsCLI.java:469)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:979)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:300)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:107)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:327)
Caused by: javax.ws.rs.WebApplicationException: java.lang.Exception: Error parsing JSON object.
	at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider.readFrom(JSONObjectProvider.java:93)
	at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$App.readFrom(JSONObjectProvider.java:65)
	at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553)
	at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
	at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:428)
	... 6 more
Caused by: java.lang.Exception: Error parsing JSON object.
	... 11 more
Caused by: org.codehaus.jettison.json.JSONException: A JSONObject text must begin with '{' at character 1 of null
	at org.codehaus.jettison.json.JSONTokener.syntaxError(JSONTokener.java:439)
	at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:169)
	at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:266)
	at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider.readFrom(JSONObjectProvider.java:91)
	... 10 more


On 3/13/17, 1:50 PM, <gs...@hortonworks.com> wrote:

    Please provide some additional info -
    
    1. Can you log in to the AM container node and look under the container log
    dir and see if the log files are there?
    
    2. If you don't see the log files in step 1 above, can you run the below
    yarn cmd-line? Do you see the logs?
    yarn logs -applicationId <app_id>
    
    3. If you don't see any logs in step 2 above, then if possible can you
    stop the app and then run the cmd in step 2 again? Do you see the logs?
    
    4. What are the following properties set to in your cluster?
    yarn.log-aggregation-enable
    yarn.log-aggregation.retain-seconds
    yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
    yarn.nodemanager.log-aggregation.debug-enabled
    yarn.nodemanager.log-aggregation.num-log-files-per-app
    
    -Gour
    



Re: [EXTERNAL] Re: AM log file retention

Posted by Gour Saha <gs...@hortonworks.com>.
Please provide some additional info -

1. Can you log in to the AM container node and look under the container log
dir and see if the log files are there?

2. If you don't see the log files in step 1 above, can you run the below
yarn cmd-line? Do you see the logs?
yarn logs -applicationId <app_id>

3. If you don't see any logs in step 2 above, then if possible can you
stop the app and then run the cmd in step 2 again? Do you see the logs?

4. What are the following properties set to in your cluster?
yarn.log-aggregation-enable
yarn.log-aggregation.retain-seconds
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
yarn.nodemanager.log-aggregation.debug-enabled
yarn.nodemanager.log-aggregation.num-log-files-per-app

-Gour

On 3/9/17, 2:01 PM, "David.Serafini" <Da...@target.com> wrote:

>hortonworks 2.5.3.   $(hadoop version) says:
> 
>Hadoop 2.7.3.2.5.3.0-37
>Subversion git@github.com:hortonworks/hadoop.git -r
>9828acfdec41a121f0121f556b09e2d112259e92
>Compiled by jenkins on 2016-11-29T18:06Z
>Compiled with protoc 2.5.0


Re: [EXTERNAL] Re: AM log file retention

Posted by "David.Serafini" <Da...@target.com>.
hortonworks 2.5.3.   $(hadoop version) says:
 
Hadoop 2.7.3.2.5.3.0-37
Subversion git@github.com:hortonworks/hadoop.git -r 9828acfdec41a121f0121f556b09e2d112259e92
Compiled by jenkins on 2016-11-29T18:06Z
Compiled with protoc 2.5.0
  

On 3/9/17, 1:31 PM, <gs...@hortonworks.com> wrote:

    Which version of hadoop are you using?
    
    -Gour
    


Re: AM log file retention

Posted by Gour Saha <gs...@hortonworks.com>.
Which version of hadoop are you using?

-Gour

> On Mar 9, 2017, at 12:46 PM, David.Serafini <Da...@target.com> wrote:
> 
> I'm testing Slider 0.91.
> The log files for the AM container disappear after a day or so (according to the YARN UI), even though the application is still running and the AM container has not been restarted.
> 
> Is this a YARN problem or a Slider problem? 
> How do I fix it?
> 
> thanks,
> <dbs>
>