Posted to dev@slider.apache.org by "David.Serafini" <Da...@target.com> on 2017/03/09 20:45:51 UTC
AM log file retention
I'm testing Slider 0.91.
The log files for the AM container disappear after a day or so (according to the YARN UI), even though the application is still running and the AM container is not restarted.
Is this a YARN problem or a Slider problem?
How do I fix it?
thanks,
<dbs>
Re: [EXTERNAL] Re: AM log file retention
Posted by Gour Saha <gs...@hortonworks.com>.
Sorry, my bad. It is yarn.nodemanager.log.retain-seconds. Copy-paste error.
2. You see that error because the version of Hadoop you have does not show logs of a running app.
3. Are there any other apps that are already in the stopped state? If so, can you run the yarn logs command on one of them?
-Gour
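A one-character slip such as retain-second versus retain-seconds is easy to miss in a long property name. The following is a minimal Python sketch, not part of YARN or Slider, that compares a name against a list of expected names and suggests the closest one. The list contains only the property names mentioned in this thread; the function name suggest_property and the 0.9 similarity cutoff are assumptions made for illustration.

```python
import difflib

# Property names mentioned in this thread; extend the list for your own cluster.
KNOWN_PROPERTIES = [
    "yarn.log-aggregation-enable",
    "yarn.log-aggregation.retain-seconds",
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds",
    "yarn.nodemanager.log-aggregation.debug-enabled",
    "yarn.nodemanager.log-aggregation.num-log-files-per-app",
    "yarn.nodemanager.log.retain-seconds",
]


def suggest_property(name, known=KNOWN_PROPERTIES):
    """Return name if it is known, else the closest known name, else None."""
    if name in known:
        return name
    # A high cutoff keeps only near-identical names (hypothetical threshold).
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.9)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(suggest_property("yarn.nodemanager.log.retain-second"))
```

A name that is not close to anything in the list returns None, so the check only flags likely typos and does not guess at unrelated properties.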
Re: [EXTERNAL] Re: AM log file retention
Posted by "David.Serafini" <Da...@target.com>.
1. The AM container has an empty log dir. The timestamp on the dir is the time when the job was launched.
The app container (on a different node) has some files in its log dir (command*json, errors*txt, output*txt, status_command*, slider-agent*). Nothing helpful.
2. That command throws an error; see below. If I add -show_application_log_info, it shows the two containers (AM and app) and the message "Application State: Running".
3. I'll try it later. I don't want to stop the app right now.
4. In yarn-site.xml:
yarn.log-aggregation-enable is true
yarn.log-aggregation.retain-seconds is 2592000
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is 3600
yarn.nodemanager.log-aggregation.debug-enabled is false
yarn.nodemanager.log-aggregation.num-log-files-per-app is 30
yarn.nodemanager.log.retain-second is 604800 (is this a typo? Should it be retain-seconds, plural?)
The "yarn logs -applicationId ..." error is:
Unable to parse json from webservice. Error:
java.lang.Exception: Error parsing JSON object.
Exception in thread "main" java.io.IOException: javax.ws.rs.WebApplicationException: java.lang.Exception: Error parsing JSON object.
    at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:439)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1207)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsFromRunningApplication(LogsCLI.java:469)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:979)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:300)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:107)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:327)
Caused by: javax.ws.rs.WebApplicationException: java.lang.Exception: Error parsing JSON object.
    at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider.readFrom(JSONObjectProvider.java:93)
    at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$App.readFrom(JSONObjectProvider.java:65)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:428)
    ... 6 more
Caused by: java.lang.Exception: Error parsing JSON object.
    ... 11 more
Caused by: org.codehaus.jettison.json.JSONException: A JSONObject text must begin with '{' at character 1 of null
    at org.codehaus.jettison.json.JSONTokener.syntaxError(JSONTokener.java:439)
    at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:169)
    at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:266)
    at com.sun.jersey.json.impl.provider.entity.JSONObjectProvider.readFrom(JSONObjectProvider.java:91)
    ... 10 more
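For reference, the second-based values reported in item 4 of this message are easier to compare once converted to durations. The Python sketch below only restates those numbers; the dictionary and the helper name describe are illustrative. Note also that the description of yarn.nodemanager.log.retain-seconds in yarn-default.xml says it is only applicable if log aggregation is disabled, so with yarn.log-aggregation-enable set to true it may not be the setting that governs what is seen here.

```python
from datetime import timedelta

# Values as reported in this thread (seconds).
RETENTION_SECONDS = {
    "yarn.log-aggregation.retain-seconds": 2592000,
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds": 3600,
    "yarn.nodemanager.log.retain-seconds": 604800,
}


def describe(seconds):
    """Render a number of seconds as a human-readable duration string."""
    return str(timedelta(seconds=seconds))


if __name__ == "__main__":
    for name, seconds in RETENTION_SECONDS.items():
        print(f"{name}: {seconds} s = {describe(seconds)}")
```

That is 30 days for aggregated-log retention, 1 hour for the roll-monitoring interval, and 7 days for the NodeManager retention value, none of which matches the "day or so" after which the AM logs were reported to disappear.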
Re: [EXTERNAL] Re: AM log file retention
Posted by Gour Saha <gs...@hortonworks.com>.
Please provide some additional info:
1. Can you log in to the AM container node, look under the container log dir, and see if the log files are there?
2. If you don't see the log files in step 1, can you run the yarn command below? Do you see the logs?
yarn logs -applicationId <app_id>
3. If you don't see any logs in step 2, then, if possible, can you stop the app and run the command from step 2 again? Do you see the logs?
4. What are the following properties set to in your cluster?
yarn.log-aggregation-enable
yarn.log-aggregation.retain-seconds
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
yarn.nodemanager.log-aggregation.debug-enabled
yarn.nodemanager.log-aggregation.num-log-files-per-app
-Gour
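The property values asked for in step 4 can be read directly from yarn-site.xml rather than looked up by hand. Below is a minimal Python sketch, assuming the usual Hadoop configuration layout (a configuration element containing property elements with name and value children). The function name read_properties and the inline SAMPLE document are illustrative; the sample values are the ones reported elsewhere in this thread. On a real cluster you would pass in the contents of your own yarn-site.xml, whose location varies by installation.

```python
import xml.etree.ElementTree as ET

# The five properties asked about in step 4.
WANTED = [
    "yarn.log-aggregation-enable",
    "yarn.log-aggregation.retain-seconds",
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds",
    "yarn.nodemanager.log-aggregation.debug-enabled",
    "yarn.nodemanager.log-aggregation.num-log-files-per-app",
]

# Illustrative stand-in for a real yarn-site.xml (one property left out on purpose).
SAMPLE = """<configuration>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.log-aggregation.retain-seconds</name><value>2592000</value></property>
  <property><name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name><value>3600</value></property>
  <property><name>yarn.nodemanager.log-aggregation.debug-enabled</name><value>false</value></property>
</configuration>"""


def read_properties(xml_text, wanted):
    """Map each wanted property name to its value, or None if it is not set."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in wanted:
            found[name] = (prop.findtext("value") or "").strip()
    return {name: found.get(name) for name in wanted}


if __name__ == "__main__":
    for name, value in read_properties(SAMPLE, WANTED).items():
        print(f"{name} = {value if value is not None else '(not set in this file)'}")
```

A property reported as not set is not necessarily unset on the cluster; it may simply be falling back to its default from yarn-default.xml.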
Re: [EXTERNAL] Re: AM log file retention
Posted by "David.Serafini" <Da...@target.com>.
Hortonworks 2.5.3. $(hadoop version) says:
Hadoop 2.7.3.2.5.3.0-37
Subversion git@github.com:hortonworks/hadoop.git -r 9828acfdec41a121f0121f556b09e2d112259e92
Compiled by jenkins on 2016-11-29T18:06Z
Compiled with protoc 2.5.0
Re: AM log file retention
Posted by Gour Saha <gs...@hortonworks.com>.
Which version of Hadoop are you using?
-Gour