Posted to yarn-issues@hadoop.apache.org by "Nikhil Mulley (JIRA)" <ji...@apache.org> on 2015/03/27 04:32:53 UTC

[jira] [Commented] (YARN-3403) Nodemanager dies after a small typo in mapred-site.xml is induced

    [ https://issues.apache.org/jira/browse/YARN-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383285#comment-14383285 ] 

Nikhil Mulley commented on YARN-3403:
-------------------------------------

Here is the fuller stack trace; the problem is reproducible.

---
2015-03-26 20:04:43,690 FATAL org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
org.xml.sax.SAXParseException; systemId: file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 316; columnNumber: 3; The element type "property" must be terminated by the matching end-tag "</property>".
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2171)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2242)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:858)
        at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:877)
        at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1278)
        at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:65)
        at org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibCompressorType(ZlibFactory.java:82)
        at org.apache.hadoop.io.compress.DefaultCodec.getCompressorType(DefaultCodec.java:74)
        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
        at org.apache.hadoop.io.file.tfile.Compression$Algorithm.getCompressor(Compression.java:274)
        at org.apache.hadoop.io.file.tfile.BCFile$Writer$WBlockState.<init>(BCFile.java:129)
        at org.apache.hadoop.io.file.tfile.BCFile$Writer.prepareDataBlock(BCFile.java:430)
        at org.apache.hadoop.io.file.tfile.TFile$Writer.initDataBlock(TFile.java:642)
        at org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:533)
        at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.writeVersion(AggregatedLogFormat.java:276)
        at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.<init>(AggregatedLogFormat.java:272)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:166)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:140)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:354)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-03-26 20:04:43,691 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Aggregation did not complete for application application_1426202183036_103251
2015-03-26 20:04:43,691 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[LogAggregationService #2,5,main] threw an Throwable, but we are shutting down, so ignoring this
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 316; columnNumber: 3; The element type "property" must be terminated by the matching end-tag "</property>".
---

> Nodemanager dies after a small typo in mapred-site.xml is induced
> -----------------------------------------------------------------
>
>                 Key: YARN-3403
>                 URL: https://issues.apache.org/jira/browse/YARN-3403
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Nikhil Mulley
>            Priority: Critical
>
> Hi,
> We have noticed that a small typo in an XML config file (mapred-site.xml) can bring the nodemanager down completely, without anyone stopping or restarting it externally.
> I find it a little weird that editing the config files on the filesystem could cause the running slave daemon (the YARN nodemanager) to shut down.
> In this case, I had missed the '/' of an end tag in a property, and that caused the nodemanager to go down in a cluster.
> Why would the nodemanager reload the configs while it is running? Aren't they picked up when it is started? Even if it is designed to pick up new configs dynamically, I think an xmllint/config checker should run before the nodemanager is asked to reload/restart.
>  
> ---
> java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 228; columnNumber: 3; The element type "value" must be terminated by the matching end-tag "</value>".
>        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
> ---
> Please shed light on this.
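
The config check suggested in the description could be approximated outside Hadoop with a small well-formedness test that runs before an edited file is copied into /etc/hadoop/conf. The following is only a sketch under stated assumptions: the class name ConfWellFormedCheck and its check method are hypothetical, it uses the JDK's own javax.xml.parsers API rather than any Hadoop code, it assumes the files are UTF-8, and it verifies XML well-formedness only, not whether the property names or values are meaningful to Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Hadoop: checks that a config file is
// well-formed XML before it is put in place.
class ConfWellFormedCheck {

    // Returns null when the XML is well-formed, otherwise a short
    // description of the first fatal parse error.
    static String check(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing its own message to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber()
                + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    // Usage: java ConfWellFormedCheck mapred-site.xml [more files...]
    // Exits with status 1 if any file fails the check.
    public static void main(String[] args) throws IOException {
        boolean failed = false;
        for (String path : args) {
            // Assumption: config files are UTF-8 encoded.
            String xml = new String(
                Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
            String error = check(xml);
            if (error == null) {
                System.out.println(path + ": OK");
            } else {
                System.out.println(path + ": " + error);
                failed = true;
            }
        }
        if (failed) {
            System.exit(1);
        }
    }
}
```

Run against a file with a missing end tag, as in the traces here, this should report a parse error with a line and column number and exit non-zero, so a deployment script can refuse to install the file.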


