You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Xiao Chen (JIRA)" <ji...@apache.org> on 2017/08/02 21:57:02 UTC
[jira] [Created] (HADOOP-14727) Socket not closed properly when
reading Configurations with BlockReaderRemote
Xiao Chen created HADOOP-14727:
----------------------------------
Summary: Socket not closed properly when reading Configurations with BlockReaderRemote
Key: HADOOP-14727
URL: https://issues.apache.org/jira/browse/HADOOP-14727
Project: Hadoop Common
Issue Type: Bug
Components: conf
Affects Versions: 3.0.0-alpha4, 2.9.0
Reporter: Xiao Chen
Priority: Blocker
This is caught by Cloudera's internal testing over the alpha3 release.
We got report that some hosts ran out of FDs. Triaging that, found out both oozie server and Yarn JobHistoryServer have tons of sockets on {{CLOSE_WAIT}} state.
[~haibochen] helped then narrow down a consistent reproduction by simply visiting the JHS webui, and clicking through a job and its logs.
I then look at the {{BlockReaderRemote}} and related code. After adding a debug log whenever a {{Peer}} is created/closed/in/out {{PeerCache}}, it looks like all the {{CLOSE_WAIT}} sockets are created from this call stack:
{noformat}
2017-08-02 13:58:59,901 INFO org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: ____ associated peer NioInetPeer(Socket[addr=/10.17.196.28,port=20002,localport=42512]) with blockreader org.apache.hadoop.hdfs.client.impl.BlockReaderRemote@717ce109
java.lang.Exception: test
at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:745)
at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:385)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:636)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:566)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:807)
at java.io.DataInputStream.read(DataInputStream.java:149)
at com.ctc.wstx.io.StreamBootstrapper.ensureLoaded(StreamBootstrapper.java:482)
at com.ctc.wstx.io.StreamBootstrapper.resolveStreamEncoding(StreamBootstrapper.java:306)
at com.ctc.wstx.io.StreamBootstrapper.bootstrapInput(StreamBootstrapper.java:167)
at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:573)
at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:633)
at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:647)
at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:366)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2649)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2697)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2662)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2545)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1076)
at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1126)
at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1344)
at org.apache.hadoop.mapreduce.counters.Limits.init(Limits.java:45)
at org.apache.hadoop.mapreduce.counters.Limits.reset(Limits.java:130)
at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.loadFullHistoryData(CompletedJob.java:363)
at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.<init>(CompletedJob.java:105)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.loadJob(HistoryFileManager.java:473)
at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.loadJob(CachedHistoryStorage.java:180)
at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.access$000(CachedHistoryStorage.java:52)
at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:103)
at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:100)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
at com.google.common.cache.LocalCache$LocalManualCache.getUnchecked(LocalCache.java:4834)
at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:193)
at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:220)
at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.requireJob(AppController.java:416)
at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.attempts(AppController.java:277)
at org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.attempts(HsController.java:152)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:162)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1552)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
{noformat}
I was able to further confirm this theory by backing out the 4 recent commits to {{Configuration}} on alpha3 and no longer seeing {{CLOSE_WAIT}} sockets.
- HADOOP-14501.
- HADOOP-14399.
- HADOOP-14216. Addendum
- HADOOP-14216.
It's not clear to me who's responsible to close the InputStream though.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org