Posted to common-issues@hadoop.apache.org by "anishek (JIRA)" <ji...@apache.org> on 2017/02/06 06:42:42 UTC
[jira] [Commented] (HADOOP-12404) Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.
[ https://issues.apache.org/jira/browse/HADOOP-12404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853591#comment-15853591 ]
anishek commented on HADOOP-12404:
----------------------------------
[~zxu] you mentioned
{quote}
So we saw the following exception in a heavily loaded system, in a rare situation, which caused a Hive job to fail
{quote}
Can you please tell me why this would occur only rarely, and only on a heavily loaded system? Shouldn't it happen every time?
> Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-12404
> URL: https://issues.apache.org/jira/browse/HADOOP-12404
> Project: Hadoop Common
> Issue Type: Improvement
> Components: conf
> Reporter: zhihai xu
> Assignee: zhihai xu
> Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HADOOP-12404.000.patch
>
>
> Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.
> Currently {{Configuration#parse}} will call {{url.openStream}} to get the InputStream for {{DocumentBuilder}} to parse.
> Based on the JDK source code, the calling sequence is
> url.openStream => [handler.openConnection.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/Handler.java] => [new JarURLConnection|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarURLConnection.java#JarURLConnection] => JarURLConnection.connect => [factory.get(getJarFileURL(), getUseCaches())|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarFileFactory.java] => [URLJarFile.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/URLJarFile.java#URLJarFile.getJarFile%28java.net.URL%2Csun.net.www.protocol.jar.URLJarFile.URLJarFileCloseController%29]=>[JarFile.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/jar/JarFile.java#JarFile.getInputStream%28java.util.zip.ZipEntry%29]=>ZipFile.getInputStream
> If {{URLConnection#getUseCaches}} is true (the default), the URLJarFile will be shared across all users of the same URL. If the shared URLJarFile is closed by another user, every InputStream previously returned by {{URLJarFile#getInputStream}} is closed along with it, per the [documentation|http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipFile.html#getInputStream(java.util.zip.ZipEntry)].
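The sharing described above can be observed directly: with caching left at its default, two independent connections to the same {{jar:}} URL hand back the *same* JarFile instance, so closing it through one connection kills streams obtained through the other. This is a minimal sketch, not Hadoop code; the throwaway jar and entry name are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class SharedJarFileDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical throwaway jar, standing in for a config jar on the classpath.
        Path jar = Files.createTempFile("demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("core-site.xml"));
            out.write("<configuration/>".getBytes("UTF-8"));
            out.closeEntry();
        }
        URL url = new URL("jar:" + jar.toUri().toURL() + "!/core-site.xml");

        // Two independent connections, caching left at the default (true):
        JarURLConnection c1 = (JarURLConnection) url.openConnection();
        JarURLConnection c2 = (JarURLConnection) url.openConnection();
        System.out.println("shared: " + (c1.getJarFile() == c2.getJarFile()));

        // Closing the shared JarFile invalidates streams handed out through
        // the other connection -- the root cause of the "Stream closed" error.
        InputStream in = c1.getInputStream();
        c2.getJarFile().close();
        try {
            in.read();
        } catch (IOException e) {
            System.out.println("read failed (IOException)");
        }
        Files.delete(jar);
    }
}
```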
> So we saw the following exception in a heavily loaded system, in a rare situation, which caused a Hive job to fail:
> {code}
> 2014-10-21 23:44:41,856 ERROR org.apache.hadoop.hive.ql.exec.Task: Ended Job = job_1413909398487_3696 with exception 'java.lang.RuntimeException(java.io.IOException: Stream closed)'
> java.lang.RuntimeException: java.io.IOException: Stream closed
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2484)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2337)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2254)
>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:861)
>     at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2030)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
>     at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:187)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:582)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>     at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:580)
>     at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:598)
>     at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
>     at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
>     at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:919)
>     at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
>     at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
>     at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
>     at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Stream closed
>     at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
>     at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2902)
>     at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1753)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1426)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2807)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
>     at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
>     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>     at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2325)
>     at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2313)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2384)
> {code}
> Also, we can save a little memory with [JarURLConnection's caches|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarFileFactory.java#JarFileFactory.getCachedJarFile%28java.net.URL%29] disabled.
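The fix described in the title can be sketched as follows: instead of {{url.openStream}}, open the connection explicitly and disable caching before asking for the stream. This is only a minimal illustration, not the actual patch; the helper name and the throwaway jar are ours, not Hadoop's.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class NoCacheOpen {
    // Sketch of the fix (helper name is hypothetical): open the connection
    // explicitly and disable caching, rather than calling url.openStream(),
    // which for jar: URLs hands back a stream tied to a cached JarFile that
    // is shared with every other user of the same URL.
    static InputStream openStreamNoCache(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setUseCaches(false); // this caller gets a private JarFile
        return connection.getInputStream();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical throwaway jar, standing in for a jar on the classpath.
        Path jar = Files.createTempFile("demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("core-site.xml"));
            out.write("<configuration/>".getBytes("UTF-8"));
            out.closeEntry();
        }
        URL url = new URL("jar:" + jar.toUri().toURL() + "!/core-site.xml");
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(openStreamNoCache(url), "UTF-8"))) {
            System.out.println(r.readLine());
        }
        Files.delete(jar);
    }
}
```

The trade-off is that every load re-opens the jar, but resource loading in Configuration is infrequent enough that correctness wins over the cache hit.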
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org