Posted to common-issues@hadoop.apache.org by "Pavan Srinivas (JIRA)" <ji...@apache.org> on 2016/01/25 08:29:40 UTC
[jira] [Created] (HADOOP-12739) Deadlock with OrcInputFormat split
threads and Jets3t connections, since, NativeS3FileSystem does not release
connections with seek()
Pavan Srinivas created HADOOP-12739:
---------------------------------------
Summary: Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()
Key: HADOOP-12739
URL: https://issues.apache.org/jira/browse/HADOOP-12739
Project: Hadoop Common
Issue Type: Bug
Reporter: Pavan Srinivas
Recently, we came across a deadlock situation with OrcInputFormat while computing splits.
- To compute splits, ORC needs the file listing and the file sizes.
- Multiple threads are started to list the files, and if the data is located in S3, NativeS3FileSystem is used.
- NativeS3FileSystem in turn uses the JetS3t library to talk to AWS and to maintain a connection pool.
- When the number of threads from OrcInputFormat exceeds JetS3t's maximum number of connections, a deadlock occurs. Stack trace of one of the blocked threads:
{code}
"ORC_GET_SPLITS #5" daemon prio=10 tid=0x00007f8568108800 nid=0x1e29 in Object.wait() [0x00007f8565696000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000df9ed450> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518)
- locked <0x00000000df9ed450> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:370)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestGet(RestStorageService.java:929)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2007)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:1944)
at org.jets3t.service.S3Service.getObject(S3Service.java:2625)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:254)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at org.apache.hadoop.fs.s3native.$Proxy12.retrieve(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.reopen(NativeS3FileSystem.java:269)
- locked <0x00000000db01eec0> (a org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:258)
- locked <0x00000000db01eec0> (a org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream)
at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:98)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
- locked <0x00000000db01ee70> (a org.apache.hadoop.fs.FSDataInputStream)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:329)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:292)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:197)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:857)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:747)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Locked ownable synchronizers:
- <0x00000000dae7bcb8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
{code}
A complete *jstack* dump of the process is attached.
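To make the failure mode easier to reason about, here is a minimal sketch of the hold-and-wait pattern the trace suggests: each split thread already holds one pooled connection (the open stream) and then, inside seek(), asks the pool for a second one before giving the first back. This is not the Hadoop or JetS3t code. The class PoolStallSketch, the method countStalled, and the flag releaseBeforeReopen are hypothetical names, a java.util.concurrent.Semaphore stands in for the HTTP connection pool, and a 200 ms timeout replaces the unbounded wait so the sketch terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolStallSketch {

    // Returns how many workers could not get a connection for their "seek".
    // poolSize workers share a pool of poolSize connections (hypothetical model).
    public static int countStalled(int poolSize, boolean releaseBeforeReopen) {
        Semaphore pool = new Semaphore(poolSize);            // stand-in for the connection pool
        CountDownLatch allOpened = new CountDownLatch(poolSize);
        AtomicInteger stalled = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < poolSize; i++) {
            workers.execute(() -> {
                try {
                    pool.acquire();                          // "open": take the first connection
                    allOpened.countDown();
                    allOpened.await();                       // wait until every worker holds one
                    if (releaseBeforeReopen) {
                        pool.release();                      // give the old connection back first
                    }
                    // "seek": ask for a fresh connection. The real code waits forever;
                    // a timeout is used here only so the sketch finishes.
                    if (pool.tryAcquire(200, TimeUnit.MILLISECONDS)) {
                        pool.release();                      // done with the new connection
                    } else {
                        stalled.incrementAndGet();           // pool exhausted, nobody can progress
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return stalled.get();
    }

    public static void main(String[] args) {
        System.out.println("hold-and-wait stalled: " + countStalled(4, false));
        System.out.println("release-first stalled: " + countStalled(4, true));
    }
}
```

With four workers and four connections, the hold-and-wait variant leaves every worker stalled, while the release-first variant lets all of them proceed. That is only an illustration of why releasing the connection before reopening would avoid the hang. Whether that is the right fix for NativeS3FileSystem is a separate question.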