Posted to common-issues@hadoop.apache.org by "Pavan Srinivas (JIRA)" <ji...@apache.org> on 2016/01/25 08:29:40 UTC

[jira] [Created] (HADOOP-12739) Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()

Pavan Srinivas created HADOOP-12739:
---------------------------------------

             Summary: Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()
                 Key: HADOOP-12739
                 URL: https://issues.apache.org/jira/browse/HADOOP-12739
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Pavan Srinivas


We recently hit a deadlock in OrcInputFormat while it was computing splits.

- To compute splits, ORC needs the file listing and the file sizes.
- Multiple threads are started to list the files; if the data is located in S3, NativeS3FileSystem is used.
- NativeS3FileSystem in turn uses the JetS3t library to talk to AWS and to maintain a connection pool.
- When the number of OrcInputFormat threads exceeds JetS3t's maximum number of connections, a deadlock occurs. Stack trace:

{code}
"ORC_GET_SPLITS #5" daemon prio=10 tid=0x00007f8568108800 nid=0x1e29 in Object.wait() [0x00007f8565696000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000df9ed450> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
	at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518)
	- locked <0x00000000df9ed450> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
	at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
	at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:370)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestGet(RestStorageService.java:929)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2007)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:1944)
	at org.jets3t.service.S3Service.getObject(S3Service.java:2625)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:254)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at org.apache.hadoop.fs.s3native.$Proxy12.retrieve(Unknown Source)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.reopen(NativeS3FileSystem.java:269)
	- locked <0x00000000db01eec0> (a org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:258)
	- locked <0x00000000db01eec0> (a org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream)
	at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:98)
	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
	- locked <0x00000000db01ee70> (a org.apache.hadoop.fs.FSDataInputStream)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:329)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:292)
	at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:197)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:857)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.run(OrcInputFormat.java:747)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)

   Locked ownable synchronizers:
	- <0x00000000dae7bcb8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

{code}

A complete *jstack* dump of the process is attached.
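
To make the failure mode concrete, here is a minimal sketch of the pattern the trace and the issue title suggest: each split thread already holds one pooled connection for its open stream, and seek() asks the pool for another one before the first is given back. This is a hypothetical model, not Hadoop or JetS3t code. A java.util.concurrent.Semaphore stands in for the connection pool, the class and method names are invented for illustration, and the second acquire uses a timeout so the demo reports starvation instead of hanging forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model, not Hadoop or JetS3t code: a Semaphore stands in for
// the HTTP connection pool, and each permit is one pooled connection.
final class PoolExhaustionSketch {

    // Starts one worker per pooled connection and returns how many workers
    // could not obtain a connection for their simulated seek().
    static int run(int poolSize, boolean releaseBeforeReopen) {
        Semaphore pool = new Semaphore(poolSize);
        CountDownLatch allOpened = new CountDownLatch(poolSize);
        CountDownLatch allAttempted = new CountDownLatch(poolSize);
        AtomicInteger starved = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < poolSize; i++) {
            workers.execute(() -> {
                try {
                    pool.acquire();          // "open": the stream holds one connection
                    allOpened.countDown();
                    allOpened.await();       // wait until the pool is fully checked out

                    if (releaseBeforeReopen) {
                        pool.release();      // give the old connection back first
                    }
                    // "seek": request a fresh connection, timing out instead of hanging
                    boolean got = pool.tryAcquire(200, TimeUnit.MILLISECONDS);
                    if (!got) {
                        starved.incrementAndGet();
                    }
                    allAttempted.countDown();
                    allAttempted.await();    // nobody cleans up until everyone has tried

                    // Return whatever this worker still holds.
                    int held = (releaseBeforeReopen ? 0 : 1) + (got ? 1 : 0);
                    pool.release(held);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        workers.shutdown();
        try {
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return starved.get();
    }

    public static void main(String[] args) {
        System.out.println("hold while reopening: starved = " + run(2, false));
        System.out.println("release, then reopen: starved = " + run(2, true));
    }
}
```

With a pool of two, the first run should report both workers starved, because every connection is held by a thread that is itself waiting for another one. The second run should report none, because each worker returns its connection before asking for a new one. In the real stack the wait in doGetConnection has no such short timeout, which is why the threads appear parked indefinitely.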



