You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by "Eric Jensen (JIRA)" <ji...@apache.org> on 2008/11/12 01:09:44 UTC

[jira] Created: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

arguments corrupted between ruby and java under concurrent load
---------------------------------------------------------------

                 Key: THRIFT-200
                 URL: https://issues.apache.org/jira/browse/THRIFT-200
             Project: Thrift
          Issue Type: Bug
          Components: Library (Java), Library (Ruby)
         Environment: latest snapshot of thrift git HEAD

running debian etch

client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]

server is running java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)

            Reporter: Eric Jensen
         Attachments: thrift_tcpdump.out

We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.

our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.

struct View {
  1: bool complete,
  2: list<i32> view = [],
  3: i32 count = -1,
}

service Cache {
View search( 1:list<list<string>> query,
	                   2:i32 requested,
	                   3:i32 offset,
	                   4:i32 sinceId,
	                   5:i32 maxId,
	                   6:bool count ) 
	                   
  async void populate( 1:list<list<string>> query,
                       2:list<i32> view,
                       3:i32 completeSinceId,
                       4:i32 maxIndexedId )
}

our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))

between 4:35:01 and 04 we get 

Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
      at java.util.ArrayList.<init>(ArrayList.java:112)
      at Cache$search_args.read(Cache.java:314)
      at Cache$Processor$search.process(Cache.java:150)
      at Cache$Processor.process(Cache.java:142)
      at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
      at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
      at java.lang.Thread.run(Thread.java:619)

which on the client side appears as:

Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090

we've also gotten another OOM, although it is not in the attached tcpdump:

Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
        at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
        at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
        at Cache$search_args.read(Cache.java:344)
        at Cache$Processor$search.process(Cache.java:165)
        at Cache$Processor.process(Cache.java:157)
        at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
        at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

and later we get two of these in the same second:

Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
SEVERE: Unexpected exception while invoking!
java.lang.NegativeArraySizeException
        at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
        at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
        at Cache$search_args.read(Cache.java:324)
        at Cache$Processor$search.process(Cache.java:150)
        at Cache$Processor.process(Cache.java:142)
        at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
        at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
SEVERE: Unexpected exception while invoking!
java.lang.NegativeArraySizeException
        at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
        at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
        at Cache$populate_args.read(Cache.java:707)
        at Cache$Processor$populate.process(Cache.java:167)
        at Cache$Processor.process(Cache.java:142)
        at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
        at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)

which on the client side results in: 

Thrift::Exception: end of file reached

We've also seen the NegativeArraySizeException in readStringBody from our populate method.

We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):

Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))

Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
SEVERE: Error occurred during processing of message.
java.lang.NegativeArraySizeException
        at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
        at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
        at Cache$search_args.read(Cache.java:344)
        at Cache$Processor$search.process(Cache.java:165)
        at Cache$Processor.process(Cache.java:157)
        at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649476#action_12649476 ] 

Bryan Duxbury commented on THRIFT-200:
--------------------------------------

So, it sounds like this issue is really about the Ruby libs doing the right thing when attempting to serialize weird stuff. I think at least it should close the connection. This would keep the server from mixing data from two different invocations.

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646779#action_12646779 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

with TNonblockingServer I get the following.  TSimpleServer still works but obviously is way too slow

Exception in thread "Thread-18" java.lang.OutOfMemoryError: Java heap space
        at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
        at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
        at Cache$search_args.read(Cache.java:324)
        at Cache$Processor$search.process(Cache.java:150)
        at Cache$Processor.process(Cache.java:142)
        at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
        at com.facebook.thrift.server.TNonblockingServer.requestInvoke(Unknown Source)
        at com.facebook.thrift.server.TNonblockingServer$SelectThread.handleRead(Unknown Source)
        at com.facebook.thrift.server.TNonblockingServer$SelectThread.select(Unknown Source)
        at com.facebook.thrift.server.TNonblockingServer$SelectThread.run(Unknown Source)

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649488#action_12649488 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

i would expect an error from thrift indicating i sent it garbage, rather than it trying to allocate arrays of arbitrary size.  perhaps type checking in ruby should be the default too?

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646965#action_12646965 ] 

Bryan Duxbury commented on THRIFT-200:
--------------------------------------

Aha, did you say timeouts? And you're reconnecting for every request?

Are the clients timing out in Rails or in the Java server? If they're timing out in Rails faster than they can be processed in the Java server, then you are consistently overloading your Java server with an impossible queue of clients. If the clients would wait until they got their turn, then the Java server would throttle the whole process. If instead, you're timing out and reconnecting again later, the Java server still has those previous clients queued up internally. This would explain the RejectedExecutionExceptions in TThreadPoolServer, because the thread pool is getting filled up. It explains the OutOfMemoryError in TNonblockingServer because the server will allocate a request buffer to each client connection until it runs out of memory.

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646969#action_12646969 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

as per above, the timeout errors appear on the client side in ruby:

Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090

but those seem to be caused by the OutOfMemoryError's in the java server (i.e., it dies and doesn't respond, so the client times out)

we're not overwhelming the server at all, as these requests are averaging about 1.5ms, so with the threading we should be able to do some number of thousands of requests per second.  

if you read through those stack traces, the OutOfMemoryError is actually the same underlying problem as the NegativeArraySizeException, it is trying to allocate an array of invalid size (either negative or huge).  it's not our jvm running out of heap in general, just that thread trying to allocate more memory than there is.  the server keeps serving requests after such an error.  

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646760#action_12646760 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

it's a rails app running in fcgid that persists the thrift connection across requests.  no threading, separate processes...can't be using the same connection concurrently AFAIK.

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647046#action_12647046 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

That was indeed the problem.  I cannot reproduce it after switching to to_s.  I'm thinking thrift should do something to protect itself against this kind of error...i would expect garbage strings in this case, not OutOfMemoryErrors

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Jensen updated THRIFT-200:
-------------------------------

    Attachment: thrift_tcpdump.out

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647044#action_12647044 ] 

Bryan Duxbury commented on THRIFT-200:
--------------------------------------

Try it with:

{{{
Thrift.type_checking = true
}}}

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646742#action_12646742 ] 

Bryan Duxbury commented on THRIFT-200:
--------------------------------------

What's your ruby client look like? Are you querying concurrently in threads or in processes? If you're using threads, are you protecting your client with mutexes? The ruby client libraries will not protect themselves from different threads interrupting each other and writing garbage mid stream.

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647044#action_12647044 ] 

bryanduxbury edited comment on THRIFT-200 at 11/12/08 12:30 PM:
-----------------------------------------------------------------

Try it with:

{code}
Thrift.type_checking = true
{code}

      was (Author: bryanduxbury):
    Try it with:

{{{
Thrift.type_checking = true
}}}
  
> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646774#action_12646774 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

more info.  when i use TThreadPoolServer i get the following:

Exception in thread "Thread-16" java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1759)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
        at com.facebook.thrift.server.TThreadPoolServer.serve(Unknown Source)

when i use TSimpleServer it works just fine


> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Eric Jensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647015#action_12647015 ] 

Eric Jensen commented on THRIFT-200:
------------------------------------

I believe I have found the cause of this.  In my rails app I do my_string.chars.downcase, because String#downcase is not safe for unicode.  Unfortunately, this returns a ActiveSupport::Multibyte::Chars object, which is duck-typed like a String, but probably does not serialize the same when it goes over thrift.  I have swtiched to doing my_string.chars.downcase.to_s and can't reproduce the problem again, yet...still testing.

If this is the problem, I'm not sure why this would only be a problem under concurrency, but certainly thrift should do better validation so that inadvertently passing it the incorrect type doesn't corrupt the entire data stream.  

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (THRIFT-200) arguments corrupted between ruby and java under concurrent load

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury resolved THRIFT-200.
----------------------------------

    Resolution: Invalid

> arguments corrupted between ruby and java under concurrent load
> ---------------------------------------------------------------
>
>                 Key: THRIFT-200
>                 URL: https://issues.apache.org/jira/browse/THRIFT-200
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java), Library (Ruby)
>         Environment: latest snapshot of thrift git HEAD
> running debian etch
> client is running ruby 1.8.5 (2006-08-25) [x86_64-linux]
> server is running java version "1.6.0"
> Java(TM) SE Runtime Environment (build 1.6.0-b105)
> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>            Reporter: Eric Jensen
>         Attachments: thrift_tcpdump.out
>
>
> We have a ruby client talking to a java server and are seeing falsely huge or negative string/array sizes causing Exceptions in the java server when it is under load.  We can't reproduce this problem when querying it in serial, but with about 10 requests per second we get this corruption of our args.  Many traces are below and I will attach a tcpdump.  Any help would be appreciated...we're reasonably stuck now.
> our thrift interface is below.  (we have had the same problems with and without the "async", although the trace we provide includes it.)  we are calling the search method, then if that doesn't find the required results calling populate.  we do set a timeout of 200ms on the sockets for both the server and client.  we could pass empty arrays and some nil values, but never nil arrays or entries in those arrays.  we have about 30 clients talking to the service during these errors.
> struct View {
>   1: bool complete,
>   2: list<i32> view = [],
>   3: i32 count = -1,
> }
> service Cache {
> View search( 1:list<list<string>> query,
> 	                   2:i32 requested,
> 	                   3:i32 offset,
> 	                   4:i32 sinceId,
> 	                   5:i32 maxId,
> 	                   6:bool count ) 
> 	                   
>   async void populate( 1:list<list<string>> query,
>                        2:list<i32> view,
>                        3:i32 completeSinceId,
>                        4:i32 maxIndexedId )
> }
> our client code is basically Thrift::BinaryProtocol.new(Thrift::FramedTransport.new(Thrift::Socket.new('www4', 9090, 200)))
> between 4:35:01 and 04 we get 
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>       at java.util.ArrayList.<init>(ArrayList.java:112)
>       at Cache$search_args.read(Cache.java:314)
>       at Cache$Processor$search.process(Cache.java:150)
>       at Cache$Processor.process(Cache.java:142)
>       at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>       at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>       at java.lang.Thread.run(Thread.java:619)
> which on the client side appears as:
> Thrift::Exception: Socket: Timed out reading 4 bytes from www4:9090
> we've also gotten another OOM, although it is not in the attached tcpdump:
> Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> and later we get two of these in the same second:
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:324)
>         at Cache$Processor$search.process(Cache.java:150)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Nov 11, 2008 4:35:32 PM com.facebook.thrift.server.TNonblockingServer$FrameBuffer invoke
> SEVERE: Unexpected exception while invoking!
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$populate_args.read(Cache.java:707)
>         at Cache$Processor$populate.process(Cache.java:167)
>         at Cache$Processor.process(Cache.java:142)
>         at com.facebook.thrift.server.TNonblockingServer$FrameBuffer.invoke(Unknown Source)
>         at com.facebook.thrift.server.THsHaServer$Invocation.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> which on the client side results in: 
> Thrift::Exception: end of file reached
> We've also seen the NegativeArraySizeException in readStringBody from our populate method.
> We also tried TThreadPoolServer with the following client code and then got the same problem (with different trace of course):
> Thrift::BinaryProtocol.new(Thrift::BufferedTransport.new(Thrift::Socket.new('www4', 9090, 200))
> Nov 11, 2008 6:49:35 PM com.facebook.thrift.server.TThreadPoolServer$WorkerProcess run
> SEVERE: Error occurred during processing of message.
> java.lang.NegativeArraySizeException
>         at com.facebook.thrift.protocol.TBinaryProtocol.readStringBody(Unknown Source)
>         at com.facebook.thrift.protocol.TBinaryProtocol.readString(Unknown Source)
>         at Cache$search_args.read(Cache.java:344)
>         at Cache$Processor$search.process(Cache.java:165)
>         at Cache$Processor.process(Cache.java:157)
>         at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> I can't confirm 100% whether the timeouts in the log are the cause or effect.  Is it possible that timing out and then re-opening the connection causes this somehow?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.