Posted to commits@cassandra.apache.org by "Vijay (JIRA)" <ji...@apache.org> on 2012/09/13 05:12:08 UTC

[jira] [Commented] (CASSANDRA-4573) HSHA doesn't handle large messages gracefully

    [ https://issues.apache.org/jira/browse/CASSANDRA-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454602#comment-13454602 ] 

Vijay commented on CASSANDRA-4573:
----------------------------------

Hi Tyler, I have not been able to reproduce it so far. I am running 2GB/400MB on AWS M4XL...

[ec2-user@ip-10-82-21-221 ~]$ grep -i ThriftServer.java /mnt/log/cassandra/system.log 
 INFO [main] 2012-09-11 21:52:43,702 ThriftServer.java (line 112) Binding thrift service to localhost/127.0.0.1:9160
 INFO [main] 2012-09-11 21:52:43,704 ThriftServer.java (line 121) Using TFastFramedTransport with a max frame size of 15728640 bytes.
 INFO [main] 2012-09-11 21:52:43,710 ThriftServer.java (line 191) Using custom half-sync/half-async thrift server on localhost/127.0.0.1 : 9160
 INFO [Thread-2] 2012-09-11 21:52:43,720 ThriftServer.java (line 200) Listening for thrift clients...
[ec2-user@ip-10-82-21-221 ~]$ 


The timeout happens on both the Sync and HSHA servers (randomly; I am not able to reproduce either case reliably), and the only thing I can notice is that the client (pycassa) runs at 100% CPU most of the time. Other than that, everything looks normal.
                
> HSHA doesn't handle large messages gracefully
> ---------------------------------------------
>
>                 Key: CASSANDRA-4573
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4573
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Vijay
>         Attachments: repro.py
>
>
> HSHA doesn't seem to enforce any kind of max message length, and when messages are too large, it doesn't fail gracefully.
> With debug logs enabled, you'll see this:
> {{DEBUG 13:13:31,805 Unexpected state 16}}
> Which seems to mean that there's a SelectionKey that's valid, but isn't ready for reading, writing, or accepting.
> Client-side, you'll get this thrift error (while trying to read a frame as part of {{recv_batch_mutate}}):
> {{TTransportException: TSocket read 0 bytes}}
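
For illustration only, here is a minimal sketch of the kind of max-message-length check the description says is missing. It is not the Cassandra or Thrift implementation. It assumes a framed transport where each message is preceded by a 4-byte big-endian length, and it reuses the 15728640-byte limit from the log output above. The names read_frame and FrameTooLargeError are hypothetical.

```python
import io
import struct

# Limit taken from the "max frame size of 15728640 bytes" log line above.
MAX_FRAME_SIZE = 15728640


class FrameTooLargeError(Exception):
    """Hypothetical error raised when a frame header announces too many bytes."""


def read_frame(stream, max_frame_size=MAX_FRAME_SIZE):
    """Read one length-prefixed frame from a file-like object.

    Assumes a 4-byte big-endian signed length followed by the payload.
    The length is checked before any payload bytes are read, so an
    oversized message fails with a clear error instead of a silent drop.
    """
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("connection closed before a full frame header arrived")

    (length,) = struct.unpack("!i", header)
    if length < 0 or length > max_frame_size:
        raise FrameTooLargeError(
            "frame of %d bytes exceeds the limit of %d bytes" % (length, max_frame_size)
        )

    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("connection closed in the middle of a frame")
    return payload


if __name__ == "__main__":
    ok = io.BytesIO(struct.pack("!i", 3) + b"abc")
    print(read_frame(ok))

    too_big = io.BytesIO(struct.pack("!i", MAX_FRAME_SIZE + 1))
    try:
        read_frame(too_big)
    except FrameTooLargeError as exc:
        print("rejected:", exc)
```

A server that rejects the frame this way can close the connection with an explicit error, rather than leaving the client with a bare "TSocket read 0 bytes".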
