Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/11/30 02:07:13 UTC

[jira] Created: (CASSANDRA-1788) reduce copies on read, write paths

reduce copies on read, write paths
----------------------------------

                 Key: CASSANDRA-1788
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
            Priority: Minor
             Fix For: 0.7.0


Currently, we do _three_ unnecessary copies (that is, writing to the socket is necessary; any other copies made are overhead) for each message:

- constructing the Message body byte[] (this is typically a call to an ICompactSerializer[2] serialize method, but sometimes we cheat, e.g. in SchemaCheckVerbHandler's reply)
- which is copied to a buffer containing the entire Message (i.e. including Header) when sendOneWay calls Message.serializer.serialize()
- which is copied to a newly-allocated ByteBuffer when sendOneWay calls packIt
- which is what we write to the socket
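As a rough illustration of the write path described above, here is a minimal sketch. The class and method names (WritePathCopies, serializeBody, serializeMessage, pack, writeDirect), the 4-byte field widths and the integer header are all assumptions made for the example; this is not Cassandra's actual code. It shows the three intermediate buffers and a variant that writes the same bytes to the stream without building them.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; these are not Cassandra's classes.
final class WritePathCopies {
    // Copy 1: the payload is serialized into its own byte[].
    static byte[] serializeBody(String payload) {
        return payload.getBytes(StandardCharsets.UTF_8);
    }

    // Copy 2: the body is copied into a buffer holding header + body.
    static byte[] serializeMessage(int headerId, byte[] body) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(headerId);      // stand-in for the serialized Header
        out.writeInt(body.length);
        out.write(body);
        out.flush();
        return bytes.toByteArray();
    }

    // Copy 3: the whole message is copied into a newly allocated ByteBuffer.
    static ByteBuffer pack(int magic, byte[] message) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + message.length);
        buf.putInt(magic);
        buf.putInt(message.length);
        buf.put(message);
        buf.flip();
        return buf;
    }

    // Alternative: write framing, header and body straight to the stream,
    // so only the body array itself is ever materialized.
    static void writeDirect(DataOutputStream out, int magic, int headerId, byte[] body)
            throws IOException {
        out.writeInt(magic);
        out.writeInt(4 + 4 + body.length); // message length, computed rather than measured
        out.writeInt(headerId);
        out.writeInt(body.length);
        out.write(body);
        out.flush();
    }
}
```

Both routes produce the same bytes on the wire; the difference is only in how many temporary arrays are allocated along the way.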

For deserialize we perform a similar orgy of copies:

- IncomingTcpConnection reads the Message length, allocates a byte[], and reads the serialized Message into it
- ITcpC then calls Message.serializer().deserialize, which allocates a new byte[] for the body and copies that part
- finally, the verbHandler (determined by the now-deserialized Message header) deserializes the actual object represented by the body
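The read side can be sketched the same way. The names below (ReadPathCopies, readBuffered, readStreaming) and the wire layout (4-byte length, 4-byte header id, 4-byte body length, body) are assumptions for illustration, not the real IncomingTcpConnection logic. The first method buffers the whole message and then copies the body out of it; the second reads the fields directly from the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration only; these are not Cassandra's classes.
final class ReadPathCopies {
    // Buffer the whole message first, then copy the body out of that buffer.
    static byte[] readBuffered(DataInputStream socketIn) throws IOException {
        int length = socketIn.readInt();
        byte[] whole = new byte[length];
        socketIn.readFully(whole);                 // copy: socket -> whole message
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(whole));
        in.readInt();                              // header id, ignored in this sketch
        byte[] body = new byte[in.readInt()];
        in.readFully(body);                        // copy: whole message -> body
        return body;
    }

    // Read the header fields straight off the stream; the body is read once.
    static byte[] readStreaming(DataInputStream socketIn) throws IOException {
        socketIn.readInt();                        // total length, no buffer allocated for it
        socketIn.readInt();                        // header id, ignored in this sketch
        byte[] body = new byte[socketIn.readInt()];
        socketIn.readFully(body);                  // copy: socket -> body
        return body;
    }
}
```

Either method leaves the handler with the same body bytes to deserialize; the streaming variant simply skips the intermediate whole-message array.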

Most of these are out of scope for 0.7 but I think we can at least elide the last copy on the write path and the first on the read.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Issue Comment Edited] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068633#comment-13068633 ] 

Brandon Williams edited comment on CASSANDRA-1788 at 7/20/11 8:58 PM:
----------------------------------------------------------------------

So, with rebased v6 the problem is even rarer now.  I now have to do reads/writes to the cluster to trigger it, where before it looked like the occasional gossip message would cause it.  And even doing reads/writes, it's still quite rare: out of the 333k (three node cluster, rf=1, 1M total) inserts/reads to the patched node, only 16 occurrences.  When the patched node is the only coordinator, it never produces an exception on reads; for writes, however, the number of exceptions increases to nearly 60 out of 1M inserts.  I suspect there is a problem in ITC or Message where it's not reading something correctly, but it is difficult to trigger. I confirmed with wireshark that the other nodes are sending correct messages.

      was (Author: brandon.williams):
    So, with rebased v6 the problem is even rarer now.  I now have to do reads/writes to the cluster to trigger it, where before it looked like the occasional gossip message would cause it.  And even doing reads/writes, it's still quite rare: out of the 333k (three node cluster, rf=1, 1M total) inserts/reads to the patched node, only 16 occurrences.  When the patched node is the only coordinator, it never produces an exception on reads; for writes, however, the number of exceptions increases to nearly 60 out of 1M inserts.  I suspect there is a problem in ITC or Message where it's not reading something correctly, but it is difficult to trigger.
  
> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-1788:
----------------------------------------

    Comment: was deleted

(was: v7 is rebased to apply to current trunk.  It magically has no problems now, but I didn't change anything.)

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Commented] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067247#comment-13067247 ] 

Brandon Williams commented on CASSANDRA-1788:
---------------------------------------------

With v5 (#2) I'm still periodically receiving bad magic in a mixed cluster:

{noformat}
ERROR 19:54:07,747 Fatal exception in thread Thread[Thread-13,5,main]
java.io.IOError: java.io.IOException: invalid protocol header
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:125)
Caused by: java.io.IOException: invalid protocol header
        at org.apache.cassandra.net.MessagingService.validateMagic(MessagingService.java:467)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:107)
{noformat}

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Commented] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079089#comment-13079089 ] 

Hudson commented on CASSANDRA-1788:
-----------------------------------

Integrated in Cassandra #1001 (See [https://builds.apache.org/job/Cassandra/1001/])
    Reduce copies on read/write paths.
Patch by jbellis reviewed by brandonwilliams for CASSANDRA-1788

brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153678
Files : 
* /cassandra/trunk/test/unit/org/apache/cassandra/streaming/SerializationsTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/MessagingService.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
* /cassandra/trunk/test/unit/org/apache/cassandra/service/SerializationsTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/Header.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/Message.java
* /cassandra/trunk/test/unit/org/apache/cassandra/net/MessageSerializer.java
* /cassandra/trunk/test/unit/org/apache/cassandra/db/SerializationsTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/CompactEndpointSerializationHelper.java
* /cassandra/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
* /cassandra/trunk/test/unit/org/apache/cassandra/net


> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788-v7.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] Commented: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976293#action_12976293 ] 

Jonathan Ellis commented on CASSANDRA-1788:
-------------------------------------------

rebased, this time committing the MS.instance encapsulation separately

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 0001-setup.txt, 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>


[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment:     (was: 1788-v5.txt)

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-1788:
----------------------------------------

    Attachment: 1788-v7.txt

v7 rebased for trunk again.  Now mysteriously has no errors.  My guess is that something in CASSANDRA-1405 fixed it.

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788-v7.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Commented] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067356#comment-13067356 ] 

Jonathan Ellis commented on CASSANDRA-1788:
-------------------------------------------

Hmm. Not skipping enough "garbage bytes" would cause this (since Java zeros out all arrays on creation).  But the node-with-patch-applied is getting lengths generated by nodes-without-patch-applied, which should be fine.

So, still baffled.
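To make the "garbage bytes" hypothesis concrete, here is a small sketch of how under-skipping a payload would surface as a bad magic number on the next message. Everything in it is assumed for illustration: the MAGIC value is a placeholder, the 4-byte length field is a guess, and validateMagic only borrows its name and exception text from the stack trace; it is not the actual MessagingService method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration only; these are not Cassandra's classes.
final class MagicAlignmentSketch {
    static final int MAGIC = 0xCAFEF00D; // placeholder value, not the real protocol magic

    static void validateMagic(int magic) throws IOException {
        if (magic != MAGIC) throw new IOException("invalid protocol header");
    }

    // Builds two back-to-back messages, consumes the first one while skipping
    // bytesSkipped payload bytes, and reports whether the next magic validates.
    static boolean nextMagicIsValid(int bytesSkipped) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        out.writeInt(MAGIC);
        out.writeInt(8);
        out.write(new byte[8]);   // first payload: eight zero bytes (Java zero-fills new arrays)
        out.writeInt(MAGIC);
        out.writeInt(0);          // second message: empty payload

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        validateMagic(in.readInt());
        in.readInt();             // declared payload length
        in.skipBytes(bytesSkipped);
        try {
            validateMagic(in.readInt());
            return true;
        } catch (IOException e) {
            return false;         // leftover payload bytes were read as the magic
        }
    }
}
```

With the full 8 bytes skipped the second magic validates; with only 4 skipped, the reader picks up four zero bytes where the magic should be and the check fails in the same way as the reported exception.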

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Issue Comment Edited] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064992#comment-13064992 ] 

Jonathan Ellis edited comment on CASSANDRA-1788 at 7/14/11 12:54 AM:
---------------------------------------------------------------------

v5 attached for trunk.

Message format should be identical to before:
<MAGIC><packheader><length of packbody><packbody>

packbody:
<id><serialized header><body length><body>

This does NOT require a message version bump.
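For readers who want to see the layout as bytes, the following is a sketch of one way to write and read such a frame. The field widths and encodings are assumptions, since the comment above names the fields but not their sizes: MAGIC is a placeholder constant, packheader and the length fields are taken to be 4-byte ints, the id is a length-prefixed UTF-8 string, and the serialized header is stood in for by a single int. The class and method names are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; field widths and encodings are assumed.
final class FramingSketch {
    static final int MAGIC = 0xCAFEF00D; // placeholder value, not the real protocol magic

    // <MAGIC><packheader><length of packbody><packbody>
    // packbody: <id><serialized header><body length><body>
    static void writeFrame(DataOutputStream out, int packHeader, String id, int header, byte[] body)
            throws IOException {
        byte[] idBytes = id.getBytes(StandardCharsets.UTF_8);
        // The packbody length is computed up front so nothing has to be buffered to measure it.
        int packBodyLength = 4 + idBytes.length + 4 + 4 + body.length;
        out.writeInt(MAGIC);
        out.writeInt(packHeader);
        out.writeInt(packBodyLength);
        out.writeInt(idBytes.length);
        out.write(idBytes);
        out.writeInt(header);       // stand-in for the serialized header
        out.writeInt(body.length);
        out.write(body);
        out.flush();
    }

    // Reads one frame from the stream and returns its body.
    static byte[] readBody(DataInputStream in) throws IOException {
        if (in.readInt() != MAGIC) throw new IOException("invalid protocol header");
        in.readInt();               // packheader, ignored in this sketch
        in.readInt();               // length of packbody, ignored in this sketch
        byte[] idBytes = new byte[in.readInt()];
        in.readFully(idBytes);
        in.readInt();               // serialized header stand-in
        byte[] body = new byte[in.readInt()];
        in.readFully(body);
        return body;
    }
}
```

Because the byte layout is unchanged and only the way it is produced differs, a reader written against the old layout would parse these frames the same way, which is consistent with not needing a version bump.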

      was (Author: jbellis):
    v5 attached for trunk.

Message format should be identical to before:
<MAGIC><packheader><length of packbody>

packbody:
<id><serialized header><body length><body>

This does NOT require a message version bump.
  
> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-1788:
----------------------------------------

    Attachment: 1788-v7.txt

v7 is rebased to apply to current trunk.  It magically has no problems now, but I didn't change anything.

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788-v7.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Issue Comment Edited] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067247#comment-13067247 ] 

Brandon Williams edited comment on CASSANDRA-1788 at 7/18/11 7:56 PM:
----------------------------------------------------------------------

With v5 (#2) I'm still periodically receiving bad magic in a mixed cluster (on the node with the patch applied):

{noformat}
ERROR 19:54:07,747 Fatal exception in thread Thread[Thread-13,5,main]
java.io.IOError: java.io.IOException: invalid protocol header
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:125)
Caused by: java.io.IOException: invalid protocol header
        at org.apache.cassandra.net.MessagingService.validateMagic(MessagingService.java:467)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:107)
{noformat}

      was (Author: brandon.williams):
    With v5 (#2) I'm still periodically receiving bad magic in a mixed cluster:

{noformat}
ERROR 19:54:07,747 Fatal exception in thread Thread[Thread-13,5,main]
java.io.IOError: java.io.IOException: invalid protocol header
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:125)
Caused by: java.io.IOException: invalid protocol header
        at org.apache.cassandra.net.MessagingService.validateMagic(MessagingService.java:467)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:107)
{noformat}
  
> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

        

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v5.txt

v5 attached for trunk.

Message format should be identical to before:
<MAGIC><packheader><length of packbody>

packbody:
<id><serialized header><body length><body>

This does NOT require a message version bump.
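As a rough illustration of the layout above, here is a minimal sketch that assembles one frame with a ByteBuffer. The class and method names are hypothetical, the MAGIC value and the 4-byte widths of packheader and the two length fields are assumptions (the comment only names the fields), and the serialized header is treated as opaque bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class FrameLayoutSketch {
    // Assumed placeholder value; the comment above only names the field <MAGIC>.
    static final int MAGIC = 0xCA552DFA;

    /** packbody: <id><serialized header><body length><body> */
    static byte[] packBody(String id, byte[] serializedHeader, byte[] body) {
        byte[] idBytes = id.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(2 + idBytes.length + serializedHeader.length + 4 + body.length);
        buf.putShort((short) idBytes.length); // assumed: 2-byte length prefix, as DataOutput.writeUTF uses
        buf.put(idBytes);
        buf.put(serializedHeader);            // opaque here; its real encoding is not shown in the thread
        buf.putInt(body.length);
        buf.put(body);
        return buf.array();
    }

    /** <MAGIC><packheader><length of packbody>, followed by the packbody itself. */
    static byte[] frame(int packHeader, byte[] packBody) {
        ByteBuffer buf = ByteBuffer.allocate(12 + packBody.length);
        buf.putInt(MAGIC).putInt(packHeader).putInt(packBody.length).put(packBody);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] pb = packBody("42", new byte[] {1, 2, 3}, new byte[] {9, 9, 9, 9, 9});
        ByteBuffer in = ByteBuffer.wrap(frame(0, pb));
        System.out.println("magic ok: " + (in.getInt() == MAGIC));
        System.out.println("packheader: " + in.getInt());
        System.out.println("packbody length: " + in.getInt());
    }
}
```

Because the field order is unchanged, a reader that consumes MAGIC, packheader, and the packbody length in that order should not be able to tell which side produced the bytes, which is the property that makes a version bump unnecessary.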

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Reviewer: brandon.williams

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v6.txt

rebased to v6

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] Commented: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965783#action_12965783 ] 

Gary Dusbabek commented on CASSANDRA-1788:
------------------------------------------

Could you save another copy on send if packIt returned a Message augmented with header information that would then be placed in the OutboundTcpConnection queue?  The consumer there would then take the message and serialize it directly to the stream.  The obvious disadvantage I see is that it shifts the serialization work onto the writer thread, which might be undesirable.
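A minimal sketch of that idea, assuming a hypothetical QueuedMessage type and a plain BlockingQueue standing in for the OutboundTcpConnection queue (none of these names come from the Cassandra code): the queue holds message objects, and the consumer serializes each one straight to the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DeferredSerializationSketch {
    /** Hypothetical queued message: kept as an object until the writer thread picks it up. */
    static final class QueuedMessage {
        final String id;
        final byte[] body;

        QueuedMessage(String id, byte[] body) {
            this.id = id;
            this.body = body;
        }

        /** Serializes directly to the stream, with no intermediate whole-message byte[]. */
        void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(id);
            out.writeInt(body.length);
            out.write(body);
        }
    }

    /** Stand-in for the connection's consumer loop: drains what is queued and serializes it here. */
    static void drain(BlockingQueue<QueuedMessage> queue, DataOutputStream out) {
        try {
            QueuedMessage message;
            while ((message = queue.poll()) != null) {
                message.writeTo(out); // serialization cost is paid on the writer thread
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BlockingQueue<QueuedMessage> queue = new LinkedBlockingQueue<>();
        queue.add(new QueuedMessage("a", new byte[] {1, 2}));
        queue.add(new QueuedMessage("bc", new byte[] {3, 4, 5}));

        ByteArrayOutputStream socket = new ByteArrayOutputStream(); // stands in for the socket stream
        drain(queue, new DataOutputStream(socket));
        System.out.println("bytes written: " + socket.size());
    }
}
```

The sketch makes the trade-off visible: the producer only enqueues an object, so every call to writeTo runs on the thread that owns the connection.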

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1788.txt
>
>

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v5.txt

Part of the old sendOneWay (the "packbody" copy) looks like this:

{code}
            DataOutputBuffer buffer = new DataOutputBuffer();
            buffer.writeUTF(id);
            Message.serializer().serialize(message, buffer, message.getVersion());
            data = buffer.getData();
{code}

byte[] data is NOT restricted to just the serialized bytes in the buffer -- it will include any unused bytes at the end, as well.

v5 skips garbage bytes like this for backwards compatibility.
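To illustrate the getData() pitfall in isolation, here is a small sketch using a hypothetical ExposedBuffer built on java.io.ByteArrayOutputStream. It is an assumption that DataOutputBuffer behaves the same way; the sketch only shows how a raw backing array can be longer than the bytes written to it.

```java
import java.io.ByteArrayOutputStream;

class BackingArraySketch {
    /** Hypothetical stand-in for a buffer whose getData() returns the raw backing array. */
    static final class ExposedBuffer extends ByteArrayOutputStream {
        ExposedBuffer(int capacity) {
            super(capacity);
        }

        byte[] getData() {
            return buf;   // the whole backing array, unused tail included
        }

        int getLength() {
            return count; // number of bytes actually written
        }
    }

    public static void main(String[] args) {
        ExposedBuffer buffer = new ExposedBuffer(32);
        buffer.write(new byte[] {1, 2, 3}, 0, 3);

        byte[] data = buffer.getData();
        System.out.println("backing array length: " + data.length);        // 32
        System.out.println("bytes written:        " + buffer.getLength()); // 3

        // Sending data.length bytes would put 29 trailing zero bytes on the wire;
        // bounding the write by getLength() avoids that without another copy.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write(data, 0, buffer.getLength());
        System.out.println("bytes on wire:        " + wire.size());        // 3
    }
}
```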

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] [Commented] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068633#comment-13068633 ] 

Brandon Williams commented on CASSANDRA-1788:
---------------------------------------------

So, with the rebased v6 the error is even rarer now.  I now have to do reads/writes to the cluster to trigger it, where before it looked like the occasional gossip message would cause it.  And even doing reads/writes, it's still quite rare: out of the 333k (three node cluster, rf=1, 1M total) inserts/reads to the patched node, only 16 occurrences.  When the patched node is the only coordinator, it never produces an exception on reads; for writes, however, the number of exceptions goes up, to nearly 60 out of 1M inserts.  I suspect there is a problem in ITC or Message where it's not reading something correctly, but it is difficult to trigger.

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v3.txt

v3 w/ Gary's trick of copying just the Header part to maintain compatibility.

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1788-v2.txt, 1788-v3.txt, 1788.txt
>
>

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Fix Version/s: 1.0

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] Commented: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968775#action_12968775 ] 

Gary Dusbabek commented on CASSANDRA-1788:
------------------------------------------

With v4 I still see errors on the rc1 node.

INFO 09:10:43,847 [WRITE-/127.0.0.2] org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:115) error writing to /127.0.0.2

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v6.txt

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment:     (was: 0002-remove-copies-from-network-path.txt)

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] [Commented] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079027#comment-13079027 ] 

Jonathan Ellis commented on CASSANDRA-1788:
-------------------------------------------

+1

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788-v7.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

[jira] Commented: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969912#action_12969912 ] 

Jonathan Ellis commented on CASSANDRA-1788:
-------------------------------------------

Rebooted v4 by first adding MessageSerializerTest with a bytesToHex string of the old-style bytes-on-wire to make sure I'm not breaking it.  Then when I add the new code I'm testing that new code can read old bytes, as well as old code reading new bytes.  Everything comes up clean.  I think I need another set of eyes on this.

(01 is large because I encapsulated MS.instance in a getter to break an initialization-cycle problem.)
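The testing approach described above can be sketched roughly as follows. This is not the actual MessageSerializerTest: the serialize and deserializeBody methods, the simplified <id><body length><body> format, and the hex fixture are all made up for illustration. The idea is to pin the old bytes-on-wire as a hex string, then check both directions against it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class WireCompatSketch {
    // Made-up fixture standing in for bytes captured from the old serializer:
    // writeUTF("7") + int length 2 + body {0x68, 0x69}.
    static final String OLD_WIRE_HEX = "000137000000026869";

    static String bytesToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Hypothetical "new" writer under test: <id><body length><body>. */
    static byte[] serialize(String id, byte[] body) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(id);
            out.writeInt(body.length);
            out.write(body);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical "new" reader under test: returns the body, ignoring the id. */
    static byte[] deserializeBody(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            in.readUTF();
            byte[] body = new byte[in.readInt()];
            in.readFully(body);
            return body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = {0x68, 0x69};
        // The new writer must produce exactly the old bytes...
        System.out.println(bytesToHex(serialize("7", body)).equals(OLD_WIRE_HEX));
        // ...and the new reader must accept bytes the old writer produced.
        System.out.println(Arrays.equals(deserializeBody(hexToBytes(OLD_WIRE_HEX)), body));
    }
}
```

A fixture like this only covers the message shapes it was captured from, so an intermittent failure in a mixed cluster can still slip past it.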

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>
> Currently, we do _three_ unnecessary copies (that is, writing to the socket is necessary; any other copies made are overhead) for each message:
> - constructing the Message body byte[] (this is typically a call to a ICompactSerializer[2] serialize method, but sometimes we cheat e.g. in SchemaCheckVerbHandler's reply)
> - which is copied to a buffer containing the entire Message (i.e. including Header) when sendOneWay calls Message.serializer.serialize()
> - which is copied to a newly-allocated ByteBuffer when sendOneWay calls packIt
> - which is what we write to the socket
> For deserialize we perform a similar orgy of copies:
> - IncomingTcpConnection reads the Message length, allocates a byte[], and reads the serialized Message into it
> - ITcpC then calls Message.serializer().deserialize, which allocates a new byte[] for the body and copies that part
> - finally, the verbHandler (determined by the now-deserialized Message header) deserializes the actual object represented by the body
> Most of these are out of scope for 0.7 but I think we can at least elide the last copy on the write path and the first on the read.



[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Remaining Estimate: 24h
     Original Estimate: 24h

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>



[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-1788:
----------------------------------------

    Attachment:     (was: 1788-v7.txt)

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment:     (was: 0001-setup.txt)

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>


        

[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v4.txt

v4 removes a bogus write(-1) call left over from v1.  With that fix, I have a 0.7 + v4 node working with an rc1 node: I can direct stress.py at either one and have writes go through to the other correctly.

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>



[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788.txt

As planned, removes last copy on write and first on read.  Message.serializer calls are inlined and MessageSerializer is removed to avoid implying that it actually makes sense to call outside of sendOneWay / IncomingTcpConnection.

Bonus fix: removed the copying DataOutputBuffer.asByteArray, replacing it with a wrap() of the existing buffer with the appropriate limit.
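
To illustrate the difference between copying and wrapping, here is a small sketch using only the JDK. ExposedBuffer is a hypothetical class built on ByteArrayOutputStream; it is not Cassandra's DataOutputBuffer, and the method names are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical buffer for illustration; NOT Cassandra's DataOutputBuffer.
class ExposedBuffer extends ByteArrayOutputStream {
    // Copying variant: toByteArray() allocates a new array and copies
    // the bytes written so far into it.
    byte[] asByteArray() {
        return toByteArray();
    }

    // Zero-copy variant: wrap the existing backing array and set the limit
    // to the number of valid bytes, so unused capacity is not exposed.
    ByteBuffer asByteBuffer() {
        return ByteBuffer.wrap(buf, 0, count);
    }

    public static void main(String[] args) {
        ExposedBuffer out = new ExposedBuffer();
        out.write(1);
        out.write(2);
        out.write(3);

        ByteBuffer view = out.asByteBuffer();
        // Only the three written bytes are readable through the view.
        System.out.println(view.remaining());
        // The view shares storage with the stream: no copy was made.
        System.out.println(view.array() != out.asByteArray());
    }
}
```

The trade-off is aliasing: the wrapped view shares the backing array, so it should be consumed before the stream is written to again or reset.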

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1788.txt
>
>



[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment:     (was: 1788-v6.txt)

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v6.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>


        

[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 0002-remove-copies-from-network-path.txt
                0001-setup.txt

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>



[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 0002-remove-copies-from-network-path.txt
                0001-setup.txt

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 0001-setup.txt, 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>



[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v2.txt

Excellent!  v2 attached.

Question is, is it worth breaking compatibility with rc1?  (v1 does not.)  I would lean towards yes personally.

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1788-v2.txt, 1788.txt
>
>



[jira] [Commented] (CASSANDRA-1788) reduce copies on read, write paths

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067270#comment-13067270 ] 

Brandon Williams commented on CASSANDRA-1788:
---------------------------------------------

Here's a clue: the bad magic being passed is always zero.
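
For context, a sketch of what a magic-number check on an incoming header generally looks like. The MAGIC constant and the method names are hypothetical, not Cassandra's actual values. It also shows one way a reader can see zero (consuming four bytes from a buffer that was never filled); the thread does not confirm that this is the cause here.

```java
import java.nio.ByteBuffer;

class MagicCheckSketch {
    // Hypothetical magic value, chosen for illustration only.
    static final int MAGIC = 0xCAFEF00D;

    // Sender side: the header starts with the 4-byte magic.
    static void writeHeader(ByteBuffer out) {
        out.putInt(MAGIC);
    }

    // Receiver side: reject the stream if the first 4 bytes are not the magic.
    static int checkMagic(ByteBuffer in) {
        int magic = in.getInt();
        if (magic != MAGIC) {
            throw new IllegalStateException("bad magic: " + magic);
        }
        return magic;
    }

    public static void main(String[] args) {
        ByteBuffer good = ByteBuffer.allocate(4);
        writeHeader(good);
        good.flip();
        checkMagic(good); // passes

        // A freshly allocated buffer is zero-filled, so reading the magic
        // from it before any data arrives yields 0.
        try {
            checkMagic(ByteBuffer.allocate(4));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "bad magic: 0"
        }
    }
}
```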

> reduce copies on read, write paths
> ----------------------------------
>
>                 Key: CASSANDRA-1788
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
