Posted to commits@cassandra.apache.org by "Joshua McKenzie (JIRA)" <ji...@apache.org> on 2014/04/04 23:21:16 UTC

[jira] [Comment Edited] (CASSANDRA-3668) Parallel streaming for sstableloader

    [ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960435#comment-13960435 ] 

Joshua McKenzie edited comment on CASSANDRA-3668 at 4/4/14 9:19 PM:
--------------------------------------------------------------------

A quick update on this: going the route of multiple StreamSessions per StreamPlan is going to require some restructuring.  The current design assumes a single socket for streaming, and changing to multiple StreamSessions means multiple ConnectionHandlers, each of which assumes ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are responsible for polling the socket, rather than the current paradigm of each being owned by a StreamSession.  On the inbound side, the handler dispatches to the various StreamSessions based on deserialized session indices; on the outbound side, it follows the current PriorityQueue polling mechanism.
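
To make the shape of that restructuring concrete, here is a minimal sketch of a single dispatcher that owns all socket traffic. It is not Cassandra code: the class names (StreamDispatchSketch, Message, Session, Dispatcher), the integer priority field, and the method names are all hypothetical, and the socket itself is left out. Only the two ideas come from the comment above: inbound messages are routed by a deserialized session index, and outbound messages are drained from one shared priority queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch; none of these types are Cassandra's real streaming classes.
public class StreamDispatchSketch {

    /** A message tagged with the index of the session it belongs to. */
    static final class Message implements Comparable<Message> {
        final int sessionIndex;
        final int priority;     // assumption: lower value is sent first
        final String payload;

        Message(int sessionIndex, int priority, String payload) {
            this.sessionIndex = sessionIndex;
            this.priority = priority;
            this.payload = payload;
        }

        @Override
        public int compareTo(Message other) {
            return Integer.compare(priority, other.priority);
        }
    }

    /** Stand-in for a StreamSession: it only records what it receives. */
    static final class Session {
        final List<String> received = new ArrayList<>();

        void receive(Message m) {
            received.add(m.payload);
        }
    }

    /** Single owner of the socket: routes inbound, drains one shared outbound queue. */
    static final class Dispatcher {
        private final Map<Integer, Session> sessions = new HashMap<>();
        private final PriorityBlockingQueue<Message> outbound = new PriorityBlockingQueue<>();

        void register(int sessionIndex, Session session) {
            sessions.put(sessionIndex, session);
        }

        /** Inbound path: called once per message deserialized from the socket. */
        void onInbound(Message m) {
            Session target = sessions.get(m.sessionIndex);
            if (target == null) {
                throw new IllegalStateException("unknown session index " + m.sessionIndex);
            }
            target.receive(m);
        }

        /** Any session may enqueue; only the dispatcher writes to the socket. */
        void enqueue(Message m) {
            outbound.add(m);
        }

        /** Outbound path: next message to write, or null if the queue is empty. */
        Message nextOutbound() {
            return outbound.poll();
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        Session first = new Session();
        Session second = new Session();
        dispatcher.register(0, first);
        dispatcher.register(1, second);

        dispatcher.onInbound(new Message(1, 0, "file chunk for session 1"));
        dispatcher.enqueue(new Message(0, 5, "file data"));
        dispatcher.enqueue(new Message(1, 1, "control message"));

        System.out.println("session 1 received: " + second.received);
        System.out.println("first outbound: " + dispatcher.nextOutbound().payload);
    }
}
```

The point of the sketch is ownership: sessions never touch the channel, so adding more sessions does not add more pollers on the same socket.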

It doesn't look like we're at risk of a bottleneck on network resources even over a single socket, as my preliminary parallelized stream testing is peaking at ~55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections, with diminishing returns as we go higher.  Compared to the 24MB/s I'm benchmarking on a single connection, it's still a respectable increase.


was (Author: joshuamckenzie):
A quick update on this - going the route of multiple StreamSessions per StreamPlan with the current architecture is going to require some restructuring.  The current design assumes a single socket for streaming and multiple StreamSessions means multiple ConnectionHandlers, all of which assume ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are responsible for polling the socket and dispatching to various StreamSessions based on deserialized session indices on the inbound or following the current PriorityQueue polling mechanism for the outbound rather than the current paradigm of being owned by a StreamSession.

It doesn't look like we're at risk of a bottleneck on network resources even over a single socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher.  Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase.

> Parallel streaming for sstableloader
> ------------------------------------
>
>                 Key: CASSANDRA-3668
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Manish Zope
>            Assignee: Joshua McKenzie
>            Priority: Minor
>              Labels: streaming
>             Fix For: 2.1 beta2
>
>         Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> One of my colleagues reported a bug regarding the degraded performance of the sstable generator and sstable loader.
> ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589
> As stated in the above issue, the generator performance has been rectified, but the performance of the sstableloader is still an issue.
> 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the problem with sstableloader still exists.
> So I am opening another issue so that the sstableloader problem does not go unnoticed.
> FYI: We have tested the generator part with the patch given in 3589. It's working fine.
> Please let us know if you require further input from our side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)