Posted to commits@cassandra.apache.org by "Nirmal Ranganathan (JIRA)" <ji...@apache.org> on 2010/07/09 21:12:50 UTC

[jira] Commented: (CASSANDRA-1189) Refactor streaming

    [ https://issues.apache.org/jira/browse/CASSANDRA-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886839#action_12886839 ] 

Nirmal Ranganathan commented on CASSANDRA-1189:
-----------------------------------------------

I'm just summarizing how the current code works, to help with refactoring decisions.
- STREAM_REQUEST is invoked from StorageService during init and decommission on nodes, and from AntiEntropyService to perform streaming repair.
- Destination sends STREAM_REQUEST (compiles the set of ranges that it needs from the source).
- Source responds with STREAM_INITIATE (prepares the requested ranges and sends a list of pending files).
- Destination acknowledges with STREAM_INITIATE_DONE (adds to its list of pending files per node).
- Source starts streaming the first file from the list of files it has prepared for that destination node.
- Destination receives the file and returns a Stream_Status.
- Based on that status, the source either re-streams the file or streams the next one, until the list is complete.
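
The exchange above can be sketched as a small single-process simulation. This is only an illustration, not Cassandra's code: the classes Source and Destination, their method names, the file names, and the boolean standing in for Stream_Status are all hypothetical.

```python
from collections import deque


class Source:
    """Hypothetical stand-in for the node that owns the data."""

    def __init__(self, files_by_range):
        self.files_by_range = files_by_range
        self.pending = {}  # destination name -> deque of files still to send

    def on_stream_request(self, dest, ranges):
        # STREAM_REQUEST received: prepare the files covering the ranges.
        # The returned list plays the role of the STREAM_INITIATE payload.
        files = [f for r in ranges for f in self.files_by_range.get(r, [])]
        self.pending[dest] = deque(files)
        return list(files)

    def next_file(self, dest):
        queue = self.pending[dest]
        return queue[0] if queue else None

    def on_stream_status(self, dest, ok):
        # On success advance to the next file; otherwise the head of the
        # queue stays in place and is streamed again.
        if ok:
            self.pending[dest].popleft()


class Destination:
    """Hypothetical stand-in for the node that requested the data."""

    def __init__(self, name, fail_once=()):
        self.name = name
        self.pending = {}  # source name -> files still expected
        self.received = []
        self.fail_once = set(fail_once)  # files whose first transfer fails

    def on_stream_initiate(self, src, files):
        self.pending[src] = list(files)
        return "STREAM_INITIATE_DONE"

    def on_file(self, src, name):
        if name in self.fail_once:
            self.fail_once.discard(name)
            return False  # status asks the source to re-stream
        self.pending[src].remove(name)
        self.received.append(name)
        return True


def run_stream(source, dest, ranges):
    log = ["STREAM_REQUEST"]
    files = source.on_stream_request(dest.name, ranges)
    log.append("STREAM_INITIATE")
    log.append(dest.on_stream_initiate("source", files))
    while True:
        name = source.next_file(dest.name)
        if name is None:
            break
        ok = dest.on_file("source", name)
        log.append(name + (":ok" if ok else ":retry"))
        source.on_stream_status(dest.name, ok)
    return log


src = Source({"r1": ["a.db", "b.db"], "r2": ["c.db"]})
dst = Destination("node2", fail_once=["b.db"])
log = run_stream(src, dst, ["r1", "r2"])
```

Note that both sides key their pending lists by the peer node alone, which is the assumption the limitations below turn on.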

Limitations:
- Source can transfer multiple streams, but only one per destination node.
- Destination can receive multiple streams, but again only one per source node.
- The ack process for pending files breaks for multiple streams or overlapping streams.
- The list of pending files must be streamed in order; the current scheme maintains that order.
- No session is maintained on a per-stream basis; the session is loosely based on the node.
- Streamed files carry no header describing the file; that metadata is transferred during STREAM_INITIATE, and the destination relies on it.
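
To make the ack and session limitations concrete, the sketch below contrasts bookkeeping keyed by node with bookkeeping keyed by a per-stream session. It is hypothetical throughout: the class names, the session ids, and the choice to overwrite the pending list on a second initiate are assumptions for illustration, and the real failure mode in the current code may differ.

```python
class NodeKeyedPending:
    """Pending files tracked per source node only (assumed model)."""

    def __init__(self):
        self.pending = {}

    def initiate(self, node, files):
        # In this sketch a second initiate from the same node replaces
        # the list recorded by the first one.
        self.pending[node] = list(files)

    def ack(self, node, name):
        # True only if the file was still expected from this node.
        expected = self.pending.get(node, [])
        if name in expected:
            expected.remove(name)
            return True
        return False


class SessionKeyedPending:
    """Hypothetical alternative: one list per (node, session id)."""

    def __init__(self):
        self.pending = {}

    def initiate(self, node, session_id, files):
        self.pending[(node, session_id)] = list(files)

    def ack(self, node, session_id, name):
        expected = self.pending.get((node, session_id), [])
        if name in expected:
            expected.remove(name)
            return True
        return False


# Two overlapping streams from the same node, e.g. bootstrap and repair.
by_node = NodeKeyedPending()
by_node.initiate("nodeA", ["boot-1.db"])
by_node.initiate("nodeA", ["repair-1.db"])
node_boot_ack = by_node.ack("nodeA", "boot-1.db")      # lost by the overwrite
node_repair_ack = by_node.ack("nodeA", "repair-1.db")

by_session = SessionKeyedPending()
by_session.initiate("nodeA", 1, ["boot-1.db"])
by_session.initiate("nodeA", 2, ["repair-1.db"])
session_boot_ack = by_session.ack("nodeA", 1, "boot-1.db")
session_repair_ack = by_session.ack("nodeA", 2, "repair-1.db")
```

With node-only keys the first stream's file can no longer be acknowledged once the second stream starts; with a session key each stream keeps its own list.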

> Refactor streaming
> ------------------
>
>                 Key: CASSANDRA-1189
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1189
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7
>            Reporter: Gary Dusbabek
>            Assignee: Gary Dusbabek
>            Priority: Critical
>             Fix For: 0.7
>
>
> The current architecture is buggy because it makes the assumption that only one stream can be in process between two nodes at a given time, and stream send order never changes.  Because of this, the ACK process gets fouled up when other services wish to stream files.
> The process is somewhat contorted too (request, initiate, initiate done, send).
