Posted to commits@cassandra.apache.org by "Vijay (JIRA)" <ji...@apache.org> on 2012/09/22 20:26:07 UTC

[jira] [Created] (CASSANDRA-4705) Speculative execution for CL_ONE

Vijay created CASSANDRA-4705:
--------------------------------

             Summary: Speculative execution for CL_ONE
                 Key: CASSANDRA-4705
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4705
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Vijay
            Assignee: Vijay
            Priority: Minor


When read_repair is not 1.0, we send the request to only one node for some of the requests. When a node goes down or is too busy, the client has to wait for the timeout before it can retry.

It would be nice to watch for latency and execute an additional request to a different node if the response is not received within the average or 99th percentile of the response times recorded in the past.

CASSANDRA-2540 might be able to solve the variance when read_repair is set to 1.0.

1) Maybe we need to use metrics-core to record various percentiles.
2) Modify ReadCallback.get to execute an additional request speculatively.
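
As a rough illustration of item 2 (an editor's sketch, not code from any patch; the class and method names are hypothetical placeholders), the coordinator-side wait could be split into two phases: wait up to the observed latency percentile, fire a backup read if nothing has arrived, then wait out the remainder of the rpc timeout.

{code:java}
// Hedged sketch only -- hypothetical names, not code from any patch.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SpeculativeReadCallback
{
    private final CountDownLatch condition = new CountDownLatch(1);
    private final long speculateThresholdMs; // e.g. the recorded p95/p99 read latency
    private final long rpcTimeoutMs;

    public SpeculativeReadCallback(long speculateThresholdMs, long rpcTimeoutMs)
    {
        this.speculateThresholdMs = speculateThresholdMs;
        this.rpcTimeoutMs = rpcTimeoutMs;
    }

    public void get() throws InterruptedException, TimeoutException
    {
        // Phase 1: wait only as long as a "normal" response should take.
        if (condition.await(speculateThresholdMs, TimeUnit.MILLISECONDS))
            return;
        // No response within the usual latency: fire a backup read at another replica.
        sendBackupRequestToNextReplica();
        // Phase 2: wait out the remainder of the regular rpc timeout.
        if (!condition.await(rpcTimeoutMs - speculateThresholdMs, TimeUnit.MILLISECONDS))
            throw new TimeoutException();
    }

    public void onResponse()
    {
        condition.countDown();
    }

    private void sendBackupRequestToNextReplica()
    {
        // Placeholder: would send the same read to the next live replica.
    }
}
{code}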


[jira] [Comment Edited] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466390#comment-13466390 ] 

Vijay edited comment on CASSANDRA-4705 at 9/30/12 2:02 PM:
-----------------------------------------------------------

I pushed the prototype code into https://github.com/Vijay2win/cassandra/commit/62bbabfc41ba8e664eb63ba50110e5f5909b2a87

Looks like metrics-core exposes the 75th, 95th, 97th, 99th and 99.9th percentiles. In my tests the 75th percentile is too low and the 99th is too high to make a difference, whereas the 95th percentile handles the long tail better (a moving average doesn't make much of a difference either). It also supports ALL, AUTO and NONE (current behavior), as per Jonathan's comment above.

But I still think we should also support a hard-coded value in addition to the automatic one :)

Note: speculative_retry still has to be made part of the schema; currently, if you want to test it, it requires a code change in CFMetaData.
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Chris Burroughs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470461#comment-13470461 ] 

Chris Burroughs commented on CASSANDRA-4705:
--------------------------------------------

> Looks like metrics-core exposes 75, 95, 97, 99 and 99.9

Reporters have a limited set (i.e., you can't generate new values that will pop up in JMX on the fly), but in code you should be able to get at any percentile you want: https://github.com/codahale/metrics/blob/2.x-maintenance/metrics-core/src/main/java/com/yammer/metrics/stats/Snapshot.java#L54
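
In other words (a small sketch against the metrics-core 2.x API linked above; treat the exact factory-method signature as a best-effort assumption rather than authoritative):

{code:java}
import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Histogram;
import com.yammer.metrics.stats.Snapshot;

public class PercentileExample
{
    public static void main(String[] args)
    {
        // Biased (exponentially decaying) histogram of read latencies in micros.
        Histogram readLatencies = Metrics.newHistogram(PercentileExample.class, "read-latencies", true);
        for (long micros : new long[] { 900, 1100, 1200, 5000, 25000 })
            readLatencies.update(micros);

        // Reporters/JMX only expose the fixed percentile set, but in code
        // Snapshot.getValue() accepts any quantile.
        Snapshot snapshot = readLatencies.getSnapshot();
        System.out.printf("p95=%.0f p97=%.0f%n", snapshot.getValue(0.95), snapshot.getValue(0.97));
    }
}
{code}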
                

[jira] [Updated] (CASSANDRA-4705) Speculative execution for Reads

Posted by "Vijay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4705:
-----------------------------

    Attachment: 0001-CASSANDRA-4705-v3.patch

Hi Jonathan, I think the attached patch covers all the previous concerns, except that the timeout is not yet in micros. I created #5014 to convert the timeouts to microseconds. Thanks!
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503260#comment-13503260 ] 

Vijay commented on CASSANDRA-4705:
----------------------------------

Hi Jonathan, Sorry for the delay.

{quote}
Would it make more sense to have getReadLatencyRate and UpdateSampleLatencies into SR? that way we could replace case statements with polymorphism.
{quote}
The problem is that we have to perform the expensive percentile calculation asynchronously using a scheduled TPE. Could we avoid the switch by introducing an additional SRFactory which initializes the TPE as the CF settings change? Let me know.
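
For illustration, the shape of that async approach might look something like the following (hedged sketch; every name here is a placeholder rather than code from the patch):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import com.yammer.metrics.core.Timer;

public class SpeculativeRetryThreshold
{
    private static final ScheduledExecutorService updater = Executors.newSingleThreadScheduledExecutor();

    private final Timer readLatency;   // per-CF read latency timer
    private final double quantile;     // e.g. 0.95 for auto95
    private volatile long cachedThreshold = Long.MAX_VALUE;

    public SpeculativeRetryThreshold(final Timer readLatency, final double quantile)
    {
        this.readLatency = readLatency;
        this.quantile = quantile;
        // Recompute the (comparatively expensive) percentile off the read path.
        updater.scheduleAtFixedRate(new Runnable()
        {
            public void run()
            {
                cachedThreshold = (long) readLatency.getSnapshot().getValue(quantile);
            }
        }, 1, 1, TimeUnit.SECONDS);
    }

    /** Cheap read-path lookup: how long to wait before firing a backup read. */
    public long threshold()
    {
        return cachedThreshold;
    }
}
{code}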

{quote}
Why does preprocess return a boolean now?
{quote}
The current patch uses the boolean to indicate whether the processing was actually done... after the patch, RCB uses it when the coordinator receives more than one response from the same host (when SR is on and the actual read response arrives at the same time as the speculated response); we should not count that towards the consistency level.

{quote}
How does/should SR interact with RR? Using ALL + RRR
{quote}
Currently we do an additional read to double-check whether we need to write; I thought the goal for ALL was to eliminate that and do an additional write instead... in most cases it will just be a memtable update :)
I can think of 2 options:
1) Just document the ALL case and live with the additional writes; it might not be a big issue for most users, and the rest can switch to the default behavior.
2) We can queue the repair mutations; in the async thread we can check whether there are duplicate mutations pending, and if so simply ignore the duplicates. This can be done by doing sendRR and adding the CF to be repaired to a HashSet (at the cost of an additional memory footprint).

Should we move this discussion to a different ticket?

Let me know, Thanks!
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Peter Schuller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462450#comment-13462450 ] 

Peter Schuller commented on CASSANDRA-4705:
-------------------------------------------

99% based on what time period? If the period is too short, you won't get the full impact since you'll pollute the track record. If it's too large, consider the traffic increase resulting from a prolonged hiccup. Will you be able to hide typical GC pauses? Then you'd better have the window be higher than 250 ms. What about full GCs? How do you determine what the p99 is for a node that is part of multiple replica sets? If a single node goes into full GC, how do you keep latency unaffected while still capping the number of backup requests at a reasonable number? If you don't cap it, the optimization is more dangerous than useful, since it just means you'll fall over under various hard-to-predict emergent situations if you expect to take advantage of fewer reads when provisioning your cluster. What's an appropriate cap? How do you scale that with RF and consistency level? How do you explain this to the person who has to figure out how much capacity is needed for a cluster?

In our case, we pretty much run all our clusters with RR turned fully up - not necessarily for RR purposes, but for the purpose of more deterministic behavior. You don't want things falling over when a replica goes down. If you don't have the iops/CPU to handle all replicas processing all requests for a replica set, you're at risk of falling over (i.e., you don't scale, because failures are common in large clusters) - unless you over-provision, but then you might as well do full data reads to begin with.

I am not arguing against the idea of backup requests, but I *strongly* recommend simply going for the trivial and obvious route of full data reads *first* and getting the obvious pay-off with no increase in complexity (I would even argue it's a *decrease* in complexity in terms of the behavior of the system as a whole, especially from the perspective of a human understanding emergent cluster behavior) - and then slowly develop something like this, with very careful thought to all the edge cases and implications of it.

I'm in favor of long-term *predictable* performance. Full data reads is a very very easy way to achieve that, and vastly better latency, in many cases (the bandwidth saturation case pretty much being the major exception; CPU savings aren't really relevant with Cassandra's model if you expect to survive nodes being down). It's also very easy for a human to understand the behavior when looking at graphs of system behavior in some event, and trying to predict what will happen, or explain what did happen.

I really think the drawbacks of full data reads are being massively over-estimated and the implications of lack of data reads massively under-estimated.

                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462762#comment-13462762 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

I don't like the idea of making users manually specify thresholds.  They will usually get it wrong, and we have latency histograms that should let us do a better job automagically.

But I could see the value of a setting to allow disabling it when you know your CF has a bunch of different query types being thrown at it.  Something like speculative_retry = {off, automatic, full} where full is Peter's full data reads to each replica.
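
For concreteness, the shape of such a setting might be roughly as follows (an editor's sketch mirroring the {off, automatic, full} idea above; not the eventual syntax or names):

{code:java}
// Hypothetical per-CF option; illustrative only.
public enum SpeculativeRetry
{
    OFF,        // current behavior: never send a backup read
    AUTOMATIC,  // derive the retry threshold from recorded latency percentiles
    FULL;       // Peter's full data reads to each replica, sent up front

    public static SpeculativeRetry fromString(String value)
    {
        return valueOf(value.trim().toUpperCase());
    }
}
{code}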
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508068#comment-13508068 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

Okay, let's leave UpdateSampleLatencies alone (although as a matter of style I'd prefer to inline it as an anonymous Runnable).

Thinking more about the core functionality:

- a RetryType of "one pre-emptive redundant data read" would be a useful alternative to ALL.  (If supporting both makes things more complex, I would vote for just supporting the single extra read.)  E.g., for a CL.ONE read it would perform two data reads; for CL.QUORUM it would perform two data reads and a digest read.  Put another way, it would do the same extra data read Xpercentile would, but it would do it ahead of the threshold timeout.
- ISTM we should continue to use RDR for normal (non-RR) SR reads, and just accept the first data reply that comes back without comparing it to others.  This makes the most sense to me semantically, and keeps CL.ONE reads lightweight.
- I think it's incorrect (again, in the non-RR case) to perform a data read against the same host we sent a digest read to.  Consider CL.QUORUM: I send a data read to replica X and a digest to replica Y.  X is slow to respond.  Doing a data read to Y won't help, since I need both to meet my CL.  I have to do my SR read to replica Z, if one exists and is alive.
- We should probably extend this to doing extra digest reads for CL > ONE, when we get the data read back quickly but the digest read is slow.
- SR + RR is the tricky part... this is where SR could result in data and digests from the same host.  So ideally, we want the ability to compare (potentially) multiple data reads, *and* multiple digests, *and* track the source for CL purposes, which neither RDR nor RRR is equipped to do.  Perhaps we should just force all reads to data reads for SR + RR [or even for all RR reads], to simplify this.

Finally,
- millis may be too coarse a grain here, especially for Custom settings.  Currently an in-memory read will typically be under 2ms and it's quite possible we can get that down to 1 if we can purge some of the latency between stages.  Might as well use micros since Timer gives it to us for free, right?
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461478#comment-13461478 ] 

Vijay commented on CASSANDRA-4705:
----------------------------------

No, DSnitch watches for the latency but doesn't do the latter... it won't speculatively execute duplicate requests to another host if the response times are > x%.

I think this patch will be in addition to dsnitch, something like what Jonathan posted in CASSANDRA-2540:

{quote}
I like the approach described in http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf of doing "backup" requests if the original doesn't reply within N% of normal.
{quote}
                

[jira] [Updated] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4705:
-----------------------------

    Attachment: 0001-CASSANDRA-4705-v2.patch

Attached patch supports:
            - ALL
            - Xpercentile
            - Xms
            - NONE


Optionally we might also need to rename RowRepairResolver to RowAllDataResolver or something.
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470470#comment-13470470 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

Thanks Chris!
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462515#comment-13462515 ] 

Vijay commented on CASSANDRA-4705:
----------------------------------

{quote}
99% based on what time period? If period it too short, you won't get the full impact since you'll pollute the track record. If it's too large, consider the traffic increase resulting from a prolonged hiccup
{quote}
That's the hardest problem I am trying to solve right now :) Surprisingly (to me), the code to send a backup request is itself not complicated.

{quote}
Will you be able to hide typical GC pauses?
{quote}
Worst case we send some extra requests, which IMO is OK for a few milliseconds.

While working on AWS the network is usually not that predictable, and with MR clusters we were reluctant to enable RR.
This is not something new to me; we did something like this back at NFLX (we never gave it a fancy name :)) in the client (http://netflix.github.com/astyanax/javadoc/com/netflix/astyanax/Execution.html#executeAsync()) to retry independently of the default rpc_timeout.

{quote}
 am not arguing against the idea of backup requests, but I strongly recommend simply going for the trivial and obvious route of full data reads
{quote}
I am neutral about this; originally the idea was to move the above logic, which was done in the client, back into the server.

{quote}
Here's a good example of complexity implication that I just thought of 
...
{quote}
How about we provide an override for users with multiple kinds of requests? We can override via a CF setting which would be something like a timeout... wait for x seconds before sending a secondary request.
                

[jira] [Updated] (CASSANDRA-4705) Speculative execution for Reads

Posted by "Vijay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4705:
-----------------------------

    Summary: Speculative execution for Reads  (was: Speculative execution for CL_ONE)
    

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Peter Schuller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462456#comment-13462456 ] 

Peter Schuller commented on CASSANDRA-4705:
-------------------------------------------

Here's a good example of a complexity implication that I just thought of (and it's stuff like this I'm worried about w.r.t. complexity): how do you split requests into "groups" within which to do latency profiling? If you don't, you'll easily end up having the expensive requests always be processed multiple times, because they always hit the backup path (because they are expensive and thus latent). So you could very easily "eat up" all your intended benefit by having the very expensive requests take the backup path. Without knowledge of the nature of the requests, and since we cannot reliably just assume a homogeneous request pattern, you would probably need some non-trivial way of classifying requests and relating that classification to the statistics you keep.

In some cases, having it be a per-CF setting might be enough. In other cases that's not feasible - for example, maybe you're doing slicing on large rows, and it's impossible to determine from an incoming request whether it's expensive or not (the range may be large but result in only a single column, for example).

What if you don't care about the latency of the "legitimately expensive" requests, but about the cheap ones? And what if those "legitimately expensive" requests consume your 1% (p99), such that none of the "cheaper" requests are subject to backup requests? Now you get none of the benefit, but you still take the brunt of the cost you'd have if you just went with full data reads.

I'm sure there are many other concerns I'm not thinking of; this was meant as an example of how it can be hard to make this actually work the way it's intended.

                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472535#comment-13472535 ] 

Vijay commented on CASSANDRA-4705:
----------------------------------

Cool, let me work on the patch soon... 
{quote}
are both doubles?
{quote}
Well, it will be a long in ms :)
                

[jira] [Updated] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4705:
-----------------------------

    Issue Type: Improvement  (was: Bug)
    

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466559#comment-13466559 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

Well, we have a pretty short list of possibilities from metrics...  I guess we could add auto95, auto97, auto99 options?
                

[jira] [Updated] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4705:
-----------------------------

    Attachment: 0001-CASSANDRA-4705.patch

Hi Jonathan, the custom value is useful in cases where a user can say: my SLA is 20 ms and I want the coordinator to retry the read after 15 ms, or more aggressively after 10 ms.

Attached patch supports the following:

- ALL
- auto95 (Default)
- auto98
- auto99
- auto999
- autoMean
- NONE (current behavior)

                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472552#comment-13472552 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

our history has been that sooner or later someone always wants fractional ms, but I'm fine w/ long (or int) :)
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461805#comment-13461805 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

FTR I'm not sure CL.ONE is going to be substantially easier than generalizing to all CL.
                

[jira] [Comment Edited] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503260#comment-13503260 ] 

Vijay edited comment on CASSANDRA-4705 at 11/23/12 6:28 PM:
------------------------------------------------------------

Hi Jonathan, Sorry for the delay.

{quote}
Would it make more sense to have getReadLatencyRate and UpdateSampleLatencies into SR? that way we could replace case statements with polymorphism.
{quote}
The problem is that we have to perform the expensive percentile calculation asynchronously using a scheduled TPE. Could we avoid the switch by introducing an additional SRFactory which initializes the TPE as the CF settings change? Let me know.

{quote}
Why does preprocess return a boolean now?
{quote}
The current patch uses the boolean to indicate whether the processing was actually done... after the patch, RCB uses it when the coordinator receives more than one response from the same host (when SR is on and the actual read response arrives at the same time as the speculated response); we should not count that towards the consistency level.

{quote}
How does/should SR interact with RR? Using ALL + RRR
{quote}
Currently we do an additional read to double-check whether we need to write; I thought the goal for ALL was to eliminate that and do an additional write instead... in most cases it will just be a memtable update :)
I can think of 2 options:
1) Just document the ALL case and live with the additional writes; it might not be a big issue for most users, and the rest can switch to the default behavior.
2) We can queue the repair mutations; in the async thread we can check whether there are duplicate mutations pending, and if so simply ignore the duplicates. This can be done by doing sendRR and adding the CF to be repaired to a HashSet (at the cost of an additional memory footprint).

Should we move this discussion to a different ticket?

Let me know, Thanks!
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501877#comment-13501877 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

avro is just used for upgrading from 1.0 schemas, so shouldn't need to touch that anymore.

Would it make more sense to have getReadLatencyRate and UpdateSampleLatencies into SR?  that way we could replace case statements with polymorphism.

Can you split the AbstractReadExecutor refactor out from the speculative execution code?  That would make it easier to isolate the changes in review.

Why does preprocess return a boolean now?

How does/should SR interact with RR?  Using ALL + RRR means we're probably going to do a lot of unnecessary "repair" writes in a high-update environment (i.e., it would be normal for one replica to be slightly behind others on a read), which is probably not what we want.  Also unclear to me what happens when we use RDR and do a SR when we've also requested extra digests for RR, and we get a data read and a digest from the same replica.
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466390#comment-13466390 ] 

Vijay commented on CASSANDRA-4705:
----------------------------------

I pushed the prototype code into https://github.com/Vijay2win/cassandra/commit/62bbabfc41ba8e664eb63ba50110e5f5909b2a87

Looks like metrics-core exposes the 75th, 95th, 97th, 99th and 99.9th percentiles. In my tests the 75th percentile is too low and the 99th is too high to make a difference, whereas the 95th percentile handles the long tail better (a moving average doesn't make much of a difference either).

I still think we should also support a hard-coded value in addition to the automatic one :)

Note: speculative_retry still has to be made part of the schema; currently, if you want to test it, it requires a code change in CFMetaData.
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472500#comment-13472500 ] 

Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------

So I guess we could support {ALL, Xpercentile, Yms, NONE} where X and Y are both doubles?
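
Interpreting that literally, parsing such values could look roughly like this (an editor's hedged sketch; the exact suffix spelling is an assumption, not the final syntax):

{code:java}
public class RetryOption
{
    enum Kind { ALL, PERCENTILE, CUSTOM_MS, NONE }

    final Kind kind;
    final double value; // X for Xpercentile, Y for Yms, unused otherwise

    private RetryOption(Kind kind, double value)
    {
        this.kind = kind;
        this.value = value;
    }

    static RetryOption fromString(String s)
    {
        String upper = s.trim().toUpperCase();
        if (upper.equals("ALL"))
            return new RetryOption(Kind.ALL, 0);
        if (upper.equals("NONE"))
            return new RetryOption(Kind.NONE, 0);
        if (upper.endsWith("PERCENTILE"))
            return new RetryOption(Kind.PERCENTILE, Double.parseDouble(upper.replace("PERCENTILE", "")));
        if (upper.endsWith("MS"))
            return new RetryOption(Kind.CUSTOM_MS, Double.parseDouble(upper.replace("MS", "")));
        throw new IllegalArgumentException("unrecognized speculative_retry value: " + s);
    }
}
{code}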
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461464#comment-13461464 ] 

Brandon Williams commented on CASSANDRA-4705:
---------------------------------------------

bq. It would be nice to watch for latency and execute an additional request to a different node

Isn't this what the dsnitch does to some degree?
                

[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

Posted by "Lior Golan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462510#comment-13462510 ] 

Lior Golan commented on CASSANDRA-4705:
---------------------------------------

How about just letting the user configure a threshold, above which a backup request will be sent?
This can be an easy way to start with this feature (saving the need to estimate the p99 point).
It will allow what Peter is suggesting above (just set the threshold to 0), and will allow the user to tune the tradeoff between latency and throughput.

It would be cool to be able to set this threshold on a per request basis, similar to how CL is specified.

But thinking about this a bit more - isn't such a feature better implemented at the client library level? Implementing it there would also allow handling cases where the StorageProxy is down (i.e. GC at the coordinator), and would make it easier to specify at the per-request level (no need to pollute the protocol with this setting).
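
A client-side variant of the backup-request idea could look roughly like this (editor's sketch with hypothetical names; not Astyanax or any real driver API): send the request to one coordinator, and if it hasn't answered within the threshold, send it to a second coordinator and take whichever answer arrives first.

{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BackupRequestClient
{
    private final ExecutorService pool = Executors.newCachedThreadPool();

    public <T> T execute(Callable<T> primary, Callable<T> backup, long thresholdMs, long timeoutMs)
            throws Exception
    {
        CompletionService<T> requests = new ExecutorCompletionService<T>(pool);
        requests.submit(primary);
        // Wait up to the user-configured threshold for the first coordinator.
        Future<T> done = requests.poll(thresholdMs, TimeUnit.MILLISECONDS);
        if (done != null)
            return done.get();
        // Too slow: fire the backup request and take whichever answer arrives first.
        requests.submit(backup);
        done = requests.poll(timeoutMs - thresholdMs, TimeUnit.MILLISECONDS);
        if (done == null)
            throw new TimeoutException("no reply from either coordinator");
        return done.get();
    }
}
{code}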
                