Posted to commits@cassandra.apache.org by "Kelvin Kakugawa (JIRA)" <ji...@apache.org> on 2011/01/14 01:32:46 UTC

[jira] Created: (CASSANDRA-1985) read repair on CL.ONE regression

read repair on CL.ONE regression
--------------------------------

                 Key: CASSANDRA-1985
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.7.0, 0.7.1, 0.8
            Reporter: Kelvin Kakugawa
            Assignee: Kelvin Kakugawa
             Fix For: 0.7.1, 0.8


read repair w/ CL.ONE had a regression.

The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1985:
--------------------------------------

    Affects Version/s:     (was: 0.7.0)
                           (was: 0.8)
        Fix Version/s:     (was: 0.8)

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1
>
>         Attachments: 1985-v2.txt, CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981874#action_12981874 ] 

Jonathan Ellis commented on CASSANDRA-1985:
-------------------------------------------

The second RR is this one:

{code}
                RepairCallback<Row> handler = repair(command, endpoints);
...
                repairResponseHandlers.add(handler);
...
            for (RepairCallback<Row> handler : repairResponseHandlers)
            {
                try
                {
                    Row row = handler.get();
{code}

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Updated: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-1985:
---------------------------------------

    Attachment: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch

ensure RR happens in the background.

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981846#action_12981846 ] 

Kelvin Kakugawa commented on CASSANDRA-1985:
--------------------------------------------

Yes, you're right, that does schedule it, once.

The process for CL.ONE is:
1) schedule RR for data+digest and watch for a DigestMismatchException,
2) catch DME and call repair() to do a RR for data-only.

However, the handler for the second RR (the one that repair() returns) is never used.  So, even though it collects all the data repair messages, the ReadResponseResolver's resolve() never gets called.
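
The two steps above can be sketched roughly as follows. This is only an illustration: every class and method name here (DigestMismatchSketch, Versioned, readWithRepair, and so on) is hypothetical rather than Cassandra's actual code, MD5 is an assumed digest choice, and resolve() here just picks the newest timestamp without writing anything back to stale replicas.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not Cassandra's actual classes.
class DigestMismatchSketch
{
    // One replica's copy of a value, with the timestamp used to pick a winner.
    static final class Versioned
    {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp)
        {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    static class DigestMismatch extends Exception {}

    static byte[] digest(Versioned v)
    {
        try
        {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return md.digest((v.value + "@" + v.timestamp).getBytes(StandardCharsets.UTF_8));
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new AssertionError(e);
        }
    }

    // Step 1: compare the data replica's digest with the digests from the other replicas.
    static void checkDigests(Versioned data, List<byte[]> digests) throws DigestMismatch
    {
        byte[] expected = digest(data);
        for (byte[] d : digests)
            if (!Arrays.equals(expected, d))
                throw new DigestMismatch();
    }

    // Reconcile full data responses by keeping the newest version.
    static Versioned resolve(List<Versioned> responses)
    {
        Versioned newest = responses.get(0);
        for (Versioned v : responses)
            if (v.timestamp > newest.timestamp)
                newest = v;
        return newest;
    }

    // Step 2: on a mismatch, fall back to full data from every replica and resolve it.
    static Versioned readWithRepair(List<Versioned> replicas)
    {
        Versioned data = replicas.get(0);
        List<byte[]> digests = new ArrayList<>();
        for (Versioned r : replicas.subList(1, replicas.size()))
            digests.add(digest(r));
        try
        {
            checkDigests(data, digests);
            return data;
        }
        catch (DigestMismatch e)
        {
            // If this resolve step is skipped, the second read gathers data that nobody reconciles.
            return resolve(replicas);
        }
    }
}
```

The point of the sketch is the catch block: the second, data-only read is useful only if something actually calls resolve() on what it collected.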


> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981900#action_12981900 ] 

Jonathan Ellis commented on CASSANDRA-1985:
-------------------------------------------

I get it now: the callback from the repair() call in RepairRunner is the one that we don't resolve.

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982160#action_12982160 ] 

Hudson commented on CASSANDRA-1985:
-----------------------------------

Integrated in Cassandra-0.7 #162 (See [https://hudson.apache.org/hudson/job/Cassandra-0.7/162/])
    fix read repair on CL.ONE regression
patch by jbellis; reviewed by tjake for CASSANDRA-1985


> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: Kelvin Kakugawa
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.1
>
>         Attachments: 1985-v2.txt, CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981898#action_12981898 ] 

Kelvin Kakugawa commented on CASSANDRA-1985:
--------------------------------------------

Yes, that's correct for read CL > ONE.  A quorum / all read goes through that path.  However, the CL.ONE case does not go through that code path.

The code branches in fetchRows(...) where it checks randomlyReadRepair(...).  If the number of targets exceeds handler.blockfor, it does a background repair via RepairRunner in service.StorageProxy; i.e., it won't go through the block of code you pasted, because a DigestMismatchException won't be thrown for CL.ONE.

Now, let's look at RepairRunner : runMayThrow.  It calls repair(command, endpoints), but the RepairCallback<Row> that is returned by repair(...) is dropped on the floor.  So, resolve is never called on that RepairCallback's ReadResponseResolver.

The above error was found via my own set of distributed tests.
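
To make the dropped-callback pattern concrete, here is a minimal sketch. All names in it (DroppedCallbackSketch, Callback, runDropping, runResolving) are hypothetical, and it assumes that calling get() on the callback is what triggers resolution; it is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not Cassandra's actual classes.
class DroppedCallbackSketch
{
    // Collects replica responses; nothing is reconciled until get() is called.
    static final class Callback
    {
        final List<String> responses = new ArrayList<>();
        boolean resolved = false;

        String get()
        {
            resolved = true; // stands in for the resolver's resolve() step (an assumption)
            return responses.isEmpty() ? null : responses.get(0);
        }
    }

    // Every callback handed out by repair(), recorded only so the example can inspect them.
    final List<Callback> issued = new ArrayList<>();

    // Stands in for repair(command, endpoints): gathers replies into a new callback.
    Callback repair(List<String> replies)
    {
        Callback cb = new Callback();
        cb.responses.addAll(replies);
        issued.add(cb);
        return cb;
    }

    // Shape of the regression: the returned callback is ignored, so it is never resolved.
    void runDropping(List<String> replies)
    {
        repair(replies);
    }

    // Shape of a fix: keep the callback and ask it for its result.
    String runResolving(List<String> replies)
    {
        return repair(replies).get();
    }
}
```

In runDropping the responses are still collected, which is why the problem is easy to miss: the messages arrive, but no code path ever reconciles them.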

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981626#action_12981626 ] 

Jonathan Ellis commented on CASSANDRA-1985:
-------------------------------------------

It's possible that we missed something, but we did test read repair post-982.  This is the part that does the resolve:

                if (repairs.contains(command))
                    repairExecutor.schedule(new RepairRunner(readCallback.resolver, command, endpoints), DatabaseDescriptor.getRpcTimeout(), TimeUnit.MILLISECONDS);


> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Issue Comment Edited: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981874#action_12981874 ] 

Jonathan Ellis edited comment on CASSANDRA-1985 at 1/14/11 1:51 PM:
--------------------------------------------------------------------

The "second RR" (that is, the second read request, for performing repair when a mismatch was detected by the digest read) is this one:

{code}
                RepairCallback<Row> handler = repair(command, endpoints);
...
                repairResponseHandlers.add(handler);
...
            for (RepairCallback<Row> handler : repairResponseHandlers)
            {
                try
                {
                    Row row = handler.get();
{code}

      was (Author: jbellis):
    The second RR is this one:

{code}
                RepairCallback<Row> handler = repair(command, endpoints);
...
                repairResponseHandlers.add(handler);
...
            for (RepairCallback<Row> handler : repairResponseHandlers)
            {
                try
                {
                    Row row = handler.get();
{code}
  
> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Commented: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981952#action_12981952 ] 

T Jake Luciani commented on CASSANDRA-1985:
-------------------------------------------

+1

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1
>
>         Attachments: 1985-v2.txt, CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Resolved: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1985.
---------------------------------------

    Resolution: Fixed
      Reviewer: tjake
      Assignee: Jonathan Ellis  (was: Kelvin Kakugawa)

committed

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: Kelvin Kakugawa
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.1
>
>         Attachments: 1985-v2.txt, CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Updated: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1985:
--------------------------------------

    Attachment: 1985-v2.txt

v2 keeps the resolve off the response stage, which we want to keep very low latency.
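
One way to picture keeping the resolve off a latency-sensitive stage is the sketch below. It is not the contents of 1985-v2.txt; the class name, the two single-threaded executors, and the thread names are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; the executors and thread names are assumptions, not Cassandra's stages.
class OffStageResolveSketch
{
    // Returns the name of the thread that ended up doing the (simulated) resolve.
    static String demo()
    {
        ExecutorService responseStage = Executors.newSingleThreadExecutor(r -> new Thread(r, "response-stage"));
        ExecutorService repairStage = Executors.newSingleThreadExecutor(r -> new Thread(r, "repair-stage"));
        try
        {
            CompletableFuture<String> resolvedOn = new CompletableFuture<>();

            // The response stage only hands the work off, so it returns quickly.
            responseStage.submit(() -> {
                repairStage.execute(() -> resolvedOn.complete(Thread.currentThread().getName()));
            }).get();

            return resolvedOn.get(5, TimeUnit.SECONDS);
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
        finally
        {
            responseStage.shutdown();
            repairStage.shutdown();
        }
    }
}
```

The response-handling thread does nothing beyond the hand-off; the potentially slow reconciliation runs on the other executor.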

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.8
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7.1, 0.8
>
>         Attachments: 1985-v2.txt, CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.



[jira] Updated: (CASSANDRA-1985) read repair on CL.ONE regression

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1985:
--------------------------------------

    Remaining Estimate: 2h
     Original Estimate: 2h

> read repair on CL.ONE regression
> --------------------------------
>
>                 Key: CASSANDRA-1985
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1985
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: Kelvin Kakugawa
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.1
>
>         Attachments: 1985-v2.txt, CASSANDRA-1985-0001-fix-CL.ONE-read-repair-regression.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> read repair w/ CL.ONE had a regression.
> The RepairCallback was dropped (in the background for CL.ONE), so ReadResponseResolver : resolve() was never called.
