You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@cassandra.apache.org by "Daniel Doubleday (Created) (JIRA)" <ji...@apache.org> on 2012/02/06 18:17:59 UTC

[jira] [Created] (CASSANDRA-3862) RowCache misses Updates

RowCache misses Updates
-----------------------

Key: CASSANDRA-3862
URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.7
Reporter: Daniel Doubleday

While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.

During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.

This seems to happen:

- Reader tries to read row A for the first time doing a getTopLevelColumns
- Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
- Reader puts the row in the cache which is now missing the update

I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.

The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.

To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.

I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Daniel Doubleday (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202833#comment-13202833 ] 

Daniel Doubleday commented on CASSANDRA-3862:
---------------------------------------------

Just had a look at it and maybe I got it wrong but:

CFS.getRawCachedRow returns null for a sentinel and CFS.updateRowCache calls this.
Isn't updateRowCache supposed to add changes to the sentinel so that cacheRow can detect the race?
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214482#comment-13214482 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

I don't think v5 works. All sentinels are empty CF, so all sentinels will be equal (in SerializingCache.contentsEqual()). Which means we can have the following sequence of actions:
* a read r1 comes, the cache is empty, it sets sentinel s1 and start reading from disk
* a write w comes and invalidate s1.
* a read r2 comes, the cache is (now) empty, it sets sentinel s2 and start reading from disk
* r1 finish reading from disk having missed w. It'll do the replace, but since all sentinel are equals this will succeed (even though the current sentinel is the one of the second read) and we'll end up having missed w.

That's the reason of the sentinel IDs of v4.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206870#comment-13206870 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

Attaching v3. This mostly fix a but of the previous version where sentinels were not handled correctly in cacheRow(). I've also switch back to getRawCachedRow.
I'm not fully sure what you proposed to split exactly, but v3 does split cacheRow() in the hope of increasing clarity. 
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Resolved) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-3862.
-----------------------------------------

    Resolution: Fixed

Committed, thanks
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-cleanup.txt

Attached cleanup patch that applies on top of v3.  Most of the changes are adding docstrings/comments and cleaning up typos.

A minor change to the code was to make cacheRow take just cfId and filter, removing the redundant filter.key as a parameter.

I also renamed cacheRow to getThroughCache.  Still not 100% happy with that, but my goal is to make the distinction between readAndCache more obvious.

Finally, I've modified the logic in invalidateCachedRow according to the reasoning in this comment:

{noformat}
.       // This method is used to (1) drop obsolete entries from a copying cache after the row in question was updated
        // and to (2) make sure we're not wasting cache space on rows that don't exist anymore post-compaction.
        // Sentinels complicate this because it means we've caught a read thread in the process of loading
        // the cache, and we don't know (in case 2) if it will do so with rows from before the compaction or after,
        // so we need to loop until the load completes.
{noformat}

(I also negated the loop condition, which looked like an oversight.)
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216083#comment-13216083 ] 

Jonathan Ellis edited comment on CASSANDRA-3862 at 2/24/12 11:19 PM:
---------------------------------------------------------------------

v7 attached.

bq. Since we don't add stuffs to the sentinel, it has no reason to be a subclass of ColumnFamily

True, but when I tried this I ended up with a LOT of casting cache values to CF.  I think it might be the lesser of evils the way it is.

bq. the counter special case is not needed anymore

Updated.

bq. if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading

Makes sense, updated.

bq. Is it really an improvement to use UUIDs (over an AtomicLong)? 

I'd rather have the reduced contention on instantiation than the 8 bytes of space (during the sentinel lifetime -- this goes away once the sentinel is replaced by the data CF).
                
      was (Author: jbellis):
    v7 attached.

bq. Since we don't add stuffs to the sentinel, it has no reason to be a subclass of ColumnFamily

True, but when I tried this I ended up with a LOT of casting cache values to CF.  I think it might be the lesser of evils the way it is.

bq. the counter special case is not needed anymore

Updated.

bq. if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading

Makes sense, updated.

bq. Is it really an improvement to use UUIDs (over an AtomicLong)? 

I'd rather have the reduced contention on instantiation than the 4 bytes of space (during the sentinel lifetime -- this goes away once the sentinel is replaced by the data CF).
                  
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862_v8_addon.txt

v8 lgtm mostly except for the 3 remarks at the beginning of my previous comment. Added simple patch on top of v8 that does the proposed modifications.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-v6.txt

v6 pulls RCS out to a separate file and adds a uuid version and equals/hashcode methods.  SerializingCacheProvider uses a custom CF serializer that is RCS-aware.  SerializingCache.replace is simplified to use RCS.equals.  CAS loop is extended to non-serializing cache: since cache/write race is extremely rare, I'd rather take the occasional re-read penalty, than increase the overhead of every row in the cache by making them RCS objects permanently.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862_v8_addon.txt

Oups, you're right. Patch updated.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862_v3.patch
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-v5.txt

Attached v5 with a simpler approach: for serializing cache, getThroughCache does a classic CAS loop with a sentinel vs the write's invalidate.

v5 also adds a containsCachedRow method to CFS so that callers that don't care about the value don't force a deserialize in the serializing cache case.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Daniel Doubleday (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201610#comment-13201610 ] 

Daniel Doubleday commented on CASSANDRA-3862:
---------------------------------------------

Hmokay ... don't want to abuse Jira as an educational forum but maybe as a reward for the bugreport :-) ... are you saying that a reader could see a memtable view where flushing memtables are gone (flushed) and sstables don't contain the flushed memtables?

If that's the case than yes the cache would lose an update. But that what also imply that a read could miss an update without caching being in place at all no?

Otherwise (and that's how I read the code) given that the memtable switch will only happen after the merge the reader will read all updates because they are either in (flushing) memtables or in sstables and the cache will be in fact valid.


                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406783#comment-13406783 ] 

Hudson commented on CASSANDRA-3862:
-----------------------------------

Integrated in Cassandra #1646 (See [https://builds.apache.org/job/Cassandra/1646/])
    restore pre-CASSANDRA-3862 approach to removing expired tombstones during compaction (Revision fbb5ec0374e1a5f1b24680f1604b6e9201fb535f)
fix build - re-add CompactionController.removeDeletedInCache for commit fbb5ec0374e1a5f1b24680f1604b6e9201fb535f restore pre-CASSANDRA-3862 approach to removing expired tombstones during compaction (Revision 086c06ad7fb211de6be877c3c1ea2ee4f86c6d7e)

     Result = ABORTED
jbellis : 
Files : 
* src/java/org/apache/cassandra/db/compaction/CompactionIterable.java
* CHANGES.txt

dbrosius : 
Files : 
* src/java/org/apache/cassandra/db/compaction/CompactionController.java

                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Daniel Doubleday (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Doubleday updated CASSANDRA-3862:
----------------------------------------

    Attachment: include_memtables_in_rowcache_read.patch

Dunno if there's a better way to do it...
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862-v2.patch

You're right, thanks for catching that. Attached v2 fixed this (I realized that when we hit the cache during range_slice queries we don't update the statistics, which I'm not sure is what we want, but it's unrelated to that issue so haven't changed it).
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-v2.patch, 3862.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201436#comment-13201436 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

I believe you are absolutely right that this is a bug.

Unfortunately I don't think including the memtables during cache reads really solves it. If you miss an update, it won't ever get added to the cached row, but the update itself will be flushed at some point and thus not be in any memtable anymore.

One partial solution I see could be that when a read 'reads for caching', it starts adding some sentinel object in the cache for the given row key. That sentinel would need to be an actual (empty) row but marked with the fact it's only a sentinel. When a write look if the row is cache, if it's a sentinel we would add the write to the sentinel. Once the read returns and we actually put the row in cache, we would it (atomically) with the content of the sentinel. A read that check the cache and see a sentinel would just skip the cache (and would not put it's result into the cache). Adapting that to the serializingCache is trivial.

Unfortunately, this is not perfect because this would screw counters. Though I guess for counters we could do the same thing as we would do for the serializingCache, i.e, if a read that 'reads for caching' see that the sentinel is not empty, we would just not cache the result (i.e, a row would be cache only if we are sure no write were done concurrently to the read).
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862.patch

Patch attached with my "sentinel" idea (The patch is against 1.1 currently). I *think* this fixes the problem, and this deal with counters.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Daniel Doubleday (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201499#comment-13201499 ] 

Daniel Doubleday commented on CASSANDRA-3862:
---------------------------------------------

bq.  If you miss an update, it won't ever get added to the cached row, but the update itself will be flushed at some point and thus not be in any memtable anymore.

Very true ...

How about adopting the strategy we apply with CASSANDRA-2864:

- Writers dont update the cache at all
- Readers merge cache with memtables
- Upon flush merge memtables with cache
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

             Reviewer: jbellis
    Affects Version/s:     (was: 1.0.7)
                       0.6
        Fix Version/s: 1.1.0
             Assignee: Sylvain Lebresne
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217254#comment-13217254 ] 

Jonathan Ellis commented on CASSANDRA-3862:
-------------------------------------------

shouldn't {{if (data == null)}} in the finally block be {{if (sentinelSuccess && data == null)}} ?
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201519#comment-13201519 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

{quote}
How about adopting the strategy we apply with CASSANDRA-2864:

* Writers dont update the cache at all
* Readers merge cache with memtables
* Upon flush merge memtables with cache
{quote}
The problem with that is that I don't see how we can make that work for counters at all. I also think it would be nice not having to merge on reads if we can avoid it (even if it's in-memory, it still uses CPU).

As a side note, I also suspect it's not bulletproof in theory, as a memtable could be fully flushed while a 'read to be cached' happens and with a bad timing during that, we could still miss an update. Of course, that kind of timing have almost no chance to happen. But in the case where a user triggers a flush manually, a memtable with only a handful of columns could be flushed very quickly, and I suspect the behavior could be observed. However unlikely that is, it'd be better if we can fix this problem once and for all.

I'll probably give a shot to my 'sentinel' proposal described above, I don't think it's too much code.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-cleanup.txt
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment:     (was: 3862-v6.txt)
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215535#comment-13215535 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

Remarks on v6:
* Since we don't add stuffs to the sentinel, it has no reason to be a subclass of ColumnFamily. We should probably create a CachedRow class extended by both Sentinel (that would really just be an identifier, no metadata needed) and ColumnFamily and use that as cache values. It'll be cleaner and more importantly more type safe (a cache lookup won't be able to ignore by mistake that it could get a sentinel).
* Not adding stuffs to the sentinel also mean that in getThroughCache the counter special case is not needed anymore.
* In getThroughCache, if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading. Better let the next client read cache the data than getting a crappy latency on the current read.
* Is it really an improvement to use UUIDs (over an AtomicLong)? I have nothing against UUID per se but it takes twice the space (and we serialize them) and without having benchmarked it, I'm willing to bet are much faster to generate. And let's be honest, the risk of overflow with an AtomicLong is science-fiction (or to be precise, at 1 millions sentinels created per seconds (which is *way* more than we'll ever see), you'd need more than 100,000 year of uptime to overflow).

                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-v6.txt
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202145#comment-13202145 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

To be precise, what I'm saying is that (at least in theory) the following scenario would be possible:
* A read-for-cache read the memtables grabing updates
* then it start reading the sstables
* while the previous happens, a new update arrives. The memtable is then flushed and happens to be fully flushed *before* our read-for-cache completes.

In that case, the new update won't be part of the cached row (ever) because during the flush (when we would merge the memtable to the cache) the row was not in the cache yet. That may seem far fetched but consider a simple implementation of you proposition, where the 'upon flush merge memtables with cache' phase happens in the same loop over rows that is used for flushing. It is actually possible for a new write to be "flushed" within a few milliseconds of being received by the node: if the update triggers the memtable threshold *and* sorts at the very beginning of the memtable. But don't get me wrong, it would probably be possible to deal with that problem, but it feels a bit complicated and error prone.


                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205891#comment-13205891 ] 

Jonathan Ellis commented on CASSANDRA-3862:
-------------------------------------------

Looks to me like we might be able to simplify things by splitting the "initialize row cache" code (which can assume the cache is empty, and does not need a filter) out from the "look up a cached row and cache it if it is not present" method.

Nit: although not perfect, IMO "getRawCachedRow" is a better method name than "getCachedRowNoStats" -- the important thing to convey is that we're only inspecting the cache's contents, not changing them.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-v2.patch, 3862.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-v8.txt

v8 attached w/ long sentinel and IRowCacheEntry.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217258#comment-13217258 ] 

Jonathan Ellis commented on CASSANDRA-3862:
-------------------------------------------

+1
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862-v4.patch

Actually then handling of the copying patch by the preceding patches is wrong.  When a put arrives and there is a sentinel, the patch does not add the put to the sentinel correctly. But thinking about it, for the copying cache, we should avoid having writes check the current value in the cache, because that have a non-negligible performance impact. What we should do is let invalidate actually invalidate sentinels. The only problem we're faced with if we do that, is that when a read-for-caching returns, it must make sure his own sentinel hasn't been invalidated. And in particular it must be careful of the case where the sentinel has been invalidated and another read has set another sentinel.

Anyway, attaching a v4 (that include the comments cleanups) that choose that strategy instead (and thus is (hopefully) not buggy even in the copying cache case). Note that it means that reads must be able to identify sentinels uniquely (not based on the content), so the code assign a unique ID to sentinel and use that for comparison.

                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207637#comment-13207637 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

The cleanup lgtm.

For the change to invalidateCacheRow however, I wonder if it's worth it. By waiting when we found a sentinel, we may have writes waiting on a read to complete, which could involve a non negligible latency spike. On the other side, if we just leave the sentinel in that case, the only risk we take is that a read may end up putting tombstone in the cache that are already expired. But it doesn't seem like a big deal, especially given that it will very rarely happen.

But in any case, you're right about negating the loop condition. 
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment: 3862-7.txt

v7 attached.

bq. Since we don't add stuffs to the sentinel, it has no reason to be a subclass of ColumnFamily

True, but when I tried this I ended up with a LOT of casting cache values to CF.  I think it might be the lesser of evils the way it is.

bq. the counter special case is not needed anymore

Updated.

bq. if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading

Makes sense, updated.

bq. Is it really an improvement to use UUIDs (over an AtomicLong)? 

I'd rather have the reduced contention on instantiation than the 8 bytes of space (during the sentinel lifetime -- this goes away once the sentinel is replaced by the data CF).
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3862:
--------------------------------------

    Attachment:     (was: 3862-cleanup.txt)
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment:     (was: 3862_v8_addon.txt)
    
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862-v8.txt, 3862.patch, 3862_v3.patch, 3862_v8_addon.txt, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217127#comment-13217127 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

* In SerializingCache, remove misses a "not" in the while condition (this date back from one of my earlier patch). We don't need that new remove method anymore though so it's probably as simple to just remove it from the patch.
* In CFS.getThroughCache, the following line
{noformat}
boolean sentinelSuccess = !CacheService.instance.rowCache.putIfAbsent(key, sentinel);
{noformat}
should not be negated.
* Also in CFS.getThroughCache, we won't remove the sentinel if there is an exception during the read. It's not a big deal but it doesn't cost much to prevent it from happening.

bq. True, but when I tried this I ended up with a LOT of casting cache values to CF. I think it might be the lesser of evils the way it is.

In that case, I think I wouldn't mind too much casts and I would prefer getting the type safety of knowing that a method that take a ColumnFamily can't ever get a sentinel (and to make it explicit when you need to care about sentinel or not), but that's a bit subjective. There would also be some small wins like the fact we wouldn't need to save the cfId when serializing a sentinel.

bq. I'd rather have the reduced contention on instantiation than the 8 bytes of space

My point was that the UUID don't reduce contention. UUID.randomUUID() uses SecureRandom.nextBytes() that is synchronized (and thus likely entail a much bigger degradation in face of contention than an AtomicLong) and probably a bit CPU intensive. For reference, I did a quick micro-benchmark having 50 threads generating 10,000 ids simultaneously using both methods, using an AtomicLong is two orders of magnitude faster.

                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207713#comment-13207713 ] 

Jonathan Ellis commented on CASSANDRA-3862:
-------------------------------------------

bq. By waiting when we found a sentinel, we may have writes waiting on a read to complete, which could involve a non negligible latency spike

You're right, that's a worse negative than leaving tombstones in the cache.  I'm fine with changing it back if you update the comments accordingly. :)
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-3862) RowCache misses Updates

Posted by "Jonathan Ellis (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216083#comment-13216083 ] 

Jonathan Ellis edited comment on CASSANDRA-3862 at 2/24/12 11:19 PM:
---------------------------------------------------------------------

v7 attached.

bq. Since we don't add stuffs to the sentinel, it has no reason to be a subclass of ColumnFamily

True, but when I tried this I ended up with a LOT of casting cache values to CF.  I think it might be the lesser of evils the way it is.

bq. the counter special case is not needed anymore

Updated.

bq. if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading

Makes sense, updated.

bq. Is it really an improvement to use UUIDs (over an AtomicLong)? 

I'd rather have the reduced contention on instantiation than the 4 bytes of space (during the sentinel lifetime -- this goes away once the sentinel is replaced by the data CF).
                
      was (Author: jbellis):
    v7 attached.

bq. Since we don't add stuffs to the sentinel, it has no reason to be a subclass of ColumnFamily

True, but when I tried this I ended up with a LOT of casting cache values to CF.  I think it might be the lesser of evils the way it is.

bq. the counter special case is not needed anymore

Updated.

bq. if we fail to replace the sentinel, I think we should still better return the data rather than looping and re-reading

Makes sense, updated.

bq. Is it really an improvement to use UUIDs (over an AtomicLong)? 

I'd rather have the reduced contention on instantiation than the 8 bytes of space (during the sentinel lifetime -- this goes away once the sentinel is replaced by the data CF).
                  
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I guess I (re-)found one for the standard on-heap row cache.
> During my stress test I hava lots of threads running with some of them only reading other writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not eagerly read during write (because we want fast writes) so the writer cannot perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately didn't dig after I got no answer since I assumed that I just missed something. In a way I still do but haven't found any locking mechanism that makes sure that this should not happen.
> The problem can be reproduced with every run of my stress test. When I restart the server the expected column is there. It's just missing from the cache.
> To test I have created a patch that merges memtables with the row cache. With the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any relevant change their either so I assume the same aplies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira