Posted to commits@cassandra.apache.org by "Matthew F. Dennis (JIRA)" <ji...@apache.org> on 2011/04/05 23:27:06 UTC

[jira] [Created] (CASSANDRA-2420) row cache / streaming aren't aware of each other

row cache / streaming aren't aware of each other
------------------------------------------------

                 Key: CASSANDRA-2420
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2420
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.7.4
            Reporter: Matthew F. Dennis


SSTableWriter.Builder.build() takes tables that resulted from streaming, repair, bootstrapping, et cetera and builds the indexes and bloom filters before "adding" them so the current node is aware of them.

However, if the row cache holds data for a row that is also present in the streamed table, the cached row can overshadow the data in the newly built table.  In other words, until the row is removed from the row cache (e.g. because it's evicted for size, the node is restarted, or the cache is manually cleared), the data in the newly built table will never be returned to clients.

The solution that seems most reasonable at this point is to have SSTableWriter.Builder.build() (or something below it) update the row cache if the row key in the table being built is also present in the cache.
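
The problem described above can be illustrated with a toy model. This is only a sketch: the class, the two maps standing in for disk and row cache, and the method names are all hypothetical and are not Cassandra's real classes; streamed data simply overwrites the "disk" copy here, whereas a real node would merge it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these names are Cassandra's real classes.
class RowCacheStalenessSketch {
    // Stand-in for on-disk data: row key -> row value.
    static final Map<String, String> disk = new HashMap<>();
    // Stand-in for the row cache.
    static final Map<String, String> rowCache = new HashMap<>();

    // Read path: a cache hit short-circuits the read from disk.
    static String read(String key) {
        String cached = rowCache.get(key);
        if (cached != null) {
            return cached;
        }
        String fromDisk = disk.get(key);
        if (fromDisk != null) {
            rowCache.put(key, fromDisk);
        }
        return fromDisk;
    }

    // Adds a streamed table without touching the cache (the reported behaviour).
    static void addStreamedUnaware(Map<String, String> streamed) {
        disk.putAll(streamed);
    }

    // Adds a streamed table and refreshes only the keys that are already cached
    // (the behaviour proposed in the report).
    static void addStreamedCacheAware(Map<String, String> streamed) {
        disk.putAll(streamed);
        for (Map.Entry<String, String> e : streamed.entrySet()) {
            rowCache.computeIfPresent(e.getKey(), (k, old) -> e.getValue());
        }
    }

    public static void main(String[] args) {
        disk.put("k1", "old");
        System.out.println(read("k1"));   // prints "old" and caches it
        addStreamedUnaware(Collections.singletonMap("k1", "new"));
        System.out.println(read("k1"));   // prints "old": the cache hides the streamed row
        addStreamedCacheAware(Collections.singletonMap("k1", "newer"));
        System.out.println(read("k1"));   // prints "newer": the cached entry was refreshed
    }
}
```

Keys that are not already cached are left alone in the cache-aware variant, so streaming does not pull cold rows into the cache.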


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2420:
----------------------------------------

    Attachment: 0001-Handle-the-row-cache-for-streamed-row-v2.patch

> row cache / streaming aren't aware of each other
> ------------------------------------------------
>
>                 Key: CASSANDRA-2420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2420
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6
>            Reporter: Matthew F. Dennis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: 0001-Handle-the-row-cache-for-streamed-row-v2.patch, 0001-Handle-the-row-cache-for-streamed-row.patch
>
>

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020020#comment-13020020 ] 

Jonathan Ellis commented on CASSANDRA-2420:
-------------------------------------------

nit: s/higly/highly/ in the logged warning

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021023#comment-13021023 ] 

Sylvain Lebresne commented on CASSANDRA-2420:
---------------------------------------------

Committed to 0.8 and trunk.
What should we do about 0.7? I realized that we do not differentiate between the different reasons for streaming in 0.7, so the simplest way to deal with this would probably be to just blindly invalidate the cache. Sounds reasonable?
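
As a rough illustration of what "blindly invalidate" could look like, here is a minimal sketch. The class and method names are hypothetical and the cache is modelled as a plain map; this is not the actual 0.7 patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the row cache is modelled as a map from row key to cached row.
class BlindInvalidationSketch {
    // Drops every indexed key from the cache, whatever the reason for streaming,
    // and returns how many cached rows were actually dropped.
    static int invalidateAll(Map<String, String> rowCache, List<String> indexedKeys) {
        int dropped = 0;
        for (String key : indexedKeys) {
            if (rowCache.remove(key) != null) {
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        Map<String, String> rowCache = new ConcurrentHashMap<>();
        rowCache.put("a", "cached-a");
        rowCache.put("b", "cached-b");
        // Keys "a" and "c" were found while indexing the streamed table.
        int dropped = invalidateAll(rowCache, Arrays.asList("a", "c"));
        System.out.println(dropped + " dropped, remaining " + rowCache.keySet());
    }
}
```

The cost of this approach is one extra disk read the next time an invalidated row is requested; correctness does not depend on knowing why the table was streamed.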

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020983#comment-13020983 ] 

Hudson commented on CASSANDRA-2420:
-----------------------------------

Integrated in Cassandra-0.8 #13 (See [https://hudson.apache.org/hudson/job/Cassandra-0.8/13/])
    Update row cache post streaming
patch by slebresne; reviewed by jbellis for CASSANDRA-2420


[jira] [Updated] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2420:
----------------------------------------

    Attachment: 0001-Handle-the-row-cache-for-streamed-row.patch

There is a very simple patch for this issue. It consists of invalidating the cache for each key we index. The downside is that this will invalidate every key that gets repaired, but updating the cache (instead of invalidating) implies reading from disk, so doing this during the indexing or at the next read may not matter much. In any case, this is better than the current situation.

I have however attached a patch (against trunk for now) that 'does the right thing' and will update the cache in the case of repair instead of invalidating. I mention the first solution in case we consider the 'right one' too disruptive for 0.7, for instance (not that the patch is very complicated).

Note that the patch fixes a tiny unrelated issue: the writeStat is not updated during a write if the cache in use has 'isPutCopying' (this could be fixed separately).


[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021059#comment-13021059 ] 

Hudson commented on CASSANDRA-2420:
-----------------------------------

Integrated in Cassandra-0.7 #437 (See [https://hudson.apache.org/hudson/job/Cassandra-0.7/437/])
    Invalidate cache for streamed rows
patch by slebresne; reviewed by jbellis for CASSANDRA-2420


> row cache / streaming aren't aware of each other
> ------------------------------------------------
>
>                 Key: CASSANDRA-2420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2420
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6
>            Reporter: Matthew F. Dennis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 0.7.5, 0.8
>
>         Attachments: 0001-Handle-the-row-cache-for-streamed-row-v2.patch, 0001-Handle-the-row-cache-for-streamed-row.patch, 2420-for-0.7.patch
>
>

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021024#comment-13021024 ] 

Jonathan Ellis commented on CASSANDRA-2420:
-------------------------------------------

Yes.

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018560#comment-13018560 ] 

Jonathan Ellis commented on CASSANDRA-2420:
-------------------------------------------

I would be more comfortable having LCR throw UnsupportedOperation if asked for full row, since You Shouldn't Do That.

Would prefer the updateCache switch to be "case AES: ... default: invalidate and break"; it's more obvious looking at it what the point is, and "unnecessary" invalidate calls will be harmless.
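
The switch shape suggested above might look roughly like the sketch below. The enum values other than AES, the method signature, and the map-based cache are assumptions made for illustration, not the code in the attached patch; a real implementation would also need to merge the streamed row with the cached one rather than replace it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "case AES: update; default: invalidate" shape.
class UpdateCacheSwitchSketch {
    // AES is the repair case named in the comment; the other values are placeholders.
    enum OperationType { AES, BOOTSTRAP, OTHER }

    static void updateCache(Map<String, String> rowCache, OperationType type,
                            String key, String streamedRow) {
        switch (type) {
            case AES:
                // Repair: refresh the entry, but only if the key is already cached.
                rowCache.computeIfPresent(key, (k, old) -> streamedRow);
                break;
            default:
                // Any other reason: invalidate. Removing an absent key is harmless.
                rowCache.remove(key);
                break;
        }
    }

    public static void main(String[] args) {
        Map<String, String> rowCache = new HashMap<>();
        rowCache.put("repaired", "old");
        rowCache.put("bootstrapped", "old");
        updateCache(rowCache, OperationType.AES, "repaired", "new");
        updateCache(rowCache, OperationType.BOOTSTRAP, "bootstrapped", "new");
        System.out.println(rowCache);   // prints {repaired=new}
    }
}
```

Putting invalidation in the default branch means any operation type added later gets the safe behaviour without further changes.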

[jira] [Updated] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2420:
----------------------------------------

    Attachment: 2420-for-0.7.patch

Attaching a simple patch targeting 0.7. I put it up for review individually because it's different enough from the previous patch (but it's a one-liner, so it shouldn't take too long to review anyway).

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020018#comment-13020018 ] 

Jonathan Ellis commented on CASSANDRA-2420:
-------------------------------------------

+1

[jira] [Assigned] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne reassigned CASSANDRA-2420:
-------------------------------------------

    Assignee: Sylvain Lebresne

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019400#comment-13019400 ] 

Sylvain Lebresne commented on CASSANDRA-2420:
---------------------------------------------

> I would be more comfortable having LCR throw UnsupportedOperation if asked for full row, since You Shouldn't Do That.

The updated patch defines getFullColumnFamily() only for AbstractCompactedRow. However, I think it would be a bad idea to fail in the Builder, so the Builder now simply invalidates the cache if it is facing a big row (one that does not fit in memory) and logs a warning, since if that happens "you're doing it wrong".

I've also changed the switch case in updateCache.
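
The fallback described above (invalidate and warn instead of failing when a cached row is too big to hold in memory) could be sketched as follows. The size threshold, the method signature, the warning text, and the map-based cache are all assumptions for illustration and do not come from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the code from the attached patch.
class BigRowFallbackSketch {
    enum Outcome { UNTOUCHED, UPDATED, INVALIDATED }

    // Assumed threshold above which a row is treated as too big to hold in memory.
    static final long IN_MEMORY_LIMIT_BYTES = 1024;

    static Outcome onRowIndexed(Map<String, String> rowCache, String key,
                                String rowContent, long rowSizeBytes) {
        if (!rowCache.containsKey(key)) {
            return Outcome.UNTOUCHED;   // nothing cached, so nothing can be stale
        }
        if (rowSizeBytes > IN_MEMORY_LIMIT_BYTES) {
            // Too big to materialize: do not fail the build, drop the entry and warn.
            rowCache.remove(key);
            System.err.println("WARN: cached row " + key
                    + " is too large to update in place; invalidated instead");
            return Outcome.INVALIDATED;
        }
        rowCache.put(key, rowContent);
        return Outcome.UPDATED;
    }

    public static void main(String[] args) {
        Map<String, String> rowCache = new HashMap<>();
        rowCache.put("small", "old");
        rowCache.put("huge", "old");
        System.out.println(onRowIndexed(rowCache, "small", "new", 10));
        System.out.println(onRowIndexed(rowCache, "huge", null, 5000));
        System.out.println(onRowIndexed(rowCache, "absent", "new", 10));
    }
}
```

Either branch leaves the cache consistent with the newly built table; the warning only signals that caching such large rows is probably a configuration mistake.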

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020977#comment-13020977 ] 

Hudson commented on CASSANDRA-2420:
-----------------------------------

Integrated in Cassandra #854 (See [https://hudson.apache.org/hudson/job/Cassandra/854/])
    Merge CASSANDRA-2420 from 0.8


[jira] [Updated] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2420:
--------------------------------------

             Priority: Minor  (was: Major)
    Affects Version/s:     (was: 0.7.4)
                       0.6
        Fix Version/s: 0.7.5

[jira] [Commented] (CASSANDRA-2420) row cache / streaming aren't aware of each other

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021049#comment-13021049 ] 

Jonathan Ellis commented on CASSANDRA-2420:
-------------------------------------------

+1
