You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Pavel Yaskevich (JIRA)" <ji...@apache.org> on 2012/06/05 00:11:23 UTC

[jira] [Created] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Pavel Yaskevich created CASSANDRA-4305:
------------------------------------------

             Summary: CF serialization failure when working with custom secondary indices.
                 Key: CASSANDRA-4305
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.0.10
            Reporter: Pavel Yaskevich
            Assignee: Pavel Yaskevich
             Fix For: 1.0.11


Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).

{noformat}
ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
        at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
        at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
        at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
        at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
        at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at java.lang.Thread.run(Thread.java:662)
{noformat}

After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288971#comment-13288971 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

It's easier to say than to get :) Although, we would make that serialization anyway and I think better sooner than later because don't really need it.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: datastax_qa
>             Fix For: 1.0.11
>
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291708#comment-13291708 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

Ok, it is kind of pointless to argue about what can happen in the future but even from your examples it makes a lot of sense to guarantee RM integrity if we are to send it to yet another thread or require in CL, otherwise you very much risk persisting the corrupted data at some point (we don't have mechanism to reject modifications), because as the amount of processing in Table.apply grows it does so coherent with probability of unnoticed corruption e.g. when secondary index code would modify cf or columns by mistake racy with triggers/CL for example, which would lead to a very bad situation. Even if we are to somehow "optimize so that serialize the RM directly to the file (to avoid a copy)" we still need to convert it into writable form don't we? And thats were we would have to make hundred and five assertions just to notice that the calculated size matches the actual data size (like we do in FBUtilities.serialize()) because we would race with other components using the same mutation, e.g. we don't have a full control over indexing code anymore and even the corruption is not our mistake per se, we share a good part of guilt just because we let that happen due to the design decisions which in it's turn would make a negative impression overall.

bq. Furthermore, I have doubt that cloning the CF you're reusing before passing them to RM in your 2ndary index code will have a measurable impact on performance (though if you have numbers to show that it does make a noticeable difference, then it's a different discussion).

This is double standards, why do we try so hard not to make a one copy for serialization but instead require from secondary index to do a clone, of possibly, each CF and do that at the same stage of write path? I'm talking about cfs.indexManager.applyIndexUpdates() in Table.apply for example.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Comment Edited] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290082#comment-13290082 ] 

Pavel Yaskevich edited comment on CASSANDRA-4305 at 6/6/12 11:21 AM:
---------------------------------------------------------------------

bq. Granted, it's an implementation detail, but the more I think about it, the more I think it's one index implementers should be aware of and which makes sense if you think about it: if you mutate a RM post-apply, you're either going to duplicate part of the mutation as Sylvain says, or index data that didn't make it to the commitlog, either of which is Bad.

I disagree, after RM is send to the processing it's state should be persisted, there is no reason not to allow user to mutate his own objects e.g. when one want to populate CF with missing columns to create whole document (to store it into external component like Solr) with all indexed columns (old + new) so instead of copying everything (cf + columns + old columns), just missing columns could be fetched from the DB and added to the existing CF while secondary indexes are processed. Otherwise removing the need to make one copy in CL that would instead make multiple copies while secondary indices are processed.
                
      was (Author: xedin):
    bq. Granted, it's an implementation detail, but the more I think about it, the more I think it's one index implementers should be aware of and which makes sense if you think about it: if you mutate a RM post-apply, you're either going to duplicate part of the mutation as Sylvain says, or index data that didn't make it to the commitlog, either of which is Bad.

I disagree, after RM is send to the processing it's state should be persisted, there is no reason not to allow user to mutate his own objects e.g. when one want to populate CF with missing columns to create whole document with all indexed columns (old + new) so instead of copying everything (cf + columns + old columns), just missing columns could be fetched from the DB and added to the existing CF while secondary indexes are processed. Otherwise removing the need to make one copy in CL that would instead make multiple copies while secondary indices are processed.
                  
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290236#comment-13290236 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

bq. But if there is a window where thing can go wrong, you actually want it to be as large as possible. The goal is to catch bug as quickly as possible and fix them, not simply make sure they happen less often.

Concurrent modification could happen only when we have already submitted RM to CL and go on with secondary index processing down in Table.apply method, so if we serialize RM upfront before adding it to the CL queue it would guarantee that concurrent modifications wouldn't affect the state of CF objects that we actually wanted to persist in CL when we generated RM. Even if there would be other threads modifying internal CF objects, which I doubt because the CL is only source of concurrency there, everything would fail before submitting RM for further processing which is a good thing in this situation.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290142#comment-13290142 ] 

Jonathan Ellis commented on CASSANDRA-4305:
-------------------------------------------

bq. there is no reason not to allow user to mutate his own objects 

Sure there is.  Think of RM as a "builder" api, except we avoid actually copying to a new structure via build() for performance.  "Once you pass it to another thread, don't modify it" is a totally reasonable rule.

bq. we would make that serialization anyway 

Hmm, you're right, we currently do the extra copy anyway -- I guess to simplify the checksumming.  But my objection stands in principle; I don't want to rule out making that optimization in the future.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288968#comment-13288968 ] 

Jonathan Ellis commented on CASSANDRA-4305:
-------------------------------------------

Hmm, I'll need to think this through a bit...  This adds an extra copy to the CL path (to the byte[]), which we've been trying to avoid.

My first inclination is to say, "stop modifying RM after sending them to the CL and everything will work fine."
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: datastax_qa
>             Fix For: 1.0.11
>
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Comment Edited] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288971#comment-13288971 ] 

Pavel Yaskevich edited comment on CASSANDRA-4305 at 6/4/12 10:43 PM:
---------------------------------------------------------------------

It's easier to say than to get :) Although, we would make that serialization anyway and I think better sooner than later because don't really need to hold RowMutation all the time before it really gets executed.
                
      was (Author: xedin):
    It's easier to say than to get :) Although, we would make that serialization anyway and I think better sooner than later because don't really need it.
                  
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: datastax_qa
>             Fix For: 1.0.11
>
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290192#comment-13290192 ] 

Jonathan Ellis commented on CASSANDRA-4305:
-------------------------------------------

(Which is why, the handful of other times this has come up, we've always changed the offending code to not modify the RM concurrently.)
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290155#comment-13290155 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

bq. Sure there is. Think of RM as a "builder" api, except we avoid actually copying to a new structure via build() for performance. "Once you pass it to another thread, don't modify it" is a totally reasonable rule.

The flow in design that you describe is that you take RM as monolite structure but if you look at it from the angle of just being a container to be able persist the data inside it, than it becomes reasonable that user has all rights to modify objects that were given to that container after it was send to processing (probably to another thread) because RM should be responsible of keeping the state of internal structure constant, this applies clearly on indexes because if indexer wants to persist the state of the objects at some point in time and then go on processing them further (e.g. to store to some external component or return to the user) it shouldn't be forced to clone everything just because we don't preserve the state of RM (to avoid a single serialization).
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290205#comment-13290205 ] 

Sylvain Lebresne commented on CASSANDRA-4305:
---------------------------------------------

But if you modify the object concurrently, whenever the serialization is done, you may end up modifying the object during serialization which would trigger a bug anyway. Pushing the serialize in Table.apply doesn't make modifying the CF contained by the RM thread safe, it just shorten the window in which you can have a bug.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290177#comment-13290177 ] 

Jonathan Ellis commented on CASSANDRA-4305:
-------------------------------------------

bq. you look at it from the angle of just being a container to be able persist the data inside it, than it becomes reasonable that user has all rights to modify objects that were given to that container after it was send to processing 

It's deeply silly to make arguments based on abstract conceptions of "users."  This is an internal API, with a very narrow use case.  As I said initially, it's neither practical nor desirable to try to allow index implementers to remain ignorant of all Cassandra internals.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290184#comment-13290184 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

bq. It's deeply silly to make arguments based on abstract conceptions of "users." This is an internal API, with a very narrow use case. As I said initially, it's neither practical nor desirable to try to allow index implementers to remain ignorant of all Cassandra internals.

I don't make arguments on the conception of "users" indeed I make them on the particular design you have described in your previous comment. The silly is to advocate bug as implementation detail which could lead to hardly reproducible assertion errors in different parts of the system. I'm not trying to make RM reusable, nothing of that sort, the problem in RM is in fact that it should properly preserve the state of internal objects once it's created but it doesn't do that. Nobody was trying to re-use or re-apply RM in index code but instead was changing (adding columns) to some of the ColumnFamily objects that is passed by-reference to RM. That "changing" was done separately to be able to properly populate external structure (in that case Solr Document) without coping the whole contents of CF and *none* of those changes was intended to be persisted (included into RM) which lead to this concurrency bug from description where CL was trying to serialize the RM while indexer was concurrently modifying CF objects it previously passed to RM.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290082#comment-13290082 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

bq. Granted, it's an implementation detail, but the more I think about it, the more I think it's one index implementers should be aware of and which makes sense if you think about it: if you mutate a RM post-apply, you're either going to duplicate part of the mutation as Sylvain says, or index data that didn't make it to the commitlog, either of which is Bad.

I disagree, after RM is send to the processing it's state should be persisted, there is no reason not to allow user to mutate his own objects e.g. when one want to populate CF with missing columns to create whole document with all indexed columns (old + new) so instead of copying everything (cf + columns + old columns), just missing columns could be fetched from the DB and added to the existing CF while secondary indexes are processed. Otherwise removing the need to make one copy in CL that would instead make multiple copies while secondary indices are processed.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290199#comment-13290199 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

And that is yet another clue that it should be a immutable "container" from my previous comment and as we don't really want to make an object immutable itself (would require copying internals) the other way is to serialize it on Table.apply to preserve the state otherwise such errors would come up repeatedly.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290210#comment-13290210 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

Sorry I wasn't very clear about that but it seems to be lesser evil if we serialize it in Table.apply() before queuing for CL or in RowMutation.apply{Unsafe}() to pass it to Table.apply, the window for things to go wrong would be very tight comparing to failures deep in CL stage which would stall stage processing as we have seen it happening after assertion error.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290173#comment-13290173 ] 

Sylvain Lebresne commented on CASSANDRA-4305:
---------------------------------------------

The fact is, guaranteeing that one can safely reuse a RM after apply() potentially limits us on what optimization we can do internally (i.e. we, internally, may have to do copy to offer the guarantee).

Besides, there is also the preserializedBuffers business inside RM that also make RM object non-reusable. Applying an already applied mutation is a bug, because the commit log will use the old version of the preserialized buffer. 

I also don't think that cloning a RM in the index code will be hugely expensive (but I do agree that RM ought to have a cloneMe() method to make that simpler).
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289206#comment-13289206 ] 

Sylvain Lebresne commented on CASSANDRA-4305:
---------------------------------------------

What's the big difficulty with not modifying RM objects once you've called apply on them? And I really think RM object shouldn't be reused, doing so feels like a bug because it means you will likely re-apply the same mutations multiple time.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: datastax_qa
>             Fix For: 1.0.11
>
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291615#comment-13291615 ] 

Sylvain Lebresne commented on CASSANDRA-4305:
---------------------------------------------

No, the main argument is that we are reluctant to make the guarantee that RM object and their CF objects are safe to be reused. And the reason for that is that maybe tomorrow we will add something to the commit log code that will require having more than just the serialized mutation, or maybe, as Jonathan said, we will be able to optimize so that serialize the RM directly to the file (to avoid a copy), or maybe add some other code to Table.apply() (something related to triggers for instance) that will want to pass the RM object to a separate thread.

The point is, assuming that a constucted RM (it's constructed as soon as you call apply basically) is immutable makes it more easy to reason with in a concurrent context. And the fact that we don't really need to use RM object concurrently today don't mean we won't do it tomorrow, so saying that it is OK to mutate them will likely make our life harder at some point. And if we don't guarantee that it's ok to mutate a constructed RM, then what is the point of making it ok today, since we may break it tomorrow?

Furthermore, I have doubt that cloning the CF you're reusing before passing them to RM in your 2ndary index code will have a measurable impact on performance (though if you have numbers to show that it does make a noticeable difference, then it's a different discussion).

                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290227#comment-13290227 ] 

Sylvain Lebresne commented on CASSANDRA-4305:
---------------------------------------------

bq.  the window for things to go wrong would be very tight comparing to failures deep in CL stage

But if there is a window where thing can go wrong, you actually want it to be as large as possible. The goal is to catch bug as quickly as possible and fix them, not simply make sure they happen less often.

So either we can make RM objects thread safe (which would require to freeze them somehow, but I think we agree would be too costly, much more costly than doing a simple clone of the RM and it's CF) or we should clearly say that user "shall not modify RM and the CF it contains concurrently at all" (and in that case, we certainly don't want to make bugs harder to find).
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290183#comment-13290183 ] 

Jonathan Ellis commented on CASSANDRA-4305:
-------------------------------------------

bq. I'll add a javadoc comment to Table.apply to this effect.

Done in 39aae3d8309e8cc4ba46feb24711f65796dbd88e
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290329#comment-13290329 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

I was talking about case number #1 and by using "concurrently" in relation to indexer i meant "concurretly with CL" which makes #1 to look more like #2 with only difference that RM is send to another thread instead of CF. So by serializing RM and sending serialized to CL queue we would be safe using it in both places. 
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290187#comment-13290187 ] 

Jonathan Ellis commented on CASSANDRA-4305:
-------------------------------------------

bq. indexer was concurrently modifying CF objects it previously passed to RM

Well, that's the whole crux of the argument, isn't it?

My contention is that "don't do that" is a reasonable API decision for us to make.  Trying to provide the kind of thread-safety you want imposes constraints on what we can do with the CommitLog internals that I'm not willing to live with.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290249#comment-13290249 ] 

Sylvain Lebresne commented on CASSANDRA-4305:
---------------------------------------------

Wait, there is two case:
# Either what you want to do in your 2ndary code is
{noformat}
ColumnFamily cf = ...
RowMutation rm = ...
rm.add(cf);
rm.apply();
cf.add(...);
{noformat}
and in that case, yes, moving the serialization to Table.apply() would allow that. But that's the mono-threaded case, there is no concurrent sharing (from your point of view I mean). 
# or what you want to do is concurrent sharing, something like:
{noformat}
ColumnFamily cf = ...
pass cf to some other thread t2 that will modify cf at some point
RowMutation rm = ...
rm.add(cf);
rm.apply();
{noformat}
and in that case, moving the serialization to Table.apply() does not help, because maybe t2 will modify cf while apply() is serializing the mutation in the initial thread.

Now maybe you're talking of the first case, but sentences like "indexer was concurrently modifying CF objects it previously passed to RM" made it sound like you're talking of case 2, so let's be clear on what we're talking about.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-4305:
---------------------------------------

    Attachment: CASSANDRA-4305.patch
    
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: datastax_qa
>             Fix For: 1.0.11
>
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291431#comment-13291431 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

Also if the main argument against this being that we don't want to use by-value CommitLogSegment.write(byte[]) than we can wrap row serialized mutation into ByteBuffer in LogRecordAdder and pass it to write method by-reference, we have to do serialization anyway so I see no reason to hold RM object around all the time for CL as we really want only a serialized value there.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Pavel Yaskevich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289331#comment-13289331 ] 

Pavel Yaskevich commented on CASSANDRA-4305:
--------------------------------------------

The difficulty is that RM object contains CF internal objects which are mutable and as we have secondary index API exposed to the users it is easy to shoot yourself in the foot but modifying one of the CF objects somewhere deep inside of custom secondary index class for example (like Solr per-row index did in DSE, I have figured), which would lead to unpredictable behavior and strange errors because of concurrent CL execution method. And as there is no way to enforce immutability inside of row mutation that would be a bad practice to punish users on the implementation detail they shouldn't be aware of when using secondary index API.
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: datastax_qa
>             Fix For: 1.0.11
>
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.

Posted by "Mark Valdez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290489#comment-13290489 ] 

Mark Valdez commented on CASSANDRA-4305:
----------------------------------------

I am writing to inform you that I tested the SolrTest.jar on Cassandra's default(1.0.8) of DSE 2.0-1 and encountered the same errors with failed run. However after applying the patch on Cassandra 1.0.9 and then running SolrTest.jar on DSE 2.0-1 I received NO errors and the test ran for an indefinite period. Good job!
                
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> Assertion (below) was triggered when client was adding new rows to Solr-backed secondary indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it was clear that it was happening because we were holding instances of RowMutation queued to the addition to CommitLog to the actual "write" moment which is redundant.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira