Posted to commits@cassandra.apache.org by "Michael Kjellman (JIRA)" <ji...@apache.org> on 2012/10/20 03:08:11 UTC

[jira] [Created] (CASSANDRA-4841) Exception thrown during nodetool repair -pr

Michael Kjellman created CASSANDRA-4841:
-------------------------------------------

             Summary: Exception thrown during nodetool repair -pr
                 Key: CASSANDRA-4841
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4841
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.1.6
         Environment: Ubuntu 12.04 x64, 32GB RAM
            Reporter: Michael Kjellman
            Priority: Critical


During a nodetool repair -pr an exception was thrown.

root@scl-cas04:~# nodetool repair -pr
Exception in thread "main" java.rmi.UnmarshalException: Error unmarshaling return header; nested exception is:
        java.io.EOFException
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:227)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:160)
        at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
        at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1017)
        at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:305)
        at $Proxy0.forceTableRepairPrimaryRange(Unknown Source)
        at org.apache.cassandra.tools.NodeProbe.forceTableRepairPrimaryRange(NodeProbe.java:209)
        at org.apache.cassandra.tools.NodeCmd.optionalKSandCFs(NodeCmd.java:1044)
        at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:834)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:213)
        ... 9 more

Logs:

WARN 20:53:27,135 Heap is 0.9710037912028483 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

Regardless of configuration issues, a repair shouldn't crash a node.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4841) Exception thrown during nodetool repair -pr

Posted by "Michael Kjellman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481564#comment-13481564 ] 

Michael Kjellman commented on CASSANDRA-4841:
---------------------------------------------

JMX was dead, as were Thrift and gossip, so the node was effectively in a crashed state IMHO.
                


[jira] [Resolved] (CASSANDRA-4841) Exception thrown during nodetool repair -pr

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4841.
---------------------------------------

    Resolution: Invalid

Sounds like you were GC storming then.  Grepping the logs for GCInspector, or comparing top output against thread IDs, could confirm this.
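
As a rough illustration of the log check suggested above, the following Python sketch counts lines that mention GCInspector and collects the heap-usage fractions from "Heap is ... full" warnings like the one quoted in the report. The function name and the sample lines are hypothetical, and the GCInspector line is only matched by substring because its exact format is not shown in this thread.

```python
import re

# Matches the heap warning quoted in the report, e.g.
# "WARN 20:53:27,135 Heap is 0.9710037912028483 full. ..."
HEAP_WARNING = re.compile(r"Heap is (\d+(?:\.\d+)?) full")


def summarize_gc_pressure(lines):
    """Return (number of GCInspector lines, list of reported heap fractions)."""
    gc_lines = 0
    heap_fractions = []
    for line in lines:
        # Substring match only: the exact GCInspector format is an unknown here.
        if "GCInspector" in line:
            gc_lines += 1
        match = HEAP_WARNING.search(line)
        if match:
            heap_fractions.append(float(match.group(1)))
    return gc_lines, heap_fractions


if __name__ == "__main__":
    sample = [
        "WARN 20:53:27,135 Heap is 0.9710037912028483 full.  You may need to reduce memtable and/or cache sizes.",
        "INFO GCInspector (placeholder line, real format not shown in this thread)",
    ]
    gc_count, fractions = summarize_gc_pressure(sample)
    print("GCInspector lines:", gc_count)
    print("Heap fractions:", fractions)
```

Many GCInspector lines close together, alongside heap fractions near 1.0, would be consistent with the GC storm described here; a few isolated entries would not by themselves confirm it.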

Often this happens because sstable structures outgrew the space Cassandra reserves for them in the heap.  You can reduce these with the (global) index_interval and (per-CF) bloom_filter_fp_chance settings.  Unfortunately Cassandra is not yet able to self-tune this.
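
To give a feel for why those two settings matter, here is a back-of-the-envelope Python sketch. It uses the textbook Bloom filter sizing formula (bits per key = -ln(p) / (ln 2)^2) and assumes roughly one sampled index entry per index_interval keys. Both are simplifying assumptions rather than Cassandra's actual sizing logic, and the key count and setting values are hypothetical.

```python
import math


def bloom_bits_per_key(fp_chance):
    """Textbook bits per key for an optimally sized Bloom filter."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_filter_bytes(num_keys, fp_chance):
    """Approximate Bloom filter size in bytes for num_keys keys."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_chance) / 8)


def index_sample_entries(num_keys, index_interval):
    """Approximate number of sampled index entries (assumes one per interval)."""
    return math.ceil(num_keys / index_interval)


if __name__ == "__main__":
    keys = 1_000_000_000  # hypothetical number of row keys on one node
    for fp in (0.01, 0.1):
        mib = bloom_filter_bytes(keys, fp) / 2**20
        print(f"fp_chance={fp}: about {mib:.0f} MiB of Bloom filter")
    for interval in (128, 512):
        print(f"index_interval={interval}: about {index_sample_entries(keys, interval)} samples")
```

Under these assumptions, a higher bloom_filter_fp_chance and a larger index_interval both shrink the in-heap structures, at the cost of more disk reads per lookup.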
                


[jira] [Commented] (CASSANDRA-4841) Exception thrown during nodetool repair -pr

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481468#comment-13481468 ] 

Jonathan Ellis commented on CASSANDRA-4841:
-------------------------------------------

Doesn't this just mean "JMX got tired and timed out before the repair completed?"

Nothing here indicates that repair crashed a node.
                
