Posted to commits@cassandra.apache.org by "Jeff Lerman (JIRA)" <ji...@apache.org> on 2010/06/24 17:59:49 UTC

[jira] Created: (CASSANDRA-1224) Cassandra NPE on insert after one node goes down.

Cassandra NPE on insert after one node goes down.
-------------------------------------------------

Key: CASSANDRA-1224
URL: https://issues.apache.org/jira/browse/CASSANDRA-1224
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.6.1
Environment: Gentoo Linux
Reporter: Jeff Lerman
Priority: Minor

Hi all,

I posted this in a different thread and was instructed to create a new bug. As far as I can tell it is not too major an issue, as it may have been caused by us prematurely taking down a node.

I just had this happen in Cassandra 0.6.1. We're only running two nodes as of now and our second one was barely accepting any requests and only being replicated to for the most part. The load went up to 9 consistently so we investigated and noticed its "Load" on nodetool was 2x as large as our other instance. I went and cleared out the data and commitlogs, set autobootstrap to true and put it back in.

This is where our case gets funky... we noticed the other instance's load going up a lot and saw that the one I had just re-added was not doing much. After a while of contemplating, I took down the second one again. Minutes later I found an open case about the anticompaction happening before full bootstrapping occurs. I found the data/stream dir on the working instance and saw that it was complete... but I had already taken down the second one! So I deleted the stream dir to save space and figured I'd start the process again tomorrow.

A few hours later I am getting these Internal errors on writes:

ERROR [pool-1-thread-287117] 2010-06-23 19:16:51,754 Cassandra.java (line 1492) Internal error processing insert
java.lang.NullPointerException

That was the entire trace. We tried to kill -3 Cassandra...waited hours and it never killed. Did a kill -6 but got no usable dump. Perhaps it is possible for someone to recreate this situation?

I also noticed that Cassandra's virtual memory usage grew by the extra 10+GB for the stream file. It never released this either, which is bad.

Thanks,

Jeff

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-1224) Cassandra NPE on insert after one node goes down.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882205#action_12882205 ] 

Jonathan Ellis commented on CASSANDRA-1224:
-------------------------------------------

Apparently that is from

          LOGGER.error("Internal error processing insert", th);

in the generated Thrift code.

At a loss as to how that would not log the entire stack.
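
One possibility, not confirmed anywhere in this thread: a logger can only print the frames the Throwable actually carries. The HotSpot JVM has an optimization (on by default, disabled with -XX:-OmitStackTraceInFastThrow) where an implicit exception such as a NullPointerException that is thrown repeatedly from the same compiled code may be replaced by a preallocated instance with no stack trace. The sketch below does not reproduce that optimization; it only simulates a frameless exception with setStackTrace to show what the rendered output looks like. The class name EmptyTraceDemo and the render helper are made up for illustration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class EmptyTraceDemo {
    // Render a throwable roughly the way a logger does: through printStackTrace.
    public static String render(Throwable th) {
        StringWriter out = new StringWriter();
        th.printStackTrace(new PrintWriter(out, true));
        return out.toString().trim();
    }

    public static void main(String[] args) {
        NullPointerException npe = new NullPointerException();
        // Simulate an exception that carries no frames (assumption: this stands in
        // for whatever stripped the trace in the reported case).
        npe.setStackTrace(new StackTraceElement[0]);
        // prints "java.lang.NullPointerException" and nothing else
        System.out.println(render(npe));
    }
}
```

If that is what happened here, restarting with -XX:-OmitStackTraceInFastThrow in the JVM options should make the full trace appear the next time the error occurs.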

[jira] Commented: (CASSANDRA-1224) Cassandra NPE on insert after one node goes down.

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882196#action_12882196 ] 

Jonathan Ellis commented on CASSANDRA-1224:
-------------------------------------------

Please provide the full stack trace.

[jira] Commented: (CASSANDRA-1224) Cassandra NPE on insert after one node goes down.

Posted by "Jeff Lerman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882201#action_12882201 ] 

Jeff Lerman commented on CASSANDRA-1224:
----------------------------------------

Unfortunately that was all that was printed out.  We jacked up the debug level to no avail.  I'm not sure that we want to try to reproduce this ourselves as we'd really like to get Cassandra back up and running.  FYI -- a restart of Cassandra fixed the issue so it definitely appears to be some sort of thread lock.
