Posted to dev@calcite.apache.org by Julian Hyde <jh...@apache.org> on 2016/02/23 19:50:32 UTC

Cassandra dying in VM

I'm using the latest https://github.com/vlsi/calcite-test-dataset. It
seems that Cassandra runs for a day or so (I can run Calcite's full
suite, including integration tests), then Cassandra dies (the 4
CassandraAdapterIT tests fail). When I log into the VM, Cassandra is
not available:

$ cd calcite-test-dataset/vm/
$ vagrant ssh
Welcome to Ubuntu 14.04.2 LTS (GNU/Linux 3.16.0-30-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
Last login: Fri Feb 19 12:33:59 2016 from 10.0.2.2
vagrant$ cqlsh -k twissandra `hostname -I`
Connection error: ('Unable to connect to any servers', {'10.0.2.15':
error(111, "Tried connecting to [('10.0.2.15', 9042)]. Last error:
Connection refused")})

Restarting the service doesn't help:

vagrant$ sudo service cassandra restart
 * Restarting Cassandra cassandra
             [OK]
start-stop-daemon: warning: failed to kill 11109: No such process
vagrant$ sudo service cassandra restart
 * Restarting Cassandra cassandra
             [ OK ]
vagrant$ cqlsh -k twissandra `hostname -I`
Connection error: ('Unable to connect to any servers', {'10.0.2.15':
error(111, "Tried connecting to [('10.0.2.15', 9042)]. Last error:
Connection refused")})

Restarting the VM solves the problem.

Is anyone else seeing this? Any ideas how to fix Cassandra so that it stays up?

Julian

Re: Cassandra dying in VM

Posted by Julian Hyde <jh...@apache.org>.
Agreed. Good enough.

On Tue, Feb 23, 2016 at 12:23 PM, Michael Mior <mi...@gmail.com> wrote:
> Assuming you have these automated, you could add in a restart of the VM.
> Obviously not ideal, but it seems that solves the problem for now.

Re: Cassandra dying in VM

Posted by Michael Mior <mi...@gmail.com>.
Assuming you have these automated, you could add in a restart of the VM.
Obviously not ideal, but it seems that solves the problem for now.
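
A minimal sketch of how such an automated restart might look, assuming bash and coreutils `timeout` on the host, and assuming the Cassandra native port 9042 is reachable from the host under some address (the port forwarding of this VM is not described in the thread). The names `port_open` and `ensure_cassandra` are hypothetical helpers, not part of calcite-test-dataset.

```shell
# Return 0 if a TCP connection to host $1, port $2 succeeds within 5 seconds.
# Uses bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_open() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Assumption: called from calcite-test-dataset/vm on the host, before the
# nightly tests. Restarts the whole VM only when port 9042 does not answer.
ensure_cassandra() {
  if ! port_open "$1" 9042; then
    vagrant reload
  fi
}
```

A nightly job could call `ensure_cassandra <address>` before the integration tests, where the address is whatever the host uses to reach Cassandra in the VM.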
On Feb 23, 2016 14:24, "Julian Hyde" <jh...@apache.org> wrote:

> Ok, let's monitor this situation. It's not fatal, because we can still
> run the test suite. But it is inconvenient because my nightly tests
> will generate some false negatives.

Re: Cassandra dying in VM

Posted by Julian Hyde <jh...@apache.org>.
Ok, let's monitor this situation. It's not fatal, because we can still
run the test suite. But it is inconvenient because my nightly tests
will generate some false negatives.

On Tue, Feb 23, 2016 at 11:14 AM, Michael Mior <mm...@uwaterloo.ca> wrote:
> Unfortunately I think I may have been bitten by this as well. I can't be
> sure, but I believe I left the Cassandra instance in the VM running, and when
> I went to use it a couple of days later it had died. There were also no signs
> of any issue in the logs. Fortunately, restarting the daemon seems to have
> gotten things going again without any issue. (I didn't have to restart the VM.)
>
> --
> Michael Mior
> mmior@uwaterloo.ca

Re: Cassandra dying in VM

Posted by Michael Mior <mm...@uwaterloo.ca>.
Unfortunately I think I may have been bitten by this as well. I can't be
sure, but I believe I left the Cassandra instance in the VM running, and when
I went to use it a couple of days later it had died. There were also no signs
of any issue in the logs. Fortunately, restarting the daemon seems to have
gotten things going again without any issue. (I didn't have to restart the VM.)

--
Michael Mior
mmior@uwaterloo.ca

2016-02-23 14:11 GMT-05:00 Julian Hyde <jh...@apache.org>:

> There doesn't seem to be anything in /var/log/cassandra or
> /var/lib/cassandra. I'll check next time there's a failure.

Re: Cassandra dying in VM

Posted by Julian Hyde <jh...@apache.org>.
There doesn't seem to be anything in /var/log/cassandra or
/var/lib/cassandra. I'll check next time there's a failure.

On Tue, Feb 23, 2016 at 10:54 AM, Vladimir Sitnikov
<si...@gmail.com> wrote:
> Does it leave some logs behind?
>
> For instance: regular java "stdout" logs or some hs_err_pid file.
>
> Blind guess would be "old JRE + random JIT/GC bug == sigsegv".
>
> Vladimir

Re: Cassandra dying in VM

Posted by Vladimir Sitnikov <si...@gmail.com>.
Does it leave some logs behind?

For instance: regular java "stdout" logs or some hs_err_pid file.

Blind guess would be "old JRE + random JIT/GC bug == sigsegv".

Vladimir
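
A small sketch of how one might search for those crash logs inside the VM. By default HotSpot writes `hs_err_pid<pid>.log` to the working directory of the JVM process and falls back to the temporary directory if that fails. The helper name `find_hs_err` is hypothetical, and the directories to search are guesses that depend on how Cassandra is started.

```shell
# List JVM fatal-error logs (hs_err_pid<pid>.log) under the given directories.
# Errors such as unreadable or missing directories are silenced.
find_hs_err() {
  find "$@" -type f -name 'hs_err_pid*.log' 2>/dev/null
}
```

For example, `find_hs_err /var/lib/cassandra /var/log/cassandra /tmp` covers the two directories mentioned in this thread plus the usual temporary directory.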

Re: Cassandra dying in VM

Posted by Julian Hyde <jh...@apache.org>.
Increasing memory is a good idea. I’ve been running my VM with 2 GB recently (see the patch at the end of this message) and I recommend it to anyone who has enough memory on their host machine.

But even with the increased memory, I’m running into other problems with the VM, seemingly involving MySQL and PostgreSQL.

Julian


diff --git a/vm/Vagrantfile b/vm/Vagrantfile
index 49cbae3..0495f9b 100644
--- a/vm/Vagrantfile
+++ b/vm/Vagrantfile
@@ -41,7 +42,7 @@ Vagrant.configure(2) do |config|
 
       v.customize ['modifyvm', :id, '--name', 'ubuntu1404-calcite']
       v.customize ['modifyvm', :id, '--cpus', '1']
-      v.customize ['modifyvm', :id, '--memory', 512]
+      v.customize ['modifyvm', :id, '--memory', 2048]
       v.customize ['modifyvm', :id, '--ioapic', 'off']
       v.customize ['modifyvm', :id, '--natdnshostresolver1', 'on']
       v.customize ['modifyvm', :id, '--nictype1', 'virtio']
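
To confirm that a memory change like the one in this patch took effect, one option is to read the total memory from inside the VM after restarting it (for example with `vagrant reload`, on the assumption that the VirtualBox customizations are reapplied at boot). The helper name `mem_total_mb` is hypothetical; it only parses `/proc/meminfo`-style input.

```shell
# Print the MemTotal value from /proc/meminfo-style input, converted from kB to MB.
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }'
}
```

Inside the VM, `mem_total_mb < /proc/meminfo` should report a value somewhat below the configured size, since the kernel reserves part of the memory.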



> On Feb 29, 2016, at 9:52 AM, Michael Mior <mm...@uwaterloo.ca> wrote:
> 
> This happened to me again. I checked via dmesg and it looks like the OOM
> killer kicked in. I wonder if that is the same in your case? Looking at the
> currently defined memory, I see it's set to 512MB. Any reason we can't
> double that to 1GB? I think that's pretty reasonable for most users and the
> VM can always be suspended when not in use.
> 
> --
> Michael Mior
> mmior@uwaterloo.ca


Re: Cassandra dying in VM

Posted by Michael Mior <mm...@uwaterloo.ca>.
This happened to me again. I checked via dmesg and it looks like the OOM
killer kicked in. I wonder if that is the same in your case? Looking at the
currently defined memory, I see it's set to 512MB. Any reason we can't
double that to 1GB? I think that's pretty reasonable for most users and the
VM can always be suspended when not in use.

--
Michael Mior
mmior@uwaterloo.ca
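
A sketch of the kind of dmesg check described above, for anyone who wants to see whether the OOM killer is responsible in their VM. The helper name `oom_events` is hypothetical, and the patterns are the phrases the Linux kernel typically logs when it kills a process for lack of memory; exact wording can vary by kernel version.

```shell
# Keep only kernel log lines that indicate the OOM killer ran.
# Reads from standard input so it can be fed from dmesg or a saved log.
oom_events() {
  grep -i -E 'out of memory|oom-killer|killed process'
}
```

Inside the VM this would be run as `dmesg | oom_events`. No output means the kernel log holds no such messages, although older entries may already have been rotated out of the ring buffer.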
