Posted to user@couchdb.apache.org by Nick Wood <nw...@gmail.com> on 2016/01/28 23:48:24 UTC

crash reports

I've been seeing a lot of crashes in my logs lately but I'm not sure how to
decipher them. I have two questions regarding this:

1. Is there a self-help resource that can help me identify the cause of
crashes like these?

2. Can someone tell me what the cause of this specific crash is, and even
better, how I might fix it?

[Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>] ** Generic server
<0.19147.83> terminating
** Last message in was {'EXIT',<0.23149.104>,killed}
** When Server state == {state,"http://admin:***@***:5984/***/",
                               20,[],[],
                               {[],[]}}
** Reason for termination ==
** killed

[Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>]
{error_report,<0.30.0>,
                        {<0.19147.83>,crash_report,
                         [[{initial_call,
                            {couch_replicator_httpc_pool,init,
                             ['Argument__1']}},
                           {pid,<0.19147.83>},
                           {registered_name,[]},
                           {error_info,
                            {exit,killed,
                             [{gen_server,terminate,7,
                               [{file,"gen_server.erl"},{line,804}]},
                              {proc_lib,init_p_do_apply,3,
                               [{file,"proc_lib.erl"},{line,237}]}]}},
                           {ancestors,
                            [<0.23149.104>,couch_replicator_job_sup,
                             couch_primary_services,couch_server_sup,
                             <0.31.0>]},
                           {messages,[]},
                           {links,[]},
                           {dictionary,[]},
                           {trap_exit,true},
                           {status,running},
                           {heap_size,376},
                           {stack_size,27},
                           {reductions,798}],
                          []]}}

Regards,

  Nick

Re: crash reports

Posted by Alexander Shorin <kx...@gmail.com>.
On Fri, Jan 29, 2016 at 2:24 AM, Merlin Calo <me...@gmail.com> wrote:
> Thanks for sending the link..
>
> I decided to stick around, I like what the group is doing.
>
> I just get way too many emails, lol..

Use the filters, Merlin! (: Anyway, the decision is yours...

--
,,,^..^,,,

Re: crash reports

Posted by Merlin Calo <me...@gmail.com>.
Thanks for sending the link..

I decided to stick around, I like what the group is doing.

I just get way too many emails, lol..

~Merlin

On Thu, Jan 28, 2016 at 5:54 PM, Alexander Shorin <kx...@gmail.com> wrote:

> On Fri, Jan 29, 2016 at 1:52 AM, Merlin Calo <me...@gmail.com>
> wrote:
> > would you be able to remove me from your email-list.
>
> Sad to hear that. Anyway, please follow the unsubscribe procedure:
> http://couchdb.apache.org/#mailing-lists
>
> --
> ,,,^..^,,,
>



-- 
*~Merlin*

Re: crash reports

Posted by Alexander Shorin <kx...@gmail.com>.
On Fri, Jan 29, 2016 at 1:52 AM, Merlin Calo <me...@gmail.com> wrote:
> would you be able to remove me from your email-list.

Sad to hear that. Anyway, please follow the unsubscribe procedure:
http://couchdb.apache.org/#mailing-lists

--
,,,^..^,,,

Re: crash reports

Posted by Merlin Calo <me...@gmail.com>.
Hello,

would you be able to remove me from your email-list.

Thanks.

On Thu, Jan 28, 2016 at 5:48 PM, Nick Wood <nw...@gmail.com> wrote:

> I've been seeing a lot of crashes in my logs lately but I'm not sure how to
> decipher them. I have two questions regarding this:
>
> 1. Is there a self-help resource that can help me identify the cause of
> crashes like these?
>
> 2. Can someone tell me what the cause of this specific crash is, and even
> better, how I might fix it?
>
> [Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>] ** Generic server
> <0.19147.83> terminating
> ** Last message in was {'EXIT',<0.23149.104>,killed}
> ** When Server state == {state,"http://admin:***@***:5984/***/",
>                                20,[],[],
>                                {[],[]}}
> ** Reason for termination ==
> ** killed
>
> [Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>]
> {error_report,<0.30.0>,
>                         {<0.19147.83>,crash_report,
>                          [[{initial_call,
>                             {couch_replicator_httpc_pool,init,
>                              ['Argument__1']}},
>                            {pid,<0.19147.83>},
>                            {registered_name,[]},
>                            {error_info,
>                             {exit,killed,
>                              [{gen_server,terminate,7,
>                                [{file,"gen_server.erl"},{line,804}]},
>                               {proc_lib,init_p_do_apply,3,
>                                [{file,"proc_lib.erl"},{line,237}]}]}},
>                            {ancestors,
>                             [<0.23149.104>,couch_replicator_job_sup,
>                              couch_primary_services,couch_server_sup,
>                              <0.31.0>]},
>                            {messages,[]},
>                            {links,[]},
>                            {dictionary,[]},
>                            {trap_exit,true},
>                            {status,running},
>                            {heap_size,376},
>                            {stack_size,27},
>                            {reductions,798}],
>                           []]}}
>
> Regards,
>
>   Nick
>



-- 
*~Merlin*

Re: crash reports

Posted by Alexander Shorin <kx...@gmail.com>.
On Fri, Jan 29, 2016 at 9:04 PM, Nick Wood <nw...@gmail.com> wrote:
> Can you recommend an ideal erlang version to use?

It's hard to name an ideal one, but R14B04, R16B03-1, and 17.5 are all good
for CouchDB 1.6.1.
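
For anyone who wants to turn the version advice in this thread into a quick pre-flight check, here is a minimal sketch in Python. The function name and the labels are hypothetical, and the lists contain only what was said in this thread about CouchDB 1.6.1; they are not an official compatibility matrix.

```python
# Version advice taken from this thread only (CouchDB 1.6.1); not an official matrix.
KNOWN_GOOD = {"R14B04", "R16B03-1", "17.5"}
KNOWN_BAD = {"17.3"}


def classify_erlang(version):
    """Return a hypothetical label for an Erlang/OTP version string."""
    if version in KNOWN_GOOD:
        return "recommended"
    if version in KNOWN_BAD:
        return "avoid"
    if version.split(".")[0] == "18":
        # Per the thread: 18 needs a small source patch and a CouchDB rebuild.
        return "needs patch and rebuild"
    return "unknown"


for v in ("17.5", "17.3", "18.2", "17.4"):
    print(v, "->", classify_erlang(v))
```

Anything not mentioned in the thread falls through to "unknown" rather than being guessed at.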

--
,,,^..^,,,

Re: crash reports

Posted by Nick Wood <nw...@gmail.com>.
Can you recommend an ideal erlang version to use?

I'm not using https.

  Nick

On Fri, Jan 29, 2016 at 11:02 AM, Alexander Shorin <kx...@gmail.com> wrote:

> On Fri, Jan 29, 2016 at 8:23 PM, Nick Wood <nw...@gmail.com> wrote:
> > I updated both servers to use this container image -
> > https://hub.docker.com/r/klaemo/couchdb/~/dockerfile/ which is what I
> > believe the CouchDB team is trying to base its official docker image off
> > of.
>
> That's true.
>
> > When I cycle through all ~1100 of our databases and do a non-continuous
> > replication, I don't receive a single error or crash. I continuously queue
> > 20 of them into the _replicator database and I can see it processing about
> > 2 per second. (on a side note, any idea why it only does 2-3 per second
> > when I can see plenty of available disk, ram and cpu? Seems like there's an
> > internal hard-coded delay maybe?)
>
> There are no hardcoded delays of that kind.
>
> No ideas, except some Docker-specific issues or constraints. I would
> try removing Docker, doing a plain install, and making sure there are
> no issues with your setup; then dig into what other people report
> about their Docker apps under load and which issues they faced.
> Finally, debug your container to see what causes the delays. That's
> the list I would follow in the same situation.
>
> > Might I still have a broken version of erlang? It looks like the version I
> > ended up with using that docker image is 17.3.
>
> Erlang 17.3 is indeed broken; you should avoid this release,
> especially if you use HTTPS.
>
> --
> ,,,^..^,,,
>

Re: crash reports

Posted by Alexander Shorin <kx...@gmail.com>.
On Fri, Jan 29, 2016 at 8:23 PM, Nick Wood <nw...@gmail.com> wrote:
> I updated both servers to use this container image -
> https://hub.docker.com/r/klaemo/couchdb/~/dockerfile/ which is what I
> believe the CouchDB team is trying to base its official docker image off
> of.

That's true.

> When I cycle through all ~1100 of our databases and do a non-continuous
> replication, I don't receive a single error or crash. I continuously queue
> 20 of them into the _replicator database and I can see it processing about
> 2 per second. (on a side note, any idea why it only does 2-3 per second
> when I can see plenty of available disk, ram and cpu? Seems like there's an
> internal hard-coded delay maybe?)

There are no hardcoded delays of that kind.

No ideas, except some Docker-specific issues or constraints. I would
try removing Docker, doing a plain install, and making sure there are
no issues with your setup; then dig into what other people report
about their Docker apps under load and which issues they faced.
Finally, debug your container to see what causes the delays. That's
the list I would follow in the same situation.

> Might I still have a broken version of erlang? It looks like the version I
> ended up with using that docker image is 17.3.

Erlang 17.3 is indeed broken; you should avoid this release,
especially if you use HTTPS.

--
,,,^..^,,,

Re: crash reports

Posted by Nick Wood <nw...@gmail.com>.
I updated both servers to use this container image -
https://hub.docker.com/r/klaemo/couchdb/~/dockerfile/ which is what I
believe the CouchDB team is trying to base its official docker image off
of. Still getting lots of crashes and replication isn't starting much of
the time. For example, right now I have over 100 documents in the
_replicator database. 1 of them is "triggered", about 15 are in an error
state with the reason being "timeout", and the rest haven't even been
started (_replication_state isn't set).
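
To make the breakdown above easier to reproduce, here is a minimal Python sketch that tallies replication documents by state. It works on documents that have already been fetched (for example the "doc" values from the _replicator database's _all_docs?include_docs=true response); the function name is hypothetical, and the _replication_state_reason field name is an assumption about where the "timeout" reason is stored.

```python
from collections import Counter


def tally_replication_states(docs):
    """Count replicator docs by _replication_state.

    Docs without the field are reported as 'not started'.
    Returns (state counts, error reason counts).
    """
    counts = Counter()
    reasons = Counter()
    for doc in docs:
        if doc.get("_id", "").startswith("_design/"):
            continue  # skip the design document that lives in _replicator
        state = doc.get("_replication_state", "not started")
        counts[state] += 1
        if state == "error":
            # Field name assumed; adjust if your documents differ.
            reasons[doc.get("_replication_state_reason", "unknown")] += 1
    return counts, reasons


# Made-up sample documents shaped like the situation described above.
sample = [
    {"_id": "_design/_replicator"},
    {"_id": "a", "_replication_state": "triggered"},
    {"_id": "b", "_replication_state": "error",
     "_replication_state_reason": "timeout"},
    {"_id": "c"},
]
counts, reasons = tally_replication_states(sample)
print(dict(counts), dict(reasons))
```

Running it periodically would show whether the "not started" group shrinks over time or stays stuck.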

When I cycle through all ~1100 of our databases and do a non-continuous
replication, I don't receive a single error or crash. I continuously queue
20 of them into the _replicator database and I can see it processing about
2 per second. (on a side note, any idea why it only does 2-3 per second
when I can see plenty of available disk, ram and cpu? Seems like there's an
internal hard-coded delay maybe?)
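
For reference, a rough Python sketch of how the one-shot replication documents described above might be generated in groups of 20. The source, target, and continuous fields are the usual replication document fields; the _id scheme, the base URLs, and both helper names are hypothetical, and the sketch only builds the documents without sending them anywhere.

```python
from urllib.parse import quote


def replication_doc(db_name, source_base, target_base, continuous=False):
    """Build a _replicator document for one database (the _id scheme is made up)."""
    encoded = quote(db_name, safe="")  # database names may contain '/' and similar
    return {
        "_id": f"rep_{db_name}",
        "source": f"{source_base}/{encoded}",
        "target": f"{target_base}/{encoded}",
        "continuous": continuous,
    }


def in_batches(items, size=20):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


dbs = [f"db{i}" for i in range(45)]  # placeholder database names
for batch in in_batches(dbs):
    docs = [
        replication_doc(name, "http://source.example:5984", "http://target.example:5984")
        for name in batch
    ]
    print(len(docs), docs[0]["_id"])
```

Each batch would then be written to the _replicator database, waiting for the previous batch to finish before queueing the next.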

The issue happens when I try to initiate continuous replications.

Might I still have a broken version of erlang? It looks like the version I
ended up with using that docker image is 17.3.

  Nick


On Thu, Jan 28, 2016 at 4:27 PM, Alexander Shorin <kx...@gmail.com> wrote:

> On Fri, Jan 29, 2016 at 2:12 AM, Nick Wood <nw...@gmail.com> wrote:
> > This is from over 11 months ago. It looks like 18.2 is available. Do you
> > think it's worth upgrading to see if it fixes the issue? Or is it the later
> > 18.x releases that may have caused the crashes you mentioned?
>
> No, upgrading to 18 won't actually work for you. If you upgrade
> Erlang, you need to rebuild CouchDB to avoid problems, and
> 18 support will require a tiny patch to the sources.
>
> However, if you haven't touched anything for 11 months, then I have no
> idea for now.
>
> Your problem is that the replicator httpc (HTTP client) pool gets killed
> instantly on initialization. Why does that happen? Good question.
>
> --
> ,,,^..^,,,
>

Re: crash reports

Posted by Alexander Shorin <kx...@gmail.com>.
On Fri, Jan 29, 2016 at 2:12 AM, Nick Wood <nw...@gmail.com> wrote:
> This is from over 11 months ago. It looks like 18.2 is available. Do you
> think it's worth upgrading to see if it fixes the issue? Or is it the later
> 18.x releases that may have caused the crashes you mentioned?

No, upgrading to 18 won't actually work for you. If you upgrade
Erlang, you need to rebuild CouchDB to avoid problems, and
18 support will require a tiny patch to the sources.

However, if you haven't touched anything for 11 months, then I have no
idea for now.

Your problem is that the replicator httpc (HTTP client) pool gets killed
instantly on initialization. Why does that happen? Good question.
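
On the "how do I decipher these" question: the reading above comes mostly from three fields in the crash report (initial_call, error_info and ancestors) plus the "Last message in was" line. Below is a rough Python sketch that pulls those out of a pasted report. The function name and the regular expressions are my own and only cover the shape of the report in this thread; they expect the raw log text without "> " quote prefixes. It also flags when the 'EXIT' message came from the first ancestor, which would suggest the parent job process was killed and the pool went down with it, though that is only one possible reading.

```python
import re


def summarize_crash(log_text):
    """Extract a few fields from an OTP crash report like the one in this thread."""
    summary = {}
    m = re.search(r"\{initial_call,\s*\{(\w+),(\w+),", log_text)
    if m:
        summary["initial_call"] = f"{m.group(1)}:{m.group(2)}"
    m = re.search(r"\{error_info,\s*\{(\w+),(\w+),", log_text)
    if m:
        summary["error_class"], summary["reason"] = m.group(1), m.group(2)
    m = re.search(r"\{ancestors,\s*\[([^\]]*)\]", log_text)
    if m:
        summary["ancestors"] = [a.strip() for a in m.group(1).split(",")]
    m = re.search(r"Last message in was \{'EXIT',(<[\d.]+>),(\w+)\}", log_text)
    if m:
        summary["exit_from"], summary["exit_reason"] = m.group(1), m.group(2)
    ancestors = summary.get("ancestors", [])
    # True when the 'EXIT' signal was sent by the direct parent process.
    summary["exit_from_parent"] = (
        bool(ancestors) and summary.get("exit_from") == ancestors[0]
    )
    return summary


# Trimmed copy of the report posted at the start of this thread.
SAMPLE = """\
** Generic server <0.19147.83> terminating
** Last message in was {'EXIT',<0.23149.104>,killed}
{error_report,<0.30.0>,
 {<0.19147.83>,crash_report,
  [[{initial_call,
     {couch_replicator_httpc_pool,init,
      ['Argument__1']}},
    {error_info,
     {exit,killed,
      [{gen_server,terminate,7,
        [{file,"gen_server.erl"},{line,804}]}]}},
    {ancestors,
     [<0.23149.104>,couch_replicator_job_sup,
      couch_primary_services,couch_server_sup,
      <0.31.0>]}]]}}
"""

for key, value in summarize_crash(SAMPLE).items():
    print(key, "=", value)
```

The initial_call module names the component that died, and the ancestors list shows which supervisor tree it belonged to.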

--
,,,^..^,,,

Re: crash reports

Posted by Nick Wood <nw...@gmail.com>.
Looks like I'm on version 17?

1> erlang:system_info(otp_release).
"17"

Running in a docker container on debian:wheezy
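
A note on the "17" above: from OTP 17 onward, erlang:system_info(otp_release) reports only the major release, so it cannot tell 17.3 from 17.5. The exact version is normally recorded in a file named OTP_VERSION under releases/<release>/ in the Erlang root directory. A small Python sketch that reads it, assuming that layout; the function name is hypothetical, the root path in the usage line is a guess for a Debian package install, and the file may be missing on some installations, in which case the function returns None.

```python
from pathlib import Path


def read_otp_version(root_dir, otp_release):
    """Return the contents of <root>/releases/<release>/OTP_VERSION, or None."""
    path = Path(root_dir) / "releases" / otp_release / "OTP_VERSION"
    try:
        return path.read_text().strip()
    except OSError:
        return None  # file missing or unreadable


# "/usr/lib/erlang" is an assumed root; code:root_dir() in the Erlang
# shell reports the real one.
print(read_otp_version("/usr/lib/erlang", "17"))
```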

It looks like the current Dockerfile is doing this:

curl -ssL https://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb \
  -o esl.deb
dpkg -i esl.deb
apt-get update
apt-get install -y --no-install-recommends erlang-nox=1:17.4 \
  erlang-dev=1:17.4

This is from over 11 months ago. It looks like 18.2 is available. Do you
think it's worth upgrading to see if it fixes the issue? Or is it the later
18.x releases that may have caused the crashes you mentioned?

  Nick


On Thu, Jan 28, 2016 at 3:53 PM, Alexander Shorin <kx...@gmail.com> wrote:

> Hi!
>
> Have you upgraded your Erlang version recently?
> The last report of this kind I saw was caused by a broken installation.
> Reinstalling helped.
> --
> ,,,^..^,,,
>
>
> On Fri, Jan 29, 2016 at 1:48 AM, Nick Wood <nw...@gmail.com> wrote:
> > I've been seeing a lot of crashes in my logs lately but I'm not sure how to
> > decipher them. I have two questions regarding this:
> >
> > 1. Is there a self-help resource that can help me identify the cause of
> > crashes like these?
> >
> > 2. Can someone tell me what the cause of this specific crash is, and even
> > better, how I might fix it?
> >
> > [Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>] ** Generic server
> > <0.19147.83> terminating
> > ** Last message in was {'EXIT',<0.23149.104>,killed}
> > ** When Server state == {state,"http://admin:***@***:5984/***/",
> >                                20,[],[],
> >                                {[],[]}}
> > ** Reason for termination ==
> > ** killed
> >
> > [Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>]
> > {error_report,<0.30.0>,
> >                         {<0.19147.83>,crash_report,
> >                          [[{initial_call,
> >                             {couch_replicator_httpc_pool,init,
> >                              ['Argument__1']}},
> >                            {pid,<0.19147.83>},
> >                            {registered_name,[]},
> >                            {error_info,
> >                             {exit,killed,
> >                              [{gen_server,terminate,7,
> >                                [{file,"gen_server.erl"},{line,804}]},
> >                               {proc_lib,init_p_do_apply,3,
> >                                [{file,"proc_lib.erl"},{line,237}]}]}},
> >                            {ancestors,
> >                             [<0.23149.104>,couch_replicator_job_sup,
> >                              couch_primary_services,couch_server_sup,
> >                              <0.31.0>]},
> >                            {messages,[]},
> >                            {links,[]},
> >                            {dictionary,[]},
> >                            {trap_exit,true},
> >                            {status,running},
> >                            {heap_size,376},
> >                            {stack_size,27},
> >                            {reductions,798}],
> >                           []]}}
> >
> > Regards,
> >
> >   Nick
>

Re: crash reports

Posted by Alexander Shorin <kx...@gmail.com>.
Hi!

Have you upgraded your Erlang version recently?
The last report of this kind I saw was caused by a broken installation.
Reinstalling helped.
--
,,,^..^,,,


On Fri, Jan 29, 2016 at 1:48 AM, Nick Wood <nw...@gmail.com> wrote:
> I've been seeing a lot of crashes in my logs lately but I'm not sure how to
> decipher them. I have two questions regarding this:
>
> 1. Is there a self-help resource that can help me identify the cause of
> crashes like these?
>
> 2. Can someone tell me what the cause of this specific crash is, and even
> better, how I might fix it?
>
> [Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>] ** Generic server
> <0.19147.83> terminating
> ** Last message in was {'EXIT',<0.23149.104>,killed}
> ** When Server state == {state,"http://admin:***@***:5984/***/",
>                                20,[],[],
>                                {[],[]}}
> ** Reason for termination ==
> ** killed
>
> [Thu, 28 Jan 2016 22:35:53 GMT] [error] [<0.19147.83>]
> {error_report,<0.30.0>,
>                         {<0.19147.83>,crash_report,
>                          [[{initial_call,
>                             {couch_replicator_httpc_pool,init,
>                              ['Argument__1']}},
>                            {pid,<0.19147.83>},
>                            {registered_name,[]},
>                            {error_info,
>                             {exit,killed,
>                              [{gen_server,terminate,7,
>                                [{file,"gen_server.erl"},{line,804}]},
>                               {proc_lib,init_p_do_apply,3,
>                                [{file,"proc_lib.erl"},{line,237}]}]}},
>                            {ancestors,
>                             [<0.23149.104>,couch_replicator_job_sup,
>                              couch_primary_services,couch_server_sup,
>                              <0.31.0>]},
>                            {messages,[]},
>                            {links,[]},
>                            {dictionary,[]},
>                            {trap_exit,true},
>                            {status,running},
>                            {heap_size,376},
>                            {stack_size,27},
>                            {reductions,798}],
>                           []]}}
>
> Regards,
>
>   Nick