Posted to user@couchdb.apache.org by Jeff Macdonald <ma...@gmail.com> on 2009/05/20 00:31:50 UTC

couchdb kills itself

Hi,
I'm trying to load 9 million documents. I get to around 900K+ and
couchdb decides it has had enough:

$ more couchdb.stderr

heart_beat_kill_pid = 7227
heart_beat_timeout = 11
heart: Tue May 19 18:11:33 2009: heart-beat time-out.
heart: Tue May 19 18:11:57 2009: Executed
"/home/couchdb/couchdb/bin/couchdb -k". Terminating.

The resulting database is 4 GB.

The tail end of the log in var/log/couchdb shows this:

[Tue, 19 May 2009 22:09:59 GMT] [info] [<0.12575.0>] 127.0.0.1 - -
'POST' /activity/_bulk_docs 201

[Tue, 19 May 2009 22:10:14 GMT] [error] [<0.12575.0>] {error_report,<0.21.0>,
    {<0.12575.0>,crash_report,
     [[{pid,<0.12575.0>},
       {registered_name,[]},
       {error_info,
           {exit,
               {timeout,
                   {gen_server,call,
                       [couch_config,
                        {register,#Fun<couch_httpd.9.104562741>,
                            <0.12575.0>}]}},
               [{gen_server,call,2},
                {couch_httpd,handle_request,4},
                {mochiweb_http,headers,5},
                {proc_lib,init_p_do_apply,3}]}},
       {initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
       {ancestors,
           [couch_httpd,couch_secondary_services,couch_server_sup,<0.1.0>]},
       {messages,[]},
       {links,[<0.51.0>,#Port<0.1097>]},
       {dictionary,[{jsonp,undefined}]},
       {trap_exit,false},
       {status,running},
       {heap_size,4181},
       {stack_size,23},
       {reductions,2845177947}],
      []]}}

[Tue, 19 May 2009 22:10:14 GMT] [error] [<0.51.0>] {error_report,<0.21.0>,
    {<0.51.0>,std_error,
     {mochiweb_socket_server,235,
         {child_error,
             {timeout,
                 {gen_server,call,
                     [couch_config,
                      {register,#Fun<couch_httpd.9.104562741>,
                          <0.12575.0>}]}}}}}}

The last document looks perfectly fine (I'm using Net::CouchDB):

964979: $VAR1 = {
          'activity_time' => '20090120154136',
          'product_id' => '44-444-444',
          'email_id' => 'A@example.net',
          'categories' => [
                            'S1234'
                          ],
          'type' => 'b'
        };

I've tried this twice now. Any ideas?

$ bin/couchdb -V
couchdb - Apache CouchDB 0.10.0a776321


TIA

-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Tim Somers <so...@gmail.com>.
No problem

I did the update, my test is running at the moment...

Tim


On Wed, May 20, 2009 at 1:11 PM, Paul Davis <pa...@gmail.com> wrote:

> Should mention that r776685 translates to `svn up` :)
>
> Paul

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
Also, can you take a look at CPU and disk usage while this is running
and see if it's swapping back and forth between the two?
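
A minimal sketch of one way to take those samples on Linux, assuming a 2.6 or later kernel where the aggregate "cpu" line in /proc/stat lists user, nice, system, idle and iowait in that order. The function names and the 5 second interval are illustrative choices, not anything from CouchDB. High "busy" readings suggest the load is CPU-bound, high "iowait" readings suggest it is waiting on disk.

```python
import time


def read_cpu_times(stat_text):
    """Return the first eight counters of the aggregate "cpu" line in /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user, nice, system, idle, iowait, irq, softirq, steal
            return [int(x) for x in fields[1:9]]
    raise ValueError("no aggregate cpu line found")


def cpu_and_iowait_share(before, after):
    """Fraction of the interval spent busy on CPU and waiting on disk I/O."""
    t0, t1 = read_cpu_times(before), read_cpu_times(after)
    delta = [b - a for a, b in zip(t0, t1)]
    total = sum(delta)
    if total == 0:
        return 0.0, 0.0
    idle, iowait = delta[3], delta[4]
    busy = total - idle - iowait
    return busy / total, iowait / total


def watch(interval=5.0, path="/proc/stat"):
    """Print one busy/iowait sample per interval until interrupted."""
    with open(path) as f:
        prev = f.read()
    while True:
        time.sleep(interval)
        with open(path) as f:
            cur = f.read()
        busy, iowait = cpu_and_iowait_share(prev, cur)
        print(f"cpu busy {busy:.0%}  iowait {iowait:.0%}")
        prev = cur
```

Running watch() in a second terminal during the import shows whether the readings alternate between the two.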

On Wed, May 20, 2009 at 12:54 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, May 20, 2009 at 9:11 AM, Paul Davis <pa...@gmail.com> wrote:
>> Should mention that r776685 translates to `svn up` :)
>
> 'svn update' updated me to:
>
> $ bin/couchdb -V
> couchdb - Apache CouchDB 0.10.0a776732
>
> I'm seeing very bursty performance now. A bunch of injections followed
> by blocking, followed by a bunch of injections, repeat.
>
> This is still with Erlang R12B and Fedora 9.
>
> I'm afraid it will take a day or two to get to the 4Gig size. :(
>
>
> --
> Jeff Macdonald
> Ayer, MA
>

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
Couple of days? How are you loading records? If you're doing data
imports you should use _bulk_docs. If you can't use that and you can
tolerate the possibility of data loss in a catastrophic node failure,
batch=ok is another option.
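
A minimal sketch of a batched import over _bulk_docs, written in Python rather than the Perl Net::CouchDB client used in this thread. The server URL, the batch size of 1000, and the helper names (chunked, bulk_payload, post_bulk, load) are assumptions for illustration. The "activity" database name comes from the log earlier in the thread, and the {"docs": [...]} body is the request format _bulk_docs accepts.

```python
import json
import urllib.request
from itertools import islice


def chunked(docs, size):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_payload(batch):
    """Encode one batch as the JSON body that _bulk_docs expects."""
    return json.dumps({"docs": batch}).encode("utf-8")


def post_bulk(base_url, db, batch):
    """POST one batch to /<db>/_bulk_docs and return the HTTP status code."""
    req = urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=bulk_payload(batch),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def load(docs, base_url="http://127.0.0.1:5984", db="activity", size=1000):
    """Send all documents in batches; returns the number of requests made."""
    requests_made = 0
    for batch in chunked(docs, size):
        post_bulk(base_url, db, batch)
        requests_made += 1
    return requests_made
```

Because chunked() consumes any iterable lazily, the 9 million documents never need to be held in memory at once. The batch size is a tuning knob to experiment with, not a recommendation.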

Paul Davis

On Wed, May 20, 2009 at 12:54 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, May 20, 2009 at 9:11 AM, Paul Davis <pa...@gmail.com> wrote:
>> Should mention that r776685 translates to `svn up` :)
>
> 'svn update' updated me to:
>
> $ bin/couchdb -V
> couchdb - Apache CouchDB 0.10.0a776732
>
> I'm seeing very bursty performance now. A bunch of injections followed
> by blocking, followed by a bunch of injections, repeat.
>
> This is still with Erlang R12B and Fedora 9.
>
> I'm afraid it will take a day or two to get to the 4Gig size. :(
>
>
> --
> Jeff Macdonald
> Ayer, MA
>

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, May 20, 2009 at 1:06 PM, Jeff Macdonald <ma...@gmail.com> wrote:

> ah, well, it crashed already. However, I think this was due to VMware
> labmanager web plugin taking my CPU to 100%. That was probably the
> cause of the bursty performance too. I'll try again.

I'm now up to 2 million+ docs and a 4.9G db file. I call that progress.

couchdb - Apache CouchDB 0.10.0a776732

-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, May 20, 2009 at 12:54 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, May 20, 2009 at 9:11 AM, Paul Davis <pa...@gmail.com> wrote:
>> Should mention that r776685 translates to `svn up` :)
>
> 'svn update' updated me to:
>
> $ bin/couchdb -V
> couchdb - Apache CouchDB 0.10.0a776732
>
> I'm seeing very bursty performance now. A bunch of injections followed
> by blocking, followed by a bunch of injections, repeat.
>
> This is still with Erlang R12B and Fedora 9.
>
> I'm afraid it will take a day or two to get to the 4Gig size. :(

ah, well, it crashed already. However, I think this was due to VMware
labmanager web plugin taking my CPU to 100%. That was probably the
cause of the bursty performance too. I'll try again.

>
> --
> Jeff Macdonald
> Ayer, MA
>



-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, May 20, 2009 at 9:11 AM, Paul Davis <pa...@gmail.com> wrote:
> Should mention that r776685 translates to `svn up` :)

'svn update' updated me to:

$ bin/couchdb -V
couchdb - Apache CouchDB 0.10.0a776732

I'm seeing very bursty performance now. A bunch of injections followed
by blocking, followed by a bunch of injections, repeat.

This is still with Erlang R12B and Fedora 9.

I'm afraid it will take a day or two to get to the 4Gig size. :(


-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
Should mention that r776685 translates to `svn up` :)

Paul

On Wed, May 20, 2009 at 9:10 AM, Paul Davis <pa...@gmail.com> wrote:
> Tim,
>
> Can you try r776685 and see if the problem persists?
>
> Paul Davis

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
Tim,

Can you try r776685 and see if the problem persists?

Paul Davis

On Wed, May 20, 2009 at 8:52 AM, Tim Somers <so...@gmail.com> wrote:
> At first sight, I'm getting the same error once for each running process
> using couchdb, so that makes 5 times. I'll repeat my test though, and check
> if there's something different maybe in the first one.
>
> Tim

Re: couchdb kills itself

Posted by Tim Somers <so...@gmail.com>.
At first sight, I'm getting the same error once for each running process
using couchdb, so that makes 5 times. I'll repeat my test though, and check
if there's something different maybe in the first one.

Tim


On Wed, May 20, 2009 at 12:47 PM, Damien Katz <da...@apache.org> wrote:

> Are there more errors in the log? This error only makes sense to me if
> something else is restarting, because of a configuration change or because
> something else must have crashed beforehand. For example, running the test
> suite restarts components during testing which could cause a crash like
> this.
>
> -Damien

Re: couchdb kills itself

Posted by Damien Katz <da...@apache.org>.
Are there more errors in the log? This error only makes sense to me if  
something else is restarting, because of a configuration change or  
because something else must have crashed beforehand. For example,  
running the test suite restarts components during testing which could  
cause a crash like this.

-Damien

On May 20, 2009, at 8:03 AM, Tim Somers wrote:

> It stays alive, I always start it from command line. I use the svn version
> on port 5985, and the version installed by debian package manager on port
> 5984 for comparison.
>
> Tim


Re: couchdb kills itself

Posted by Tim Somers <so...@gmail.com>.
On Wed, May 20, 2009 at 11:57 AM, Paul Davis <pa...@gmail.com> wrote:

> On Wed, May 20, 2009 at 6:15 AM, Tim Somers <so...@gmail.com> wrote:
> > Hi,
> >
> > I'm getting the exact same error:
> >
> > [error] [<0.7002.0>] {error_report,<0.22.0>,
> >    {<0.7002.0>,crash_report,
> >     [[{pid,<0.7002.0>},
> >       {registered_name,[]},
> >       {error_info,
> >           {exit,
> >               {timeout,
> >                   {gen_server,call,
> >                       [couch_config,
> >                        {register,#Fun<couch_httpd.9.104562741>,
> >                            <0.7002.0>}]}},
> >               [{gen_server,call,2},
> >                {couch_httpd,handle_request,4},
> >                {mochiweb_http,headers,5},
> >                {proc_lib,init_p_do_apply,3}]}},
> >       {initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
> >       {ancestors,
> >           [couch_httpd,couch_secondary_services,couch_server_sup,<0.1.0>]},
> >       {messages,[]},
> >       {links,[<0.52.0>,#Port<0.4751>]},
> >       {dictionary,[]},
> >       {trap_exit,false},
> >       {status,running},
> >       {heap_size,2584},
> >       {stack_size,23},
> >       {reductions,1669}],
> >      []]}}
> > [error] [<0.52.0>] {error_report,<0.22.0>,
> >    {<0.52.0>,std_error,
> >     {mochiweb_socket_server,235,
> >         {child_error,
> >             {timeout,
> >                 {gen_server,call,
> >                     [couch_config,
> >                      {register,#Fun<couch_httpd.9.104562741>,
> >                          <0.7002.0>}]}}}}}}
> >
> > =ERROR REPORT==== 20-May-2009::12:02:45 ===
> > {mochiweb_socket_server,235,
> >    {child_error,
> >        {timeout,
> >            {gen_server,call,
> >                [couch_config,
> >                 {register,#Fun<couch_httpd.9.104562741>,<0.7002.0>}]}}}}
> >
> >
> >
> > although it seems to happen when the system is overloaded. In total, I have
> > 5 processes constantly reading from and writing to the same couchdb, with a
> > resulting load average of about 3 and physical memory at its limit. I get
> > the impression (though it's hard to reproduce) that this error comes at the
> > moment the system is swapping some RAM out to disk, making couchdb run into
> > some timeout while calculating a view.
> > Couchdb does stay online though, only crashing my app with an unusable
> > result.
> >
>
> Can you check if couchdb actually stays alive or if it's getting
> respawned by heart? The easiest way to test this is to run couchdb
> with the command line without the init.d script.
>
> Erlang closes the entire VM when it's unable to acquire memory. Ie, if
> malloc returns NULL, then the whole VM closes. The general idea being
> that it'll just rely on heart to be restarted.
>
> Paul Davis
>
> > I'm using svn version 776257
> >
> > Tim
> >
>

It stays alive; I always start it from the command line. I use the svn version
on port 5985, and the version installed by the Debian package manager on port
5984 for comparison.

Tim

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
On Wed, May 20, 2009 at 6:15 AM, Tim Somers <so...@gmail.com> wrote:
> Hi,
>
> I'm getting the exact same error:
>
> [error] [<0.7002.0>] {error_report,<0.22.0>,
>    {<0.7002.0>,crash_report,
>     [[{pid,<0.7002.0>},
>       {registered_name,[]},
>       {error_info,
>           {exit,
>               {timeout,
>                   {gen_server,call,
>                       [couch_config,
>                        {register,#Fun<couch_httpd.9.104562741>,
>                            <0.7002.0>}]}},
>               [{gen_server,call,2},
>                {couch_httpd,handle_request,4},
>                {mochiweb_http,headers,5},
>                {proc_lib,init_p_do_apply,3}]}},
>       {initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
>       {ancestors,
>           [couch_httpd,couch_secondary_services,couch_server_sup,<0.1.0>]},
>       {messages,[]},
>       {links,[<0.52.0>,#Port<0.4751>]},
>       {dictionary,[]},
>       {trap_exit,false},
>       {status,running},
>       {heap_size,2584},
>       {stack_size,23},
>       {reductions,1669}],
>      []]}}
> [error] [<0.52.0>] {error_report,<0.22.0>,
>    {<0.52.0>,std_error,
>     {mochiweb_socket_server,235,
>         {child_error,
>             {timeout,
>                 {gen_server,call,
>                     [couch_config,
>                      {register,#Fun<couch_httpd.9.104562741>,
>                          <0.7002.0>}]}}}}}}
>
> =ERROR REPORT==== 20-May-2009::12:02:45 ===
> {mochiweb_socket_server,235,
>    {child_error,
>        {timeout,
>            {gen_server,call,
>                [couch_config,
>                 {register,#Fun<couch_httpd.9.104562741>,<0.7002.0>}]}}}}
>
>
>
> although it seems to happen when the system is overloaded. In total, I have
> 5 processes constantly reading from and writing to the same couchdb, with a
> resulting load average of about 3 and physical memory at its limit. I get
> the impression (though it's hard to reproduce) that this error comes at the
> moment the system is swapping some RAM out to disk, making couchdb run into
> some timeout while calculating a view.
> Couchdb does stay online though, only crashing my app with an unusable
> result.
>

Can you check whether couchdb actually stays alive or is being
respawned by heart? The easiest way to test this is to run couchdb
directly from the command line instead of through the init.d script.

Erlang shuts down the entire VM when it's unable to acquire memory. I.e., if
malloc returns NULL, the whole VM exits. The general idea is
that it just relies on heart to restart it.

Paul Davis

> I'm using svn version 776257
>
> Tim
>

Re: couchdb kills itself

Posted by Tim Somers <so...@gmail.com>.
Hi,

I'm getting the exact same error:

[error] [<0.7002.0>] {error_report,<0.22.0>,
    {<0.7002.0>,crash_report,
     [[{pid,<0.7002.0>},
       {registered_name,[]},
       {error_info,
           {exit,
               {timeout,
                   {gen_server,call,
                       [couch_config,
                        {register,#Fun<couch_httpd.9.104562741>,
                            <0.7002.0>}]}},
               [{gen_server,call,2},
                {couch_httpd,handle_request,4},
                {mochiweb_http,headers,5},
                {proc_lib,init_p_do_apply,3}]}},
       {initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
       {ancestors,
           [couch_httpd,couch_secondary_services,couch_server_sup,<0.1.0>]},
       {messages,[]},
       {links,[<0.52.0>,#Port<0.4751>]},
       {dictionary,[]},
       {trap_exit,false},
       {status,running},
       {heap_size,2584},
       {stack_size,23},
       {reductions,1669}],
      []]}}
[error] [<0.52.0>] {error_report,<0.22.0>,
    {<0.52.0>,std_error,
     {mochiweb_socket_server,235,
         {child_error,
             {timeout,
                 {gen_server,call,
                     [couch_config,
                      {register,#Fun<couch_httpd.9.104562741>,
                          <0.7002.0>}]}}}}}}

=ERROR REPORT==== 20-May-2009::12:02:45 ===
{mochiweb_socket_server,235,
    {child_error,
        {timeout,
            {gen_server,call,
                [couch_config,
                 {register,#Fun<couch_httpd.9.104562741>,<0.7002.0>}]}}}}



although it seems to happen when the system is overloaded. In total, I have
5 processes constantly reading from and writing to the same couchdb, with a
resulting load average of about 3 and physical memory at its limit. I get
the impression (though it's hard to reproduce) that this error comes at the
moment the system is swapping some RAM out to disk, making couchdb run into
some timeout while calculating a view.
Couchdb does stay online though, only crashing my app with an unusable
result.

I'm using svn version 776257

Tim

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Tue, May 19, 2009 at 7:31 PM, Noah Slater <ns...@apache.org> wrote:
> On Tue, May 19, 2009 at 06:31:50PM -0400, Jeff Macdonald wrote:
>> heart_beat_kill_pid = 7227
>> heart_beat_timeout = 11
>> heart: Tue May 19 18:11:33 2009: heart-beat time-out.
>> heart: Tue May 19 18:11:57 2009: Executed
>> "/home/couchdb/couchdb/bin/couchdb -k". Terminating.
>
> The Erlang heart monitor should bring CouchDB back up. Is this not happening?

No, it didn't. I'm going to try another run tomorrow.


-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Noah Slater <ns...@apache.org>.
On Tue, May 19, 2009 at 06:31:50PM -0400, Jeff Macdonald wrote:
> heart_beat_kill_pid = 7227
> heart_beat_timeout = 11
> heart: Tue May 19 18:11:33 2009: heart-beat time-out.
> heart: Tue May 19 18:11:57 2009: Executed
> "/home/couchdb/couchdb/bin/couchdb -k". Terminating.

The Erlang heart monitor should bring CouchDB back up. Is this not happening?

-- 
Noah Slater, http://tumbolia.org/nslater

Re: couchdb kills itself

Posted by Jan Lehnardt <ja...@apache.org>.
On 20 May 2009, at 17:30, Brian Candler wrote:

> On Wed, May 20, 2009 at 10:06:34AM -0400, Jeff Macdonald wrote:
>> On Wed, May 20, 2009 at 9:59 AM, Jeff Macdonald <ma...@gmail.com> wrote:
>>
>>> R12B from RPM. I'll try R12B-4 next.
>>
>> actually, any harm in trying R13B?
>
> I've heard others on the list say it works fine.

It works fine.

Cheers
Jan
--

Re: couchdb kills itself

Posted by Benoit Chesneau <bc...@gmail.com>.
2009/5/20 Brian Candler <B....@pobox.com>:

> I've heard others on the list say it works fine.
>

So which advice did you take into consideration? It would be interesting
to have a summary :)

Re: couchdb kills itself

Posted by Brian Candler <B....@pobox.com>.
On Wed, May 20, 2009 at 10:06:34AM -0400, Jeff Macdonald wrote:
> On Wed, May 20, 2009 at 9:59 AM, Jeff Macdonald <ma...@gmail.com> wrote:
> 
> > R12B from RPM. I'll try R12B-4 next.
> 
> actually, any harm in trying R13B?

I've heard others on the list say it works fine.

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
On Wed, May 20, 2009 at 10:06 AM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, May 20, 2009 at 9:59 AM, Jeff Macdonald <ma...@gmail.com> wrote:
>
>> R12B from RPM. I'll try R12B-4 next.
>
> actually, any harm in trying R13B?
>

Not at all. Though I think the patch I committed this morning should
fix this regardless of Erlang VM version.

>
> --
> Jeff Macdonald
> Ayer, MA
>

Re: couchdb kills itself

Posted by Paul Davis <pa...@gmail.com>.
On Wed, May 20, 2009 at 10:22 AM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, May 20, 2009 at 10:06 AM, Jeff Macdonald <ma...@gmail.com> wrote:
>> On Wed, May 20, 2009 at 9:59 AM, Jeff Macdonald <ma...@gmail.com> wrote:
>>
>>> R12B from RPM. I'll try R12B-4 next.
>>
>> actually, any harm in trying R13B?
>
> hmmm... looks like I need to move this stuff to a modern Linux
> distribution. Fedora 9 has gcc-4.3.0, which apparently R13B doesn't
> like.
>

If you don't want to bother, you can just svn up and try your test on
trunk. I don't think that this error is from the OS or hardware, but
rather my inability to think.

Paul Davis

> --
> Jeff Macdonald
> Ayer, MA
>

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, May 20, 2009 at 10:06 AM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, May 20, 2009 at 9:59 AM, Jeff Macdonald <ma...@gmail.com> wrote:
>
>> R12B from RPM. I'll try R12B-4 next.
>
> actually, any harm in trying R13B?

hmmm... looks like I need to move this stuff to a modern Linux
distribution. Fedora 9 has gcc-4.3.0, which apparently R13B doesn't
like.

-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, May 20, 2009 at 9:59 AM, Jeff Macdonald <ma...@gmail.com> wrote:

> R12B from RPM. I'll try R12B-4 next.

actually, any harm in trying R13B?


-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Brian Candler <B....@pobox.com>.
On Wed, May 20, 2009 at 01:09:53PM -0400, Jeff Macdonald wrote:
> On Wed, May 20, 2009 at 11:38 AM, Brian Candler <B....@pobox.com> wrote:
> > #include <fcntl.h>
> > #include <stdio.h>
> > int main(void)
> > {
> >  printf("%d\n", sizeof(off_t));
> >  return 0;
> > }
> >
> 
> $ gcc -Wall -o t t.c && ./t
> t.c: In function ‘main’:
> t.c:5: warning: format ‘%d’ expects type ‘int’, but argument 2 has
> type ‘long unsigned int’
> 8
> 
> $ uname -a
> Linux jmac 2.6.27.9-73.fc9.x86_64 #1 SMP Tue Dec 16 14:54:03 EST 2008
> x86_64 x86_64 x86_64 GNU/Linux

Sorry, didn't realise you were on a 64-bit platform. In that case, the
limitation seems very odd.

Re: couchdb kills itself

Posted by Brian Candler <B....@pobox.com>.
On Wed, May 20, 2009 at 09:59:38AM -0400, Jeff Macdonald wrote:
> On Wed, May 20, 2009 at 7:10 AM, Brian Candler <B....@pobox.com> wrote:
> > On Tue, May 19, 2009 at 06:31:50PM -0400, Jeff Macdonald wrote:
> >> "/home/couchdb/couchdb/bin/couchdb -k". Terminating.
> >>
> >> The resulting database is 4Gigs.
> >
> > Is it *exactly* 4 Gigs? That would suggest a 32-bit file size problem.
> 
> > I could swear that ext3 can handle larger files than that.

You're probably right, but there's a lot more than the filesystem that has
to be 64-bit clean. So I repeat the question: what's the exact size of your
resulting database?

Regards,

Brian.

Re: couchdb kills itself

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, May 20, 2009 at 7:10 AM, Brian Candler <B....@pobox.com> wrote:
> On Tue, May 19, 2009 at 06:31:50PM -0400, Jeff Macdonald wrote:
>> "/home/couchdb/couchdb/bin/couchdb -k". Terminating.
>>
>> The resulting database is 4Gigs.
>
> Is it *exactly* 4 Gigs? That would suggest a 32-bit file size problem.

I could swear that ext3 can handle larger files than that.

>
>> $ bin/couchdb -V
>> couchdb - Apache CouchDB 0.10.0a776321
>
> What O/S? What version of Erlang, and installed from where?

fedora 9
R12B from RPM. I'll try R12B-4 next.



-- 
Jeff Macdonald
Ayer, MA

Re: couchdb kills itself

Posted by Brian Candler <B....@pobox.com>.
On Tue, May 19, 2009 at 06:31:50PM -0400, Jeff Macdonald wrote:
> "/home/couchdb/couchdb/bin/couchdb -k". Terminating.
> 
> The resulting database is 4Gigs.

Is it *exactly* 4 Gigs? That would suggest a 32-bit file size problem.

> $ bin/couchdb -V
> couchdb - Apache CouchDB 0.10.0a776321

What O/S? What version of Erlang, and installed from where?

Regards,

Brian.