Posted to olio-dev@incubator.apache.org by "Mandy Waite (JIRA)" <ji...@apache.org> on 2009/01/18 20:55:59 UTC

[jira] Created: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Rails app aborts and core dumps when running on Thin on OpenSolaris
-------------------------------------------------------------------

                 Key: OLIO-38
                 URL: https://issues.apache.org/jira/browse/OLIO-38
             Project: Olio
          Issue Type: Bug
          Components: rails-app
         Environment: OpenSolaris 2008.11
            Reporter: Mandy Waite


Rails app crashes after a random period of time when running on Thin on OpenSolaris. Error output by Thin is:

Assertion failed: nbytes > 0, file ed.cpp, line 622
Abort (core dumped)

This indicates that the problem occurred within native code in the eventmachine gem.

The assertion checks that there are bytes available to be written on a call to eventmachine's Write(). At the time of the crash, OutboundPages.size() was 0.
OutboundPages is the deque used to queue pages of data for writing.
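
To make the failing check concrete, here is a minimal C++ sketch of a gather step like the one described above. It is a simplified model under stated assumptions, not the eventmachine source: gather_outbound and the use of std::string pages are hypothetical stand-ins for the real page type. It shows only that gathering from an empty deque yields nbytes == 0, which is the condition the assertion rejects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <deque>
#include <string>

// Hypothetical stand-in for eventmachine's page queue: each string is one
// queued page of outbound data.
using OutboundPages = std::deque<std::string>;

// Copy queued pages into `out` (capacity `cap`) and return the number of
// bytes gathered. Pages that fit are removed from the queue; a page that
// does not fit is split, and its remainder stays at the front.
std::size_t gather_outbound(OutboundPages& pages, char* out, std::size_t cap) {
    assert(out != nullptr);
    std::size_t nbytes = 0;
    while (!pages.empty() && nbytes < cap) {
        std::string& page = pages.front();
        const std::size_t room = cap - nbytes;
        if (page.size() <= room) {
            std::memcpy(out + nbytes, page.data(), page.size());
            nbytes += page.size();
            pages.pop_front();
        } else {
            std::memcpy(out + nbytes, page.data(), room);
            page.erase(0, room);
            nbytes += room;
        }
    }
    // With an empty queue the loop never runs and nbytes stays 0, which is
    // the state the reported `assert (nbytes > 0)` aborts on.
    return nbytes;
}
```

Calling gather_outbound on an empty queue returns 0, so a writer that is invoked while the queue is empty but still believes data is pending would trip the check.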

Stack trace:

core 'core' of 13659:   /usr/ruby/1.8/bin/ruby /var/ruby/1.8/gem_home/bin/thin -e production s
 fecb17c5 _lwp_kill (1, 6, 8034760, fec5a02e) + 15
 fec5a03a raise    (6, 0, 80347b0, fec315ea) + 22
 fec3160a abort    (65737341, 6f697472, 6166206e, 64656c69, 626e203a, 73657479) + f2
 fec3185a _assert  (feaa4c96, feaa4bf6, 26e, 0, 6d7a, a17edb8) + 82
 fea9c2dd _ZN20ConnectionDescriptor18_WriteOutboundDataEv (a17ed88, 8038b0c, 8038b8c, fea9c548) + 391
 fea9c575 _ZN20ConnectionDescriptor5WriteEv (a17ed88, fed3f000, 8038b10, fec4ae14, 2d, 9a5a744) + 3d
 fea9ebe3 _ZN14EventMachine_t14_RunSelectOnceEv (9a5a720, bc00a, fecb0615, fea9ee91, 9a5a760, feab51b8) + 20b
 fea9eead _ZN14EventMachine_t8_RunOnceEv (9a5a720, 8039234, 0, feaa18dd) + 29
 feaa191e _ZN14EventMachine_t3RunEv (9a5a720, 480, 4166135, fea8eeba) + 4e
 fea8ef2d evma_run_machine (0, 808a254, 80391dc, 80681fc, 839c594, 808a254) + 81
 fea912ba _Z29t_run_machine_without_threadsm (839c594) + 16
 080681fc rb_call0 (839c454, 839c594, 5161, 5161, 0, 0) + 998
 08068dd2 rb_call  (839c454, 839c594, 5161, 0, 0, 2) + 196
 080627f9 rb_eval  (839c594, 83b218c) + 19ed
 08063241 rb_eval  (839c594, 83b2628) + 2435
 080686ca rb_call0 (839c454, 839c594, 1411, 1411, 0, 0) + e66
 08068dd2 rb_call  (839c454, 839c594, 1411, 0, 0, 0) + 196
 08062c99 rb_eval  (84281e8, 8425df8) + 1e8d
 0806ea95 block_pass (84281e8, 8425e0c) + 3a1
 080636c8 rb_eval  (84281e8, 8425fd8) + 28bc
 080686ca rb_call0 (8422414, 84281e8, 13c1, 13c1, 0, 0) + e66
 08068dd2 rb_call  (8422414, 84281e8, 13c1, 0, 0, 0) + 196
 08062c99 rb_eval  (842865c, 8280b88) + 1e8d
 080686ca rb_call0 (827ccf4, 842865c, 13c1, 13c1, 0, 0) + e66
 08068dd2 rb_call  (827ccf4, 842865c, 13c1, 0, 0, 0) + 196
 08062c99 rb_eval  (8430104, 842ee1c) + 1e8d
 080686ca rb_call0 (84286c0, 8430104, 13c1, 13c1, 0, 8041444) + e66
 08068dd2 rb_call  (84286c0, 8430104, 13c1, 0, 8041444, 1) + 196
 08068fbc rb_f_send (1, 8041440, 8430104) + f4
 0806820d rb_call0 (80c8bb0, 8430104, fd1, fd1, 1, 8041440) + 9a9
 08068dd2 rb_call  (80c8bb0, 8430104, fd1, 1, 8041440, 0) + 196
 08062c99 rb_eval  (8286cb8, 834ba90) + 1e8d
 0806344a rb_eval  (8286cb8, 834c2b0) + 263e
 080686ca rb_call0 (8286114, 8286cb8, 6351, 6351, 0, 0) + e66
 08068dd2 rb_call  (8286114, 8286cb8, 6351, 0, 0, 2) + 196
 080627f9 rb_eval  (8286cb8, 834c508) + 19ed
 080686ca rb_call0 (8286114, 8286cb8, 4ecf, 4ecf, 0, 0) + e66
 08068dd2 rb_call  (8286114, 8286cb8, 4ecf, 0, 0, 0) + 196
 08062c99 rb_eval  (80c78b4, 83d7978) + 1e8d
 0805dfeb eval_node (80c78b4, 83d7978) + 3f
 0806a819 rb_load  (83d7ba8, 0) + 391
 0806aaf4 rb_f_load (1, 80469c0, 80c78b4) + 48
 0806820d rb_call0 (80c8bb0, 80c78b4, 25c1, 25c1, 1, 80469c0) + 9a9
 08068dd2 rb_call  (80c8bb0, 80c78b4, 25c1, 1, 80469c0, 1) + 196
 08062a39 rb_eval  (80c78b4, 80b761c) + 1c2d
 0805dfeb eval_node (80c78b4, 80b761c) + 3f
 0805e6d6 ruby_exec_internal (808a254, 80c8c50, 0, 0, 0, a00000) + d6
 0805e74b ruby_exec (808a254, 8058cbe, 8047d2c, 8058cc6, feffb7e4, feffde70) + 27
 0805e773 ruby_run (feffb7e4, feffde70, 0, 8047d2c, 8076d8d, feffb7e4) + 23
 08058cc6 main     (5, 8047d60, 8047d78) + 3a
 08058bfe _start   (5, 8047e1c, 8047e33, 8047e53, 8047e56, 8047e61) + 7a
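
The eventmachine frames in the trace appear under mangled C++ names. As a reading aid, the sketch below demangles them with abi::__cxa_demangle, which is available with GCC and Clang; the helper name demangle is hypothetical. It also records one small cross-check: the third argument in the _assert frame, 26e hex, is 622 decimal, which matches the line number in the assertion message.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cassert>
#include <cstdlib>
#include <string>

// The third argument in the _assert frame is 0x26e, which is 622 in
// decimal, matching the line number in the assertion message.
static_assert(0x26e == 622, "hex 26e is decimal 622");

// Return the demangled form of an Itanium-ABI symbol, or the input
// unchanged if the demangler rejects it.
std::string demangle(const std::string& symbol) {
    int status = 0;
    char* text = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string result(text);
    std::free(text);  // __cxa_demangle allocates the result with malloc
    return result;
}
```

For example, _ZN20ConnectionDescriptor18_WriteOutboundDataEv demangles to ConnectionDescriptor::_WriteOutboundData(), the frame directly above _assert.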

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Posted by William Sobel <ws...@eecs.berkeley.edu>.
On Jan 22, 2009, at 3:08 PM, Amanda waite wrote:

> I've been talking to the Eventmachine folks and it looks like we've
> fixed it. Still testing; I'll update the bug when the fix is confirmed.

Great...

Cheers,
- Will Sobel


Re: [jira] Commented: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Posted by Amanda waite <Am...@Sun.COM>.
I've been talking to the Eventmachine folks and it looks like we've
fixed it. Still testing; I'll update the bug when the fix is confirmed.

Amanda

William Sobel (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/OLIO-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666341#action_12666341 ] 
>
> William Sobel commented on OLIO-38:
> -----------------------------------
>
> Please give me some more information:
>
> OpenSolaris version?
> Load Level (how many concurrent users)?
>
> I will also try to load the eventmachine gem in isolation and see if I can reproduce the same problem.


[jira] Commented: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Posted by "William Sobel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OLIO-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666341#action_12666341 ] 

William Sobel commented on OLIO-38:
-----------------------------------

Please give me some more information:

OpenSolaris version?
Load Level (how many concurrent users)?

I will also try to load the eventmachine gem in isolation and see if I can reproduce the same problem.



Re: [jira] Resolved: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Posted by Shanti Subramanyam - PAE <Sh...@Sun.COM>.
Amanda or Will,
  Can you please add the OpenSolaris-specific issues and
workarounds/fixes to the rails setup.html doc? Or maybe a better idea
would be to put them in the FAQ and then add a pointer to them in setup.html.


Shanti

On 01/25/09 05:28 PM, Mandy Waite (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/OLIO-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Mandy Waite resolved OLIO-38.
> -----------------------------
> 
>     Resolution: Fixed
> 
> The problem was in Eventmachine; it is fixed and the fix has been pushed to the Eventmachine git repository. The fix will be in the next Eventmachine release. Here are the details of the fix and the Eventmachine patch:
> 
> Francis and I just spent some time investigating the issue and I think
> we've finally solved it. This should also fix the long-standing
> problem in the kqueue reactor.
> 
> Basically, the problem arises when send() fails with an EWOULDBLOCK or
> EINPROGRESS. The data is not sent, but it's also not put back into
> OutboundPages, which causes OutboundDataSize to get out of sync and
> triggers the assertion failure. The fix is to always put unsent data
> back into the outbound buffer:
> 
> ===================================================================
> --- ed.cpp      (revision 674)
> +++ ed.cpp      (working copy)
> @@ -609,31 +609,38 @@
>         assert (GetSocket() != INVALID_SOCKET);
>         int bytes_written = send (GetSocket(), output_buffer, nbytes, 0);
> 
> -       if (bytes_written > 0) {
> -               OutboundDataSize -= bytes_written;
> -               if ((size_t)bytes_written < nbytes) {
> -                       int len = nbytes - bytes_written;
> -                       char *buffer = (char*) malloc (len + 1);
> -                       if (!buffer)
> -                               throw std::runtime_error ("bad alloc throwing back data");
> -                       memcpy (buffer, output_buffer + bytes_written, len);
> -                       buffer [len] = 0;
> -                       OutboundPages.push_front (OutboundPage (buffer, len));
> -               }
> +       bool err = false;
> +       if (bytes_written < 0) {
> +               err = true;
> +               bytes_written = 0;
> +       }
> 
> -               #ifdef HAVE_EPOLL
> -               EpollEvent.events = (EPOLLIN | (SelectForWrite() ? EPOLLOUT : 0));
> -               assert (MyEventMachine);
> -               MyEventMachine->Modify (this);
> -               #endif
> -               #ifdef HAVE_KQUEUE
> -               if (SelectForWrite()) {
> -                       MyEventMachine->ArmKqueueWriter (this);
> -                       cerr << "POW\n";
> -               }
> -               #endif
> +       assert (bytes_written >= 0);
> +       OutboundDataSize -= bytes_written;
> +       if ((size_t)bytes_written < nbytes) {
> +               int len = nbytes - bytes_written;
> +               char *buffer = (char*) malloc (len + 1);
> +               if (!buffer)
> +                       throw std::runtime_error ("bad alloc throwing back data");
> +               memcpy (buffer, output_buffer + bytes_written, len);
> +               buffer [len] = 0;
> +               OutboundPages.push_front (OutboundPage (buffer, len));
>         }
> -       else {
> +
> +
> +       #ifdef HAVE_EPOLL
> +       EpollEvent.events = (EPOLLIN | (SelectForWrite() ? EPOLLOUT : 0));
> +       assert (MyEventMachine);
> +       MyEventMachine->Modify (this);
> +       #endif
> +       #ifdef HAVE_KQUEUE
> +       if (SelectForWrite()) {
> +               MyEventMachine->ArmKqueueWriter (this);
> +       }
> +       #endif
> +
> +
> +       if (err) {
>                 #ifdef OS_UNIX
>                 if ((errno != EINPROGRESS) && (errno != EWOULDBLOCK) && (errno != EINTR))
>                 #endif
> 
> 

[jira] Resolved: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Posted by "Mandy Waite (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OLIO-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mandy Waite resolved OLIO-38.
-----------------------------

    Resolution: Fixed

The problem was in Eventmachine; it is fixed and the fix has been pushed to the Eventmachine git repository. The fix will be in the next Eventmachine release. Here are the details of the fix and the Eventmachine patch:

Francis and I just spent some time investigating the issue and I think
we've finally solved it. This should also fix the long-standing
problem in the kqueue reactor.

Basically, the problem arises when send() fails with an EWOULDBLOCK or
EINPROGRESS. The data is not sent, but it's also not put back into
OutboundPages, which causes OutboundDataSize to get out of sync and
triggers the assertion failure. The fix is to always put unsent data
back into the outbound buffer:

===================================================================
--- ed.cpp      (revision 674)
+++ ed.cpp      (working copy)
@@ -609,31 +609,38 @@
        assert (GetSocket() != INVALID_SOCKET);
        int bytes_written = send (GetSocket(), output_buffer, nbytes, 0);

-       if (bytes_written > 0) {
-               OutboundDataSize -= bytes_written;
-               if ((size_t)bytes_written < nbytes) {
-                       int len = nbytes - bytes_written;
-                       char *buffer = (char*) malloc (len + 1);
-                       if (!buffer)
-                               throw std::runtime_error ("bad alloc throwing back data");
-                       memcpy (buffer, output_buffer + bytes_written, len);
-                       buffer [len] = 0;
-                       OutboundPages.push_front (OutboundPage (buffer, len));
-               }
+       bool err = false;
+       if (bytes_written < 0) {
+               err = true;
+               bytes_written = 0;
+       }

-               #ifdef HAVE_EPOLL
-               EpollEvent.events = (EPOLLIN | (SelectForWrite() ? EPOLLOUT : 0));
-               assert (MyEventMachine);
-               MyEventMachine->Modify (this);
-               #endif
-               #ifdef HAVE_KQUEUE
-               if (SelectForWrite()) {
-                       MyEventMachine->ArmKqueueWriter (this);
-                       cerr << "POW\n";
-               }
-               #endif
+       assert (bytes_written >= 0);
+       OutboundDataSize -= bytes_written;
+       if ((size_t)bytes_written < nbytes) {
+               int len = nbytes - bytes_written;
+               char *buffer = (char*) malloc (len + 1);
+               if (!buffer)
+                       throw std::runtime_error ("bad alloc throwing back data");
+               memcpy (buffer, output_buffer + bytes_written, len);
+               buffer [len] = 0;
+               OutboundPages.push_front (OutboundPage (buffer, len));
        }
-       else {
+
+
+       #ifdef HAVE_EPOLL
+       EpollEvent.events = (EPOLLIN | (SelectForWrite() ? EPOLLOUT : 0));
+       assert (MyEventMachine);
+       MyEventMachine->Modify (this);
+       #endif
+       #ifdef HAVE_KQUEUE
+       if (SelectForWrite()) {
+               MyEventMachine->ArmKqueueWriter (this);
+       }
+       #endif
+
+
+       if (err) {
                #ifdef OS_UNIX
                if ((errno != EINPROGRESS) && (errno != EWOULDBLOCK) && (errno != EINTR))
                #endif
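
The effect of the patch can be sketched in isolation. The following is a simplified, self-contained C++ model, not the eventmachine code: Connection, write_outbound, and the injectable SendFn are hypothetical names, and std::string pages stand in for OutboundPage. It treats a failed send as zero bytes written and always pushes the unsent remainder back to the front of the queue, so the byte counter and the queue stay in step.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

// Hypothetical model of a connection's outbound state.
struct Connection {
    std::deque<std::string> outbound_pages;  // queued data, front is next
    std::size_t outbound_data_size = 0;      // total bytes across all pages

    void queue(const std::string& data) {
        outbound_pages.push_back(data);
        outbound_data_size += data.size();
    }
};

// Stand-in for send(): returns bytes written, or a negative value on error
// (for example when the socket would block).
using SendFn = std::function<long(const char*, std::size_t)>;

// Try to write the front page. Returns true if the send reported an error.
// Mirrors the patched logic: an error counts as zero bytes written, and any
// unsent bytes always go back to the front of the queue.
bool write_outbound(Connection& c, const SendFn& send_fn) {
    if (c.outbound_pages.empty()) {
        return false;  // nothing to write
    }
    std::string chunk = c.outbound_pages.front();
    c.outbound_pages.pop_front();

    long bytes_written = send_fn(chunk.data(), chunk.size());
    bool err = false;
    if (bytes_written < 0) {
        err = true;
        bytes_written = 0;
    }

    const std::size_t written = static_cast<std::size_t>(bytes_written);
    assert(written <= chunk.size());
    c.outbound_data_size -= written;
    if (written < chunk.size()) {
        c.outbound_pages.push_front(chunk.substr(written));
    }
    return err;
}
```

In the pre-patch code the requeue happened only inside the bytes_written > 0 branch, so a failed send dropped the gathered data while OutboundDataSize still counted it. In this model, a send stub that returns -1 leaves both the queue and the counter unchanged, and the data is retried on the next call.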




[jira] Closed: (OLIO-38) Rails app aborts and core dumps when running on Thin on OpenSolaris

Posted by "Shanti Subramanyam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OLIO-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shanti Subramanyam closed OLIO-38.
----------------------------------

