You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by Henri Gomez <he...@gmail.com> on 2007/01/31 18:25:41 UTC

FYI: jk 1.2.20 core on iSeries v5R3

Hi to all,

I built and installed the jk 1.2.20 on iSeries i5/OS V5R3 Apache 2
server to replace the previous 1.2.19.

And it core with this stack trace :

 User Trace Dump for job 421305/QTMHHTTP/APACHEDFT. Size: 300K,
Wrapped 1 times.
 --- 01/27/2007 23:52:06 ---
         000000E7:485816 Stack:  QHTTPSVR   / QZSRCORE    MPMT_PTHRE
21    : request_thread
         000000E7:485824 Stack:  QHTTPSVR   / QZSRCORE    HTTP_ASYNC
6     : ap_process_IOCP
         000000E7:485832 Stack:  QHTTPSVR   / QZSRCORE    HTTP_ASYNC
47    : asyncIO_wait
         000000E7:485840 Stack:  QHTTPSVR   / QZSRCORE    CONNECTION
5     : ap_run_process_connection
         000000E7:485848 Stack:  QHTTPSVR   / QZSRCORE    HTTP_CORE
12    : ap_process_http_connection
         000000E7:485856 Stack:  QHTTPSVR   / QZSRCORE    HTTP_REQUE
19    : ap_process_request
         000000E7:485864 Stack:  QHTTPSVR   / QZSRCORE    HTTP_CONFI
27    : ap_invoke_handler
         000000E7:493096 Stack:  QHTTPSVR   / QZSRCORE    HTTP_CONFI
5     : ap_run_handler
         000000E7:493104 Stack:  QHTTPSVR   / MOD_JK1220  MOD_JK
83    : jk_handler
         000000E7:493112 Stack:  QHTTPSVR   / MOD_JK1220  JK_AJP_COM
51    : ajp_service
         000000E7:493120 Stack:  QHTTPSVR   / MOD_JK1220  JK_AJP_COM
39    : ajp_get_reply
         000000E7:493128 Stack:  QHTTPSVR   / MOD_JK1220  JK_AJP_COM
30    : ajp_process_callback
         000000E7:493136 Stack:  QHTTPSVR   / MOD_JK1220  MOD_JK
21    : ws_write
         000000E7:493144 Stack:  QHTTPSVR   / MOD_JK1220  JK_UTIL
37    : jk_log
         000000E7:493152 Stack:  QHTTPSVR   / QZSRCORE    MPMT_PTHRE
11    : Request_Thread_Excp_Handler
         000000E7:493160 Stack:  QHTTPSVR   / QZSRCORE    MPMT_PTHRE
18    : Common_Thread_Excp_Handler_Code
         000000E7:493168 Stack:  QHTTPSVR   / QZSRAPR     OS400TRACE
1     : apr_dstack
      000000E7:493176 Stack:  QSYS       / QP0ZCPA     QP0ZUDBG    3
  : Qp0zDumpStack
      000000E7:493184 Stack:  QSYS       / QP0ZSCPA    QP0ZSDBG    3
  : Qp0zSUDumpStack
      000000E7:493192 Stack:  QSYS       / QP0ZSCPA    QP0ZSDBG    24
  : Qp0zSUDumpTargetStack
      000000E7:493192 Stack:  Completed
      000000E7:514856 FF4ED5DBC3:005D10 L:00D0
mpmt_pthread.c:Exception handler threads communications area
      000000E7:514880  FF4ED5DBC3:005D10  80000000 00000000 FF4ED5DB
C3005BF0 SPP *.........+N.C.$0*
      000000E7:514888  FF4ED5DBC3:005D20  000000D0 00000000 00000000
00000000     *...è............*
      000000E7:514904  FF4ED5DBC3:005D30  A0000000 00000000 E03EDEA4
A1001038     *........ç..u¨...*
      000000E7:514920  FF4ED5DBC3:005D40  A0000000 00000000 E03EDEA4
A1001030     *........ç..u¨...*
      000000E7:514936  FF4ED5DBC3:005D50  80000000 00000000 FF4ED5DB
C3001090 SPP *.........+N.C...*
      000000E7:514952  FF4ED5DBC3:005D60  C3D7C6D4 C3C8F3F6 F0F1D4D6
C46DD1D2     *CPFMCH3601MOD_JK*
      000000E7:514968  FF4ED5DBC3:005D70  F1F2F2F0 00000000 0908F0F0
F0F0F040     *1220......00000 *
      000000E7:514984  FF4ED5DBC3:005D80  C3D7C6F9 F9F9F900 00000000
00100000     *CPF9999.........*
      000000E7:515000  FF4ED5DBC3:005D90  00000909 9999001F 059B0028
00000000     *....rr..........*
      000000E7:515016  FF4ED5DBC3:005DA0  00000162 C7CCFFFF FFFFFEAE
F4D00D00     *....G.......4è..*
      000000E7:515032  FF4ED5DBC3:005DB0  A3000000 01D00000 3A945470
AE28A290     *t....è...m....s.*
      000000E7:515048  FF4ED5DBC3:005DC0  D4C3C8F3 F6F0F1D4 D6C46DD1
D2F1F2F2     *MCH3601MOD_JK122*
      000000E7:515064  FF4ED5DBC3:005DD0  F0000000 000908F0 F0F0F0F0
F0F0F0F3     *0......000000003*

       000000E7:515072 Stack:  mpmt_pthread.c:request thread exception
       000000E7:515096 Stack:  Library    / Program     Module
Stmt    Procedure
       000000E7:515104 Stack:  QSYS       / QLESPI      QLECRTTH    20
   : LE_Create_Thread2__FP12crtth_parm_t
       000000E7:515112 Stack:  QSYS       / QP0WPINT    QP0WSPTHR   18
   : pthread_create_part2
       000000E7:515112 Stack:  QHTTPSVR   / QZSRAPR     THREAD      2
   : dummy_worker
       000000E7:515120 Stack:  QHTTPSVR   / QZSRCORE    MPMT_PTHRE  21
   : request_thread
       000000E7:515128 Stack:  QHTTPSVR   / QZSRCORE    HTTP_ASYNC  6
   : ap_process_IOCP
       000000E7:515640 Stack:  QHTTPSVR   / QZSRCORE    HTTP_ASYNC  47
   : asyncIO_wait
       000000E7:515648 Stack:  QHTTPSVR   / QZSRCORE    CONNECTION  5
   : ap_run_process_connection
       000000E7:515656 Stack:  QHTTPSVR   / QZSRCORE    HTTP_CORE   12
   : ap_process_http_connection
       000000E7:515664 Stack:  QHTTPSVR   / QZSRCORE    HTTP_REQUE  19
   : ap_process_request
       000000E7:515664 Stack:  QHTTPSVR   / QZSRCORE    HTTP_CONFI  27
   : ap_invoke_handler
       000000E7:515672 Stack:  QHTTPSVR   / QZSRCORE    HTTP_CONFI  5
   : ap_run_handler
       000000E7:515680 Stack:  QHTTPSVR   / MOD_JK1220  MOD_JK      83
   : jk_handler
       000000E7:515688 Stack:  QHTTPSVR   / MOD_JK1220  JK_AJP_COM  51
   : ajp_service
       000000E7:515696 Stack:  QHTTPSVR   / MOD_JK1220  JK_AJP_COM  39
   : ajp_get_reply
                                                  TRCTCPAPP Output
       000000E7:515696 Stack:  QHTTPSVR   / MOD_JK1220  JK_AJP_COM  30
   : ajp_process_callback
       000000E7:515704 Stack:  QHTTPSVR   / MOD_JK1220  MOD_JK      21
   : ws_write
        000000E7:515712 Stack:  QHTTPSVR   / MOD_JK1220  JK_UTIL
37    : jk_log
        000000E7:515720 Stack:  QHTTPSVR   / QZSRCORE    MPMT_PTHRE
11    : Request_Thread_Excp_Handler
        000000E7:515720 Stack:  QHTTPSVR   / QZSRCORE    MPMT_PTHRE
18    : Common_Thread_Excp_Handler_Code
        000000E7:515728 Stack:  QHTTPSVR   / QZSRAPR     OS400TRACE  1
    : apr_dstack
        000000E7:515736 Stack:  QSYS       / QP0ZCPA     QP0ZUDBG    3
    : Qp0zDumpStack
        000000E7:515744 Stack:  QSYS       / QP0ZSCPA    QP0ZSDBG    3
    : Qp0zSUDumpStack
        000000E7:515744 Stack:  QSYS       / QP0ZSCPA    QP0ZSDBG
24    : Qp0zSUDumpTargetStack
        000000E7:515752 Stack:  Completed
--- 01/27/2007 23:56:07 ---


Any idea why the 1.2.20 failed where the 1.2.19 still works ?

Did there is new defines between 1.2.19 and 1.2.20 to be added in the
build process ( i5/OS couldn't use configure and makefile ;(

Regards

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Henri Gomez <he...@gmail.com>.
2007/1/31, Rainer Jung <ra...@kippdata.de>:
> Hi Henri,
>
> there have been two bug fixes concerning string formatting, which have
> core dump potential. Both only occur with log level info or above. I
> think they are *not* the reason (see below). The code is in
> common/jk_ajp_common.c:
>
> 1) Wrong order of arguments; should only be relevant, if new feature
> "fail_on_status" is being used.
>
> @@ -1850,10 +1864,10 @@
>              else if (err == JK_STATUS_ERROR) {
>                  jk_log(l, JK_LOG_INFO,
>                          "(%s) request failed, "
>                          "because of response status %d, "
>                          "recoverable operation attempt=%d",
> -                       p->worker->http_status_fail,
> -                       p->worker->name, i);
> +                       p->worker->name,
> +                       p->worker->http_status_fail, i);
>
> 2) Missing "s" in "%s". Should not produce a core dump.

Well may be since I see this kind of error with bad printf.

I'll patch this one

> @@ -1108,7 +1122,7 @@
>
>       if ((len = ajp_read_fully_from_server(r, l, read_buf, len)) < 0) {
>           jk_log(l, JK_LOG_INFO,
> -               "(%) receiving data from client failed. "
> +               "(%s) receiving data from client failed. "
>                  "Connection aborted or network problems",
>                  ae->worker->name);
>           JK_TRACE_EXIT(l);
>
> We don't have any other core dump reports for 1.2.20 at the moment. From
> your core analysis I expect another reason. Between 1.2.19 and 1.2.20
> there was the big virtual host cleanup. That included per virtual host
> loggers. The method ws_write has not been changed, but some of the
> config parsing, overloading and initialization.
>
> Could you check, if there are any startup errors in your apache or
> mod_jk logs? Use JkLogLevel debug and LogLevel at least info. Please try
> first with a basic config with no vhosts. Does it crash during startup,
> or when running requests?
>
> Some changes:
>
> 1) If no log file is configured, it tries to use logs/mod_jk.log.
> 2) Log definitions get inherited from the global server to virtual
> servers, but are overwritten by explicit virtual server configs.
> 3) The mod_jk time stamp formats and its own request logging are also
> per virtual server
>
> Do we have a chace to find out the line numbers in ws_write, where we
> called the jk_log? There are tw possibilities:
>
>     399              if (!p->response_started) {
>     400                  if (main_log)
>     401                      jk_log(main_log, JK_LOG_INFO,
>     402                             "Write without start, starting with
> defaults");
>     403                  if (!s->start_response(s, 200, NULL, NULL,
> NULL, 0)) {
>     404                      return JK_FALSE;
>     405                  }
>     406              }
>
>
> and one debug message:
>
>     435                  if (JK_IS_DEBUG_LEVEL(main_log))
>     436                      jk_log(main_log, JK_LOG_DEBUG,
>     437                             "written %d out of %d", r, ll);
>     438
>
> Does the crash go away, if JkLogLevel is info?

If was allready at info :

JkWorkersFile /www/apachedft/conf/workers.properties
JkLogFile /www/apachedft/logs/jk.log
JkShmFile        /www/apachedft/logs/jk.shm
JkLogLevel info

> I'm afraid we have to debug this iteratively. I had no problems on *nix
> platforms and up to now also no user reports.
>
> Henri Gomez wrote:
> > Thanks.
> >
> > Did the 1.2.21 is expected soon or should I use the trunk ?
>
> I think 1.2.21 is still a couple of weeks ago, but I expect trunk to
> produce the same problem, because I don't see anything we changed, that
> might have fixed it.

I'll try to patch the 1.2.20 source and test with new config

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Rainer Jung <ra...@kippdata.de>.
Hi Henri,

do you have any information, if the crash problem for iSeries has been 
resolved by the format string fix? Anything else we need to do before 
1.2.21?

Regards,

Rainer

Henri Gomez wrote:
> the problem on iSeries with the sprintf is very common since on this
> 64bits boxes integer and pointers are very different (32 bits int and
> 128 bits pointers).
> 
> The Apache 2 with modified mod_jk should be tested today so I could
> tell you if the patches solve the problem.
> 
> Regards
> 
> 2007/2/2, Eric Wertman <we...@comcast.net>:
>> I did rebuild the sources as prefork,  it seems to have stopped the
>> problem.  I did also make those 2 corrections you pointed out earlier in
>> the thread.
>>
>> Eric Wertman wrote:

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Henri Gomez <he...@gmail.com>.
the problem on iSeries with the sprintf is very common since on this
64bits boxes integer and pointers are very different (32 bits int and
128 bits pointers).

The Apache 2 with modified mod_jk should be tested today so I could
tell you if the patches solve the problem.

Regards

2007/2/2, Eric Wertman <we...@comcast.net>:
> I did rebuild the sources as prefork,  it seems to have stopped the
> problem.  I did also make those 2 corrections you pointed out earlier in
> the thread.
>
> Eric Wertman wrote:
> > Connector:
> >
> > <Connector port="8010" protocol="AJP/1.3" enableLookups="false"
> > maxThreads="800" minThreads="100" bufferSize="8192" backlog="256" />
> >
> > I've tried all sorts of permutations of this... None of the values
> > make much difference.   My ab test is -n 1000 -c 200.
> >
> > ---------------------------------------
> >
> > workers.properties:
> >
> > # PlatformServer
> > worker.PlatformServer.type=ajp13
> > worker.PlatformServer.host=localhost
> > worker.PlatformServer.port=8010
> > worker.PlatformServer.connection_pool_timeout=60
> > worker.PlatformServer.socket_timeout=60
> > worker.PlatformServer.retries=5
> >
> > I've been through a lot of different settings here too.  I've tried
> > some almost insane values for maxThreads
> > ---------------------------------------
> >
> > httpd.conf (Using virtual servers, forgot to mention that before)
> >
> > <IfModule jk_module>
> > JkWorkersFile conf/workers.properties
> > JkShmFile     logs/mod_jk.shm
> > JkShmSize     8192
> > JkLogFile     logs/mod_jk.log
> > JkLogLevel    debug
> > </IfModule>
> >
> > ---------------------------------------
> >
> > mod_jk startup at debug level.  I only included it because it shows
> > what mod_jk thinks of my connector.
> >
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (239):
> > creating worker PlatformServer
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (144):
> > about to create instance PlatformServer of ajp13
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (157):
> > about to validate and init PlatformServer
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (1971): worker PlatformServer contact is 'localhost:8010'
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2098): setting endpoint options:
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2101): keepalive:        0
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2105): timeout:          60
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2109): buffer size:      0
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2113): pool timeout:     60
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2117): connect timeout:  0
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2121): reply timeout:    0
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2125): prepost timeout:  0
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2129): recovery options: 0
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2133): retries:          5
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2137): max packet size:  8192
> > [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c
> > (2008): setting connection pool size to 25 with min 13
> >
> > ---------------------------------------
> >
> > Can't reproduce at debug level.  This is info (got it on the first try).
> >
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c
> > (1504): Unknown AJP protocol code: 41
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c
> > (1842): (PlatformServer) request failed, because of server error
> > without recovery in send loop attempt=0
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c
> > (970): wrong message format 0x3837 from 127.0.0.1:8010
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c
> > (1566): (PlatformServer) Tomcat is down or refused connection. No
> > response has been sent to the client (yet)
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c
> > (1877): (PlatformServer) receiving from tomcat failed, recoverable
> > operation attempt=0
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c
> > (1916): (PlatformServer) sending request to tomcat failed,
> > recoverable operation attempt=1
> > [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  mod_jk.c (2142):
> > Service error=-5 for worker=PlatformServer
> > [Wed Jan 31 18:31:11 2007] [2539586:0000] [info]  mod_jk.c (401):
> > Write without start, starting with defaults
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c
> > (970): wrong message format 0x0400 from 127.0.0.1:8010
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c
> > (1566): (PlatformServer) Tomcat is down or refused connection. No
> > response has been sent to the client (yet)
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c
> > (1504): Unknown AJP protocol code: 02
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c
> > (1877): (PlatformServer) receiving from tomcat failed, recoverable
> > operation attempt=0
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c
> > (1916): (PlatformServer) sending request to tomcat failed,
> > recoverable operation attempt=1
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c
> > (1842): (PlatformServer) request failed, because of server error
> > without recovery in send loop attempt=0
> > [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  mod_jk.c (2142):
> > Service error=-5 for worker=PlatformServer
> >
> > ---------------------------------------
> >
> > Here's the Tomcat log:
> >
> > Jan 31, 2007 6:31:10 PM org.apache.jk.common.HandlerRequest invoke
> > INFO: Unknown message 0
> > Jan 31, 2007 6:31:12 PM org.apache.jk.common.HandlerRequest invoke
> > INFO: Unknown message 0
> >
> > ---------------------------------------
> >
> >
> > And here's the only thing I know about dbx:
> >
> > dbx ../bin/httpd core
> > Type 'help' for help.
> > warning: The core file is not a fullcore. Some info may not be available.
> > [using memory image in core]
> > reading symbolic information ...
> >
> >
> > Segmentation fault in sig_coredump at 0x10037590
> > 0x10037590 (sig_coredump+0x84) 80410014        lwz   r2,0x14(r1)
> > (dbx) where
> > sig_coredump() at 0x10037590
> > malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d,
> > 0x7858c51, 0x17, 0x0) at 0xd03002c4
> > malloc_common.malloc_common_53_36(??) at 0xd02fd8b8
> > jk_pool_dyn_alloc() at 0xdc05a4c0
> > jk_pool_alloc() at 0xdc05a5b4
> > jk_b_set_buffer_size() at 0xdc06c0fc
> > ajp_service() at 0xdc064c28
> > jk_handler() at 0xdc050e60
> > ap_run_handler() at 0x10003a24
> > ap_invoke_handler() at 0x10004810
> > ap_process_request() at 0x1002fd88
> > ap_process_http_connection() at 0x1004d0dc
> > ap_run_process_connection() at 0x10013dc0
> > ap_process_connection() at 0x100144f0
> > process_socket() at 0x1000b03c
> > worker_thread() at 0x1000a8e0
> > dummy_worker() at 0xdbdb8384
> >
> > ---------------------------------------
> >
> > I'll have a look and see if I can't get more information out of
> > tomcat.. there must be a way to increase the logging there.  I am
> > using mpm=worker.  If you still think it's worth the time I will
> > re-compile with prefork and take another shot.
> >
> > Thanks again for your help.
> >
> > Eric
> >
> > Rainer Jung wrote:
> >> Hi Eric,
> >>
> >> maybe your observation is in fact related to Henri's, since you also
> >> get core dumps (segmentation faults), which is quite unusual. See
> >> further comments inline.
> >>
> >> Eric Wertman wrote:
> >>> Sorry to jump in, I'm new here.  I started watching this list
> >>> because of a problem I'm having with the mod_jk 1.2.20 as well.  I'm
> >>> not getting core files, but I do have problems that I can't
> >>> reproduce at log level trace or debug.
> >>
> >> Apache logs segmentation faults. You might be able to produce a core
> >> (which would be very helpful) by setting the coredumpdirectory in
> >> apache and maybe tweaking your os config. One point that might help
> >> to produce cores, would be to not run apache as root.
> >>
> >>> I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and
> >>> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM
> >>> 1.5 JRE.
> >>>
> >>> I compiled them all using the IBM  cc_r compiler.  I've actually
> >>> tried a number of different flags trying to resolve my problem,
> >>> assuming initially that I was doing something incorrectly.  The
> >>> behavior persists, though, and my compilations have all been clean.
> >>> Apache without mod_jk doesn't give me any problems.
> >>>
> >>> Superficially it seems to work fine.  Once I run tests with ab, it
> >>> starts to get a little ugly.  I get a number of failed requests, and
> >>> these types of errors:
> >>>
> >>>
> >>>
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (1504): Unknown AJP protocol code: 41
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (970): wrong message format 0x3837 from 127.0.0.1:8010
> >>
> >> hose mean, taht something is fundamentally wrong in the answer
> >> mopd_jk receives from tomcat. It doesn't really look like AJP/1.3. Is
> >> your MPM worker or prefork? Could you try again with prefork, in case
> >> it is worker?
> >>
> >> Do you get error message on the tomcat side?
> >>
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (1566): (PlatformServer) Tomcat is down or refused connection. No
> >>> response has been sent to the client (yet)
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1877): (PlatformServer) receiving from tomcat failed, recoverable
> >>> operation attempt=0
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1916): (PlatformServer) sending request to tomcat failed,
> >>> recoverable operation attempt=1
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1842): (PlatformServer) request failed, because of server error
> >>> without recovery in send loop attempt=0
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142):
> >>> Service error=-5 for worker=PlatformServer
> >>
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401):
> >>> Write without start, starting with defaults
> >>
> >> Ths one is very unusual, and it is the log statement, that is the
> >> only one, that could have produced Henri's core dump, although we
> >> still do not know why. It correlates to a protocol error.
> >>
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (970): wrong message format 0x031a from 127.0.0.1:8010
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (1592): (PlatformServer) Tomcat is down or network problems. Part of
> >>> the response has already been sent to the cli
> >>> ent
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1877): (PlatformServer) receiving from tomcat failed, recoverable
> >>> operation attempt=1
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (1504): Unknown AJP protocol code: D8
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1842): (PlatformServer) request failed, because of server error
> >>> without recovery in send loop attempt=0
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142):
> >>> Service error=-5 for worker=PlatformServer
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1916): (PlatformServer) sending request to tomcat failed,
> >>> recoverable operation attempt=2
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (915): sendfull returned -14 with errno=14
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (948): (PlatformServer) can't receive the response message from
> >>> tomcat, network problems or tomcat is down (127.0.
> >>> 0.1:8010), err=-9
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (1566): (PlatformServer) Tomcat is down or refused connection. No
> >>> response has been sent to the client (yet)
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1877): (PlatformServer) receiving from tomcat failed, recoverable
> >>> operation attempt=0
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (1292): (PlatformServer) failed resending request body (812760852)
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1916): (PlatformServer) sending request to tomcat failed,
> >>> recoverable operation attempt=3
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1916): (PlatformServer) sending request to tomcat failed,
> >>> recoverable operation attempt=1
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (915): sendfull returned -9 with errno=9
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c
> >>> (1201): (PlatformServer) error sending request. Will try another
> >>> pooled connection
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (915): sendfull returned -14 with errno=14
> >>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c
> >>> (948): (PlatformServer) can't receive the response message from
> >>> tomcat, network problems or tomcat is down (127.0.
> >>>
> >>> And these in the apache error_log:
> >>>
> >>> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache
> >>> initialised
> >>> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual)
> >>> servers for SSL
> >>> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface:
> >>> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
> >>> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix)
> >>> mod_jk/1.2.20 mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming
> >>> normal operations
> >>> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
> >>> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex:
> >>> sysvsem (default: sysvsem)
> >>> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal
> >>> Segmentation fault (11)
> >>> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener
> >>> thread didn't exit
> >>> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener
> >>> thread didn't exit
> >>> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over
> >>> scoreboard slot from 1552520 (quiescing)
> >>> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal
> >>> Segmentation fault (11)
> >>> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal
> >>> Segmentation fault (11)
> >>> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal
> >>> Segmentation fault (11)
> >>> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal
> >>> Segmentation fault (11)
> >>> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal
> >>> Segmentation fault (11)
> >>>
> >>>
> >>> Again,  I thought I would be able to track them down by changing the
> >>> log level to "trace", but the problem disappears at that log level.
> >>> I had assumed it was slowing the connection rate down enough to
> >>> mitigate the problem.  Not sure if any of that helps.. if you can
> >>> help me or want more info let me know.
> >>
> >> With JkLogLevel debug (which logs more than info but less than trace)
> >> we would see, which packets you receive from tomcat. Maybe debug
> >> still shows the problem?
> >>
> >> Could you please post your config (Jk* from apache, workers.roperties
> >> and Connector from tomcat).
> >>
> >>>
> >>> Thanks!
> >>>
> >>> Eric
> >>
> >> Regards,
> >>
> >> Rainer
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> >> For additional commands, e-mail: dev-help@tomcat.apache.org
> >>
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> > For additional commands, e-mail: dev-help@tomcat.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Eric Wertman <we...@comcast.net>.
I did rebuild the sources as prefork,  it seems to have stopped the 
problem.  I did also make those 2 corrections you pointed out earlier in 
the thread.

Eric Wertman wrote:
> Connector:
>
> <Connector port="8010" protocol="AJP/1.3" enableLookups="false" 
> maxThreads="800" minThreads="100" bufferSize="8192" backlog="256" />
>
> I've tried all sorts of permutations of this... None of the values 
> make much difference.   My ab test is -n 1000 -c 200.
>
> ---------------------------------------
>
> workers.properties:
>
> # PlatformServer
> worker.PlatformServer.type=ajp13
> worker.PlatformServer.host=localhost
> worker.PlatformServer.port=8010
> worker.PlatformServer.connection_pool_timeout=60
> worker.PlatformServer.socket_timeout=60
> worker.PlatformServer.retries=5
>
> I've been through a lot of different settings here too.  I've tried 
> some almost insane values for maxThreads
> ---------------------------------------
>
> httpd.conf (Using virtual servers, forgot to mention that before)
>
> <IfModule jk_module>
> JkWorkersFile conf/workers.properties
> JkShmFile     logs/mod_jk.shm
> JkShmSize     8192
> JkLogFile     logs/mod_jk.log
> JkLogLevel    debug
> </IfModule>
>
> ---------------------------------------
>
> mod_jk startup at debug level.  I only included it because it shows 
> what mod_jk thinks of my connector.
>
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (239): 
> creating worker PlatformServer
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (144): 
> about to create instance PlatformServer of ajp13
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (157): 
> about to validate and init PlatformServer
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (1971): worker PlatformServer contact is 'localhost:8010'
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2098): setting endpoint options:
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2101): keepalive:        0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2105): timeout:          60
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2109): buffer size:      0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2113): pool timeout:     60
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2117): connect timeout:  0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2121): reply timeout:    0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2125): prepost timeout:  0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2129): recovery options: 0
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2133): retries:          5
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2137): max packet size:  8192
> [Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
> (2008): setting connection pool size to 25 with min 13
>
> ---------------------------------------
>
> Can't reproduce at debug level.  This is info (got it on the first try).
>
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1504): Unknown AJP protocol code: 41
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1842): (PlatformServer) request failed, because of server error 
> without recovery in send loop attempt=0
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
> (970): wrong message format 0x3837 from 127.0.0.1:8010
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1566): (PlatformServer) Tomcat is down or refused connection. No 
> response has been sent to the client (yet)
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=0
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  
> recoverable operation attempt=1
> [Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  mod_jk.c (2142): 
> Service error=-5 for worker=PlatformServer
> [Wed Jan 31 18:31:11 2007] [2539586:0000] [info]  mod_jk.c (401): 
> Write without start, starting with defaults
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
> (970): wrong message format 0x0400 from 127.0.0.1:8010
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1566): (PlatformServer) Tomcat is down or refused connection. No 
> response has been sent to the client (yet)
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
> (1504): Unknown AJP protocol code: 02
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=0
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  
> recoverable operation attempt=1
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
> (1842): (PlatformServer) request failed, because of server error 
> without recovery in send loop attempt=0
> [Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  mod_jk.c (2142): 
> Service error=-5 for worker=PlatformServer
>
> ---------------------------------------
>
> Here's the Tomcat log:
>
> Jan 31, 2007 6:31:10 PM org.apache.jk.common.HandlerRequest invoke
> INFO: Unknown message 0
> Jan 31, 2007 6:31:12 PM org.apache.jk.common.HandlerRequest invoke
> INFO: Unknown message 0
>
> ---------------------------------------
>
>
> And here's the only thing I know about dbx:
>
> dbx ../bin/httpd core
> Type 'help' for help.
> warning: The core file is not a fullcore. Some info may not be available.
> [using memory image in core]
> reading symbolic information ...
>
>
> Segmentation fault in sig_coredump at 0x10037590
> 0x10037590 (sig_coredump+0x84) 80410014        lwz   r2,0x14(r1)
> (dbx) where
> sig_coredump() at 0x10037590
> malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d, 
> 0x7858c51, 0x17, 0x0) at 0xd03002c4
> malloc_common.malloc_common_53_36(??) at 0xd02fd8b8
> jk_pool_dyn_alloc() at 0xdc05a4c0
> jk_pool_alloc() at 0xdc05a5b4
> jk_b_set_buffer_size() at 0xdc06c0fc
> ajp_service() at 0xdc064c28
> jk_handler() at 0xdc050e60
> ap_run_handler() at 0x10003a24
> ap_invoke_handler() at 0x10004810
> ap_process_request() at 0x1002fd88
> ap_process_http_connection() at 0x1004d0dc
> ap_run_process_connection() at 0x10013dc0
> ap_process_connection() at 0x100144f0
> process_socket() at 0x1000b03c
> worker_thread() at 0x1000a8e0
> dummy_worker() at 0xdbdb8384
>
> ---------------------------------------
>
> I'll have a look and see if I can't get more information out of 
> tomcat.. there must be a way to increase the logging there.  I am 
> using mpm=worker.  If you still think it's worth the time I will 
> re-compile with prefork and take another shot.
>
> Thanks again for your help.
>
> Eric
>
> Rainer Jung wrote:
>> Hi Eric,
>>
>> maybe your observation is in fact related to Henri's, since you also 
>> get core dumps (segmentation faults), which is quite unusual. See 
>> further comments inline.
>>
>> Eric Wertman wrote:
>>> Sorry to jump in, I'm new here.  I started watching this list 
>>> because of a problem I'm having with the mod_jk 1.2.20 as well.  I'm 
>>> not getting core files, but I do have problems that I can't 
>>> reproduce at log level trace or debug.
>>
>> Apache logs segmentation faults. You might be able to produce a core 
>> (which would be very helpful) by setting the coredumpdirectory in 
>> apache and maybe tweaking your os config. One point that might help 
>> to produce cores, would be to not run apache as root.
>>
>>> I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and 
>>> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM 
>>> 1.5 JRE.
>>>
>>> I compiled them all using the IBM  cc_r compiler.  I've actually 
>>> tried a number of different flags trying to resolve my problem, 
>>> assuming initially that I was doing something incorrectly.  The 
>>> behavior persists, though, and my compilations have all been clean.  
>>> Apache without mod_jk doesn't give me any problems.
>>>
>>> Superficially it seems to work fine.  Once I run tests with ab, it 
>>> starts to get a little ugly.  I get a number of failed requests, and 
>>> these types of errors:
>>>
>>>
>>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1504): Unknown AJP protocol code: 41
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (970): wrong message format 0x3837 from 127.0.0.1:8010
>>
>> hose mean, taht something is fundamentally wrong in the answer 
>> mopd_jk receives from tomcat. It doesn't really look like AJP/1.3. Is 
>> your MPM worker or prefork? Could you try again with prefork, in case 
>> it is worker?
>>
>> Do you get error message on the tomcat side?
>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>>> response has been sent to the client (yet)
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>>> operation attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=1
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1842): (PlatformServer) request failed, because of server error 
>>> without recovery in send loop attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>>> Service error=-5 for worker=PlatformServer
>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401): 
>>> Write without start, starting with defaults
>>
>> Ths one is very unusual, and it is the log statement, that is the 
>> only one, that could have produced Henri's core dump, although we 
>> still do not know why. It correlates to a protocol error.
>>
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (970): wrong message format 0x031a from 127.0.0.1:8010
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1592): (PlatformServer) Tomcat is down or network problems. Part of 
>>> the response has already been sent to the cli
>>> ent
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>>> operation attempt=1
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1504): Unknown AJP protocol code: D8
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1842): (PlatformServer) request failed, because of server error 
>>> without recovery in send loop attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>>> Service error=-5 for worker=PlatformServer
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=2
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (915): sendfull returned -14 with errno=14
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (948): (PlatformServer) can't receive the response message from 
>>> tomcat, network problems or tomcat is down (127.0.
>>> 0.1:8010), err=-9
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>>> response has been sent to the client (yet)
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>>> operation attempt=0
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (1292): (PlatformServer) failed resending request body (812760852)
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=3
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1916): (PlatformServer) sending request to tomcat failed,  
>>> recoverable operation attempt=1
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (915): sendfull returned -9 with errno=9
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>>> (1201): (PlatformServer) error sending request. Will try another 
>>> pooled connection
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (915): sendfull returned -14 with errno=14
>>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>>> (948): (PlatformServer) can't receive the response message from 
>>> tomcat, network problems or tomcat is down (127.0.
>>>
>>> And these in the apache error_log:
>>>
>>> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache 
>>> initialised
>>> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) 
>>> servers for SSL
>>> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: 
>>> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
>>> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) 
>>> mod_jk/1.2.20 mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming 
>>> normal operations
>>> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
>>> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: 
>>> sysvsem (default: sysvsem)
>>> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener 
>>> thread didn't exit
>>> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener 
>>> thread didn't exit
>>> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over 
>>> scoreboard slot from 1552520 (quiescing)
>>> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal 
>>> Segmentation fault (11)
>>> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal 
>>> Segmentation fault (11)
>>>
>>>
>>> Again,  I thought I would be able to track them down by changing the 
>>> log level to "trace", but the problem disappears at that log level.  
>>> I had assumed it was slowing the connection rate down enough to 
>>> mitigate the problem.  Not sure if any of that helps.. if you can 
>>> help me or want more info let me know.
>>
>> With JkLogLevel debug (which logs more than info but less than trace) 
>> we would see, which packets you receive from tomcat. Maybe debug 
>> still shows the problem?
>>
>> Could you please post your config (Jk* from apache, workers.roperties 
>> and Connector from tomcat).
>>
>>>
>>> Thanks!
>>>
>>> Eric
>>
>> Regards,
>>
>> Rainer
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: dev-help@tomcat.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Rainer Jung <ra...@kippdata.de>.
Only depending on the number of workers (including load balancing sub 
workers). The unit is Kilobytes and the default is calculated for 64 
workers, although, it might be slightly less at the moment. If you 
assume 1KB per worker, you should be on the safe side.

Regards,

Rainer

Eric Wertman wrote:
> 
> 
> Mladen Turk wrote:
>> Eric Wertman wrote:
>>> Connector:
>>>
>>>
>>> <IfModule jk_module>
>>> JkWorkersFile conf/workers.properties
>>> JkShmFile     logs/mod_jk.shm
>>> JkShmSize     8192
>>
>> This is 8MB of shared memory.
>> Are you sure you have 10000 workers?
>>
>> It wasn't obvious to me what a reasonable value of this was.. I tried 
>> everything from the default to commenting it out.  At one point I set 
>> it to 1048576.  It didn't seem to make much difference in any case.. 
>> what's the reccomendation for that setting?  Is it purely based on 
>> number of workers, or does the size and type of data also count?
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Eric Wertman <we...@comcast.net>.

Mladen Turk wrote:
> Eric Wertman wrote:
>> Connector:
>>
>>
>> <IfModule jk_module>
>> JkWorkersFile conf/workers.properties
>> JkShmFile     logs/mod_jk.shm
>> JkShmSize     8192
>
> This is 8MB of shared memory.
> Are you sure you have 10000 workers?
>
> It wasn't obvious to me what a reasonable value of this was.. I tried 
> everything from the default to commenting it out.  At one point I set 
> it to 1048576.  It didn't seem to make much difference in any case.. 
> what's the reccomendation for that setting?  Is it purely based on 
> number of workers, or does the size and type of data also count?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Henri Gomez <he...@gmail.com>.
2007/2/1, Mladen Turk <mt...@apache.org>:
> Eric Wertman wrote:
> > Connector:
> >
> >
> > <IfModule jk_module>
> > JkWorkersFile conf/workers.properties
> > JkShmFile     logs/mod_jk.shm
> > JkShmSize     8192
>
> This is 8MB of shared memory.
> Are you sure you have 10000 workers?

Yep, I'll we reduce it :)

1K/worker, 250 worker => 512k

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Mladen Turk <mt...@apache.org>.
Eric Wertman wrote:
> Connector:
> 
> 
> <IfModule jk_module>
> JkWorkersFile conf/workers.properties
> JkShmFile     logs/mod_jk.shm
> JkShmSize     8192

This is 8MB of shared memory.
Are you sure you have 10000 workers?

> JkLogFile     logs/mod_jk.log
> JkLogLevel    debug
> </IfModule>
> 
> 
> Segmentation fault in sig_coredump at 0x10037590
> 0x10037590 (sig_coredump+0x84) 80410014        lwz   r2,0x14(r1)
> (dbx) where
> sig_coredump() at 0x10037590
> malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d, 
> 0x7858c51, 0x17, 0x0) at 0xd03002c4
> malloc_common.malloc_common_53_36(??) at 0xd02fd8b8
> jk_pool_dyn_alloc() at 0xdc05a4c0
> jk_pool_alloc() at 0xdc05a5b4
> jk_b_set_buffer_size() at 0xdc06c0fc

This looks like you run out of memory.
We would probably need to add some checking
code to the allocated buffers (if they are really
allocated or not).

Regards,
Mladen

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Eric Wertman <we...@comcast.net>.
Connector:

<Connector port="8010" protocol="AJP/1.3" enableLookups="false" 
maxThreads="800" minThreads="100" bufferSize="8192" backlog="256" />

I've tried all sorts of permutations of this... None of the values make 
much difference.   My ab test is -n 1000 -c 200.

---------------------------------------

workers.properties:

# PlatformServer
worker.PlatformServer.type=ajp13
worker.PlatformServer.host=localhost
worker.PlatformServer.port=8010
worker.PlatformServer.connection_pool_timeout=60
worker.PlatformServer.socket_timeout=60
worker.PlatformServer.retries=5

I've been through a lot of different settings here too.  I've tried some 
almost insane values for maxThreads
---------------------------------------

httpd.conf (Using virtual servers, forgot to mention that before)

<IfModule jk_module>
JkWorkersFile conf/workers.properties
JkShmFile     logs/mod_jk.shm
JkShmSize     8192
JkLogFile     logs/mod_jk.log
JkLogLevel    debug
</IfModule>

---------------------------------------

mod_jk startup at debug level.  I only included it because it shows what 
mod_jk thinks of my connector.

[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (239): 
creating worker PlatformServer
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (144): 
about to create instance PlatformServer of ajp13
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (157): 
about to validate and init PlatformServer
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(1971): worker PlatformServer contact is 'localhost:8010'
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2098): setting endpoint options:
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2101): keepalive:        0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2105): timeout:          60
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2109): buffer size:      0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2113): pool timeout:     60
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2117): connect timeout:  0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2121): reply timeout:    0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2125): prepost timeout:  0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2129): recovery options: 0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2133): retries:          5
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2137): max packet size:  8192
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2008): setting connection pool size to 25 with min 13

---------------------------------------

Can't reproduce at debug level.  This is info (got it on the first try).

[Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
(1504): Unknown AJP protocol code: 41
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1842): (PlatformServer) request failed, because of server error without 
recovery in send loop attempt=0
[Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c (970): 
wrong message format 0x3837 from 127.0.0.1:8010
[Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
(1566): (PlatformServer) Tomcat is down or refused connection. No 
response has been sent to the client (yet)
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=0
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=1
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  mod_jk.c (2142): 
Service error=-5 for worker=PlatformServer
[Wed Jan 31 18:31:11 2007] [2539586:0000] [info]  mod_jk.c (401): Write 
without start, starting with defaults
[Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c (970): 
wrong message format 0x0400 from 127.0.0.1:8010
[Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
(1566): (PlatformServer) Tomcat is down or refused connection. No 
response has been sent to the client (yet)
[Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
(1504): Unknown AJP protocol code: 02
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=0
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=1
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1842): (PlatformServer) request failed, because of server error without 
recovery in send loop attempt=0
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  mod_jk.c (2142): 
Service error=-5 for worker=PlatformServer

---------------------------------------

Here's the Tomcat log:

Jan 31, 2007 6:31:10 PM org.apache.jk.common.HandlerRequest invoke
INFO: Unknown message 0
Jan 31, 2007 6:31:12 PM org.apache.jk.common.HandlerRequest invoke
INFO: Unknown message 0

---------------------------------------


And here's the only thing I know about dbx:

dbx ../bin/httpd core
Type 'help' for help.
warning: The core file is not a fullcore. Some info may not be available.
[using memory image in core]
reading symbolic information ...


Segmentation fault in sig_coredump at 0x10037590
0x10037590 (sig_coredump+0x84) 80410014        lwz   r2,0x14(r1)
(dbx) where
sig_coredump() at 0x10037590
malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d, 
0x7858c51, 0x17, 0x0) at 0xd03002c4
malloc_common.malloc_common_53_36(??) at 0xd02fd8b8
jk_pool_dyn_alloc() at 0xdc05a4c0
jk_pool_alloc() at 0xdc05a5b4
jk_b_set_buffer_size() at 0xdc06c0fc
ajp_service() at 0xdc064c28
jk_handler() at 0xdc050e60
ap_run_handler() at 0x10003a24
ap_invoke_handler() at 0x10004810
ap_process_request() at 0x1002fd88
ap_process_http_connection() at 0x1004d0dc
ap_run_process_connection() at 0x10013dc0
ap_process_connection() at 0x100144f0
process_socket() at 0x1000b03c
worker_thread() at 0x1000a8e0
dummy_worker() at 0xdbdb8384

---------------------------------------

I'll have a look and see if I can't get more information out of tomcat.. 
there must be a way to increase the logging there.  I am using 
mpm=worker.  If you still think it's worth the time I will re-compile 
with prefork and take another shot.

Thanks again for your help.

Eric

Rainer Jung wrote:
> Hi Eric,
>
> maybe your observation is in fact related to Henri's, since you also 
> get core dumps (segmentation faults), which is quite unusual. See 
> further comments inline.
>
> Eric Wertman wrote:
>> Sorry to jump in, I'm new here.  I started watching this list because 
>> of a problem I'm having with the mod_jk 1.2.20 as well.  I'm not 
>> getting core files, but I do have problems that I can't reproduce at 
>> log level trace or debug.
>
> Apache logs segmentation faults. You might be able to produce a core 
> (which would be very helpful) by setting the coredumpdirectory in 
> apache and maybe tweaking your os config. One point that might help to 
> produce cores, would be to not run apache as root.
>
>> I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and 
>> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM 
>> 1.5 JRE.
>>
>> I compiled them all using the IBM  cc_r compiler.  I've actually 
>> tried a number of different flags trying to resolve my problem, 
>> assuming initially that I was doing something incorrectly.  The 
>> behavior persists, though, and my compilations have all been clean.  
>> Apache without mod_jk doesn't give me any problems.
>>
>> Superficially it seems to work fine.  Once I run tests with ab, it 
>> starts to get a little ugly.  I get a number of failed requests, and 
>> these types of errors:
>>
>>
>>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1504): Unknown AJP protocol code: 41
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (970): wrong message format 0x3837 from 127.0.0.1:8010
>
> hose mean, taht something is fundamentally wrong in the answer mopd_jk 
> receives from tomcat. It doesn't really look like AJP/1.3. Is your MPM 
> worker or prefork? Could you try again with prefork, in case it is 
> worker?
>
> Do you get error message on the tomcat side?
>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>> response has been sent to the client (yet)
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>> operation attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=1
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1842): (PlatformServer) request failed, because of server error 
>> without recovery in send loop attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>> Service error=-5 for worker=PlatformServer
>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401): 
>> Write without start, starting with defaults
>
> Ths one is very unusual, and it is the log statement, that is the only 
> one, that could have produced Henri's core dump, although we still do 
> not know why. It correlates to a protocol error.
>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (970): wrong message format 0x031a from 127.0.0.1:8010
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1592): (PlatformServer) Tomcat is down or network problems. Part of 
>> the response has already been sent to the cli
>> ent
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>> operation attempt=1
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1504): Unknown AJP protocol code: D8
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1842): (PlatformServer) request failed, because of server error 
>> without recovery in send loop attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>> Service error=-5 for worker=PlatformServer
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=2
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (915): sendfull returned -14 with errno=14
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (948): (PlatformServer) can't receive the response message from 
>> tomcat, network problems or tomcat is down (127.0.
>> 0.1:8010), err=-9
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>> response has been sent to the client (yet)
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>> operation attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1292): (PlatformServer) failed resending request body (812760852)
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=3
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=1
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (915): sendfull returned -9 with errno=9
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1201): (PlatformServer) error sending request. Will try another 
>> pooled connection
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (915): sendfull returned -14 with errno=14
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (948): (PlatformServer) can't receive the response message from 
>> tomcat, network problems or tomcat is down (127.0.
>>
>> And these in the apache error_log:
>>
>> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache 
>> initialised
>> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) 
>> servers for SSL
>> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: 
>> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
>> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) mod_jk/1.2.20 
>> mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming normal operations
>> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
>> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: 
>> sysvsem (default: sysvsem)
>> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener 
>> thread didn't exit
>> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener 
>> thread didn't exit
>> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over 
>> scoreboard slot from 1552520 (quiescing)
>> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal 
>> Segmentation fault (11)
>>
>>
>> Again,  I thought I would be able to track them down by changing the 
>> log level to "trace", but the problem disappears at that log level.  
>> I had assumed it was slowing the connection rate down enough to 
>> mitigate the problem.  Not sure if any of that helps.. if you can 
>> help me or want more info let me know.
>
> With JkLogLevel debug (which logs more than info but less than trace) 
> we would see, which packets you receive from tomcat. Maybe debug still 
> shows the problem?
>
> Could you please post your config (Jk* from apache, workers.roperties 
> and Connector from tomcat).
>
>>
>> Thanks!
>>
>> Eric
>
> Regards,
>
> Rainer
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Rainer Jung <ra...@kippdata.de>.
Hi Eric,

maybe your observation is in fact related to Henri's, since you also get 
core dumps (segmentation faults), which is quite unusual. See further 
comments inline.

Eric Wertman wrote:
> Sorry to jump in, I'm new here.  I started watching this list because of 
> a problem I'm having with the mod_jk 1.2.20 as well.  I'm not getting 
> core files, but I do have problems that I can't reproduce at log level 
> trace or debug.

Apache logs segmentation faults. You might be able to produce a core 
(which would be very helpful) by setting the coredumpdirectory in apache 
and maybe tweaking your os config. One point that might help to produce 
cores, would be to not run apache as root.

> I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and 
> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM 1.5 
> JRE.
> 
> I compiled them all using the IBM  cc_r compiler.  I've actually tried a 
> number of different flags trying to resolve my problem, assuming 
> initially that I was doing something incorrectly.  The behavior 
> persists, though, and my compilations have all been clean.  Apache 
> without mod_jk doesn't give me any problems.
> 
> Superficially it seems to work fine.  Once I run tests with ab, it 
> starts to get a little ugly.  I get a number of failed requests, and 
> these types of errors:
> 
> 
> 
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
> (1504): Unknown AJP protocol code: 41
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (970): 
> wrong message format 0x3837 from 127.0.0.1:8010

hose mean, taht something is fundamentally wrong in the answer mopd_jk 
receives from tomcat. It doesn't really look like AJP/1.3. Is your MPM 
worker or prefork? Could you try again with prefork, in case it is worker?

Do you get error message on the tomcat side?

> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
> (1566): (PlatformServer) Tomcat is down or refused connection. No 
> response has been sent to the client (yet)
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=0
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  recoverable 
> operation attempt=1
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1842): (PlatformServer) request failed, because of server error without 
> recovery in send loop attempt=0
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
> Service error=-5 for worker=PlatformServer

> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401): Write 
> without start, starting with defaults

Ths one is very unusual, and it is the log statement, that is the only 
one, that could have produced Henri's core dump, although we still do 
not know why. It correlates to a protocol error.

> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (970): 
> wrong message format 0x031a from 127.0.0.1:8010
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
> (1592): (PlatformServer) Tomcat is down or network problems. Part of the 
> response has already been sent to the cli
> ent
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=1
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
> (1504): Unknown AJP protocol code: D8
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1842): (PlatformServer) request failed, because of server error without 
> recovery in send loop attempt=0
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
> Service error=-5 for worker=PlatformServer
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  recoverable 
> operation attempt=2
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (915): 
> sendfull returned -14 with errno=14
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (948): 
> (PlatformServer) can't receive the response message from tomcat, network 
> problems or tomcat is down (127.0.
> 0.1:8010), err=-9
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
> (1566): (PlatformServer) Tomcat is down or refused connection. No 
> response has been sent to the client (yet)
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
> operation attempt=0
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
> (1292): (PlatformServer) failed resending request body (812760852)
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  recoverable 
> operation attempt=3
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1916): (PlatformServer) sending request to tomcat failed,  recoverable 
> operation attempt=1
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (915): 
> sendfull returned -9 with errno=9
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
> (1201): (PlatformServer) error sending request. Will try another pooled 
> connection
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (915): 
> sendfull returned -14 with errno=14
> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (948): 
> (PlatformServer) can't receive the response message from tomcat, network 
> problems or tomcat is down (127.0.
> 
> And these in the apache error_log:
> 
> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache initialised
> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) servers 
> for SSL
> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: 
> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) mod_jk/1.2.20 
> mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming normal operations
> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: sysvsem 
> (default: sysvsem)
> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal 
> Segmentation fault (11)
> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener thread 
> didn't exit
> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener thread 
> didn't exit
> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over 
> scoreboard slot from 1552520 (quiescing)
> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal 
> Segmentation fault (11)
> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal 
> Segmentation fault (11)
> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal 
> Segmentation fault (11)
> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal 
> Segmentation fault (11)
> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal 
> Segmentation fault (11)
> 
> 
> Again,  I thought I would be able to track them down by changing the log 
> level to "trace", but the problem disappears at that log level.  I had 
> assumed it was slowing the connection rate down enough to mitigate the 
> problem.  Not sure if any of that helps.. if you can help me or want 
> more info let me know.

With JkLogLevel debug (which logs more than info but less than trace) we 
would see, which packets you receive from tomcat. Maybe debug still 
shows the problem?

Could you please post your config (Jk* from apache, workers.roperties 
and Connector from tomcat).

> 
> Thanks!
> 
> Eric

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Eric Wertman <we...@comcast.net>.
Sorry to jump in, I'm new here.  I started watching this list because of 
a problem I'm having with the mod_jk 1.2.20 as well.  I'm not getting 
core files, but I do have problems that I can't reproduce at log level 
trace or debug.

I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and 
apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM 1.5 
JRE.

I compiled them all using the IBM  cc_r compiler.  I've actually tried a 
number of different flags trying to resolve my problem, assuming 
initially that I was doing something incorrectly.  The behavior 
persists, though, and my compilations have all been clean.  Apache 
without mod_jk doesn't give me any problems.

Superficially it seems to work fine.  Once I run tests with ab, it 
starts to get a little ugly.  I get a number of failed requests, and 
these types of errors:



[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
(1504): Unknown AJP protocol code: 41
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (970): 
wrong message format 0x3837 from 127.0.0.1:8010
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
(1566): (PlatformServer) Tomcat is down or refused connection. No 
response has been sent to the client (yet)
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=0
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=1
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1842): (PlatformServer) request failed, because of server error without 
recovery in send loop attempt=0
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
Service error=-5 for worker=PlatformServer
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401): Write 
without start, starting with defaults
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (970): 
wrong message format 0x031a from 127.0.0.1:8010
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
(1592): (PlatformServer) Tomcat is down or network problems. Part of the 
response has already been sent to the cli
ent
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=1
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
(1504): Unknown AJP protocol code: D8
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1842): (PlatformServer) request failed, because of server error without 
recovery in send loop attempt=0
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
Service error=-5 for worker=PlatformServer
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=2
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (915): 
sendfull returned -14 with errno=14
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (948): 
(PlatformServer) can't receive the response message from tomcat, network 
problems or tomcat is down (127.0.
0.1:8010), err=-9
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
(1566): (PlatformServer) Tomcat is down or refused connection. No 
response has been sent to the client (yet)
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=0
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
(1292): (PlatformServer) failed resending request body (812760852)
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=3
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=1
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (915): 
sendfull returned -9 with errno=9
[Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
(1201): (PlatformServer) error sending request. Will try another pooled 
connection
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (915): 
sendfull returned -14 with errno=14
[Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c (948): 
(PlatformServer) can't receive the response message from tomcat, network 
problems or tomcat is down (127.0.

And these in the apache error_log:

[Tue Jan 30 03:02:28 2007] [info] Shared memory session cache initialised
[Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) servers 
for SSL
[Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: 
mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
[Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) mod_jk/1.2.20 
mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming normal operations
[Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
[Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: sysvsem 
(default: sysvsem)
[Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal 
Segmentation fault (11)
[Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener thread 
didn't exit
[Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener thread 
didn't exit
[Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over 
scoreboard slot from 1552520 (quiescing)
[Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal 
Segmentation fault (11)
[Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal 
Segmentation fault (11)
[Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal 
Segmentation fault (11)
[Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal 
Segmentation fault (11)
[Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal 
Segmentation fault (11)


Again,  I thought I would be able to track them down by changing the log 
level to "trace", but the problem disappears at that log level.  I had 
assumed it was slowing the connection rate down enough to mitigate the 
problem.  Not sure if any of that helps.. if you can help me or want 
more info let me know.

Thanks!

Eric

>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Rainer Jung <ra...@kippdata.de>.
Hi Henri,

there have been two bug fixes concerning string formatting, which have 
core dump potential. Both only occur with log level info or above. I 
think they are *not* the reason (see below). The code is in 
common/jk_ajp_common.c:

1) Wrong order of arguments; should only be relevant, if new feature 
"fail_on_status" is being used.

@@ -1850,10 +1864,10 @@
             else if (err == JK_STATUS_ERROR) {
                 jk_log(l, JK_LOG_INFO,
                         "(%s) request failed, "
                         "because of response status %d, "
                         "recoverable operation attempt=%d",
-                       p->worker->http_status_fail,
-                       p->worker->name, i);
+                       p->worker->name,
+                       p->worker->http_status_fail, i);

2) Missing "s" in "%s". Should not produce a core dump.

@@ -1108,7 +1122,7 @@

      if ((len = ajp_read_fully_from_server(r, l, read_buf, len)) < 0) {
          jk_log(l, JK_LOG_INFO,
-               "(%) receiving data from client failed. "
+               "(%s) receiving data from client failed. "
                 "Connection aborted or network problems",
                 ae->worker->name);
          JK_TRACE_EXIT(l);

We don't have any other core dump reports for 1.2.20 at the moment. From 
your core analysis I expect another reason. Between 1.2.19 and 1.2.20 
there was the big virtual host cleanup. That included per virtual host 
loggers. The method ws_write has not been changed, but some of the 
config parsing, overloading and initialization.

Could you check, if there are any startup errors in your apache or 
mod_jk logs? Use JkLogLevel debug and LogLevel at least info. Please try 
first with a basic config with no vhosts. Does it crash during startup, 
or when running requests?

Some changes:

1) If no log file is configured, it tries to use logs/mod_jk.log.
2) Log definitions get inherited from the global server to virtual 
servers, but are overwritten by explicit virtual server configs.
3) The mod_jk time stamp formats and its own request logging are also 
per virtual server

Do we have a chace to find out the line numbers in ws_write, where we 
called the jk_log? There are tw possibilities:

    399              if (!p->response_started) {
    400                  if (main_log)
    401                      jk_log(main_log, JK_LOG_INFO,
    402                             "Write without start, starting with 
defaults");
    403                  if (!s->start_response(s, 200, NULL, NULL, 
NULL, 0)) {
    404                      return JK_FALSE;
    405                  }
    406              }


and one debug message:

    435                  if (JK_IS_DEBUG_LEVEL(main_log))
    436                      jk_log(main_log, JK_LOG_DEBUG,
    437                             "written %d out of %d", r, ll);
    438

Does the crash go away, if JkLogLevel is info?

I'm afraid we have to debug this iteratively. I had no problems on *nix 
platforms and up to now also no user reports.

Henri Gomez wrote:
> Thanks.
> 
> Did the 1.2.21 is expected soon or should I use the trunk ?

I think 1.2.21 is still a couple of weeks ago, but I expect trunk to 
produce the same problem, because I don't see anything we changed, that 
might have fixed it.

Regards,

Rainer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Henri Gomez <he...@gmail.com>.
Thanks.

Did the 1.2.21 is expected soon or should I use the trunk ?

2007/1/31, Mladen Turk <mt...@apache.org>:
> Henri Gomez wrote:
> > Hi to all,
> >
> > I built and installed the jk 1.2.20 on iSeries i5/OS V5R3 Apache 2
> > server to replace the previous 1.2.19.
> >
>
> Think there was some problem with wrong printf format.
> Rainer fixed that post 1.2.20, but he should know better.
>
> Regards,
> Mladen.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: FYI: jk 1.2.20 core on iSeries v5R3

Posted by Mladen Turk <mt...@apache.org>.
Henri Gomez wrote:
> Hi to all,
> 
> I built and installed the jk 1.2.20 on iSeries i5/OS V5R3 Apache 2
> server to replace the previous 1.2.19.
>

Think there was some problem with wrong printf format.
Rainer fixed that post 1.2.20, but he should know better.

Regards,
Mladen.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org