You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by bu...@apache.org on 2013/12/18 22:56:07 UTC

[Bug 55907] New: Unterminated loop during Poller creation due to cyclic references in native allocator

https://issues.apache.org/bugzilla/show_bug.cgi?id=55907

            Bug ID: 55907
           Summary: Unterminated loop during Poller creation due to cyclic
                    references in native allocator
           Product: Tomcat Native
           Version: 1.1.29
          Hardware: Sun
                OS: Solaris
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Library
          Assignee: dev@tomcat.apache.org
          Reporter: rainer.jung@kippdata.de

The loop happens during poller thread startup. The following shutdown hangs due
to waiting for a native lock that the poller holds. The looping is not
reproducible, it was observed during one of many runs of TestAsyncContextImpl. 

tcnative version used was 1.1.29 plus apr 1.4.8, platform Solaris 10 Sparc, JDK
1.7.0_45 32 Bits.

I add the data to this issue primarily such that we can come back here if the
problem gets observed in the wild.

In the logs the incomplete poller startup results in the message

[junit] 18-Dec-2013 20:50:28.194 SEVERE [http-apr-127.0.0.1-auto-27-Poller]
org.apache.tomcat.util.net.AprEndpoint$Poller.run Poller failed with error [9]
: [Bad file number]

about a second after the loop starts.

The looping thread is:

"http-apr-127.0.0.1-auto-27-Poller" daemon prio=3 tid=0x007c9400 nid=0x134
runnable [0xb407f000]
   java.lang.Thread.State: RUNNABLE
  apr_palloc (38a540, 28, 58a258, 28000, 42c8a0, 42c8a0) + 29c
  impl_pollset_create (44c918, 2000, 58a258, 44c950, 20000, 44c918) + b8
  apr_pollset_create_ex (b407f78c, 2000, 58a258, 2, 3, b51d4ec4) + c8
  Java_org_apache_tomcat_jni_Poll_create (7c9530, b407f78c, 2000, 2, 58a258, 0)
+ 38
        at org.apache.tomcat.jni.Poll.create(Native Method)
        at
org.apache.tomcat.util.net.AprEndpoint.allocatePoller(AprEndpoint.java:804)
        at
org.apache.tomcat.util.net.AprEndpoint$Poller.run(AprEndpoint.java:1969)
        at java.lang.Thread.run(Thread.java:744)


The main thread later blocks during shutdown:


"main" prio=3 tid=0x00029800 nid=0x2 runnable [0xfdf7e000]
   java.lang.Thread.State: RUNNABLE
  lwp_park (0, 0, 0)
  apr_pool_destroy (75c680, fdf7e958, 0, 75c680, fdf7e9d4, 372d08) + c4
        at org.apache.tomcat.jni.Pool.destroy(Native Method)
        at
org.apache.tomcat.util.net.AprEndpoint$Sendfile.destroy(AprEndpoint.java:2098)
        at
org.apache.tomcat.util.net.AprEndpoint.stopInternal(AprEndpoint.java:704)
        at
org.apache.tomcat.util.net.AbstractEndpoint.stop(AbstractEndpoint.java:740)
        at org.apache.coyote.AbstractProtocol.stop(AbstractProtocol.java:515)
        at
org.apache.catalina.connector.Connector.stopInternal(Connector.java:1014)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)
        - locked <0xe69c1760> (a org.apache.catalina.connector.Connector)
        at
org.apache.catalina.core.StandardService.stopInternal(StandardService.java:516)
        - locked <0xe69ca618> (a java.lang.Object)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)
        - locked <0xe69ca570> (a org.apache.catalina.core.StandardService)
        at
org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:766)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)
        - locked <0xe69c9ec8> (a org.apache.catalina.core.StandardServer)
        at org.apache.catalina.startup.Tomcat.stop(Tomcat.java:350)
        at
org.apache.catalina.startup.TomcatBaseTest.tearDown(TomcatBaseTest.java:163)
        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38)
        at
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
        at
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
        at
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)


Debugger says:

#0  allocator_alloc (in_size=163840, allocator=0x38a540) at
memory/unix/apr_pools.c:315
        node = 0x42c8a0
        ref = 0x42c8a0
        max_index = <optimized out>
        i = <optimized out>
        size = 167936
        index = 40
#1  apr_palloc (pool=pool@entry=0x58a258, in_size=in_size@entry=163840) at
memory/unix/apr_pools.c:696
        active = 0x42c8a0
        node = <optimized out>
        mem = <optimized out>
        size = 163840
        free_index = <optimized out>
#2  0xb51bccc4 in impl_pollset_create (pollset=0x44c918, size=8192, p=0x58a258,
flags=<optimized out>) at poll/unix/port.c:199
No locals.
#3  0xb51bc5a0 in apr_pollset_create_ex (ret_pollset=0xb407f78c, size=8192,
p=0x58a258, flags=2, method=APR_POLLSET_PORT) at poll/unix/pollset.c:251
        rv = <optimized out>
        pollset = 0x44c918
        provider = 0xb51d4ec4
#4  0xb53e1330 in Java_org_apache_tomcat_jni_Poll_create (e=0x7c9530,
o=<optimized out>, size=8192, pool=<optimized out>, flags=<optimized out>,
default_timeout=-1000)
    at src/poll.c:126
        R = 8192
        p = 0x58a258
        pollset = <optimized out>
        tps = 0x0
        f = 2


The loop happens in memory/unix/apr_pools.c:315

        while ((node = *ref) != NULL && index > node->index)
            ref = &node->next;

because node->next is the same as node:

(gdb) print node
$4 = (apr_memnode_t *) 0x42c8a0
(gdb) print *node
$3 = {next = 0x42c8a0, ref = 0x42c8a0, index = 32, free_index = 0, first_avail
= 0x44c8b8 "", endp = 0x44d8a0 ""}

The root cause will have happened earlier.

Some additional data:

(gdb) print *allocator
$5 = {max_index = 1, max_free_index = 0, current_free_index = 0, mutex =
0x372d08, owner = 0x372cc8, free = {0x49c8f8, 0x58a240, 0x0 <repeats 18
times>}}

The associated pool hierarchy:

(gdb) print pool
$15 = (apr_pool_t *) 0x58a258

(gdb) print *pool
$6 = {parent = 0x179920, child = 0x0, sibling = 0x0, ref = 0x75c688, cleanups =
0x0, free_cleanups = 0x0, allocator = 0x38a540, subprocesses = 0x0, abort_fn =
0,
  user_data = 0x0, tag = 0x0, active = 0x42c8a0, self = 0x58a240,
self_first_avail = 0x58a298 "", pre_cleanups = 0x0}

(gdb) print *pool->parent
$7 = {parent = 0x790028, child = 0x75c680, sibling = 0x0, ref = 0x180250,
cleanups = 0x0, free_cleanups = 0x0, allocator = 0x38a540, subprocesses = 0x0,
abort_fn = 0,
  user_data = 0x0, tag = 0x0, active = 0x179908, self = 0x179908,
self_first_avail = 0x179960 "", pre_cleanups = 0x0}

(gdb) print *pool->parent->parent
$8 = {parent = 0x409898, child = 0x180248, sibling = 0x0, ref = 0x40989c,
cleanups = 0x0, free_cleanups = 0x0, allocator = 0x38a540, subprocesses = 0x0,
abort_fn = 0,
  user_data = 0x0, tag = 0x0, active = 0x790010, self = 0x790010,
self_first_avail = 0x790068 "", pre_cleanups = 0x0}

(gdb) print *pool->parent->parent->parent
$9 = {parent = 0x372cc8, child = 0x790028, sibling = 0x7eac40, ref = 0x372ccc,
cleanups = 0x40a150, free_cleanups = 0x0, allocator = 0x38a540, subprocesses =
0x0,
  abort_fn = 0, user_data = 0x0, tag = 0x0, active = 0x409880, self = 0x409880,
self_first_avail = 0x4098d8 "/dev/urandom", pre_cleanups = 0x0}

(gdb) print *pool->parent->parent->parent->parent
$10 = {parent = 0x0, child = 0x409898, sibling = 0x0, ref = 0x0, cleanups =
0x372d28, free_cleanups = 0x0, allocator = 0x38a540, subprocesses = 0x0,
abort_fn = 0,
  user_data = 0x0, tag = 0xb51c2808 "apr_global_pool", active = 0x372cb0, self
= 0x372cb0, self_first_avail = 0x372d08 "", pre_cleanups = 0x0}


The blocked main thread tries to acquire the lock of the allocator that the
other thread loops in (and has the lock acquired).

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 55907] Unterminated loop during Poller creation due to cyclic references in native allocator

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55907

Rainer Jung <ra...@kippdata.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |LATER

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org