Posted to dev@directory.apache.org by "Jörg Henne (JIRA)" <ji...@apache.org> on 2006/08/02 18:39:15 UTC

[jira] Commented: (DIRSERVER-586) Reliable hang of DS during query

    [ http://issues.apache.org/jira/browse/DIRSERVER-586?page=comments#action_12425285 ] 
            
Jörg Henne commented on DIRSERVER-586:
--------------------------------------

I just investigated this issue a bit further. It is definitely a problem with the network layer, but I could not yet figure out what is going on.

What I have found:

A hanging client thread looks like this:

Thread [Thread-10] (Suspended)
	waiting for: com.sun.jndi.ldap.LdapRequest  (id=37)
	java.lang.Object.wait(long) line: not available [native method]
	com.sun.jndi.ldap.Connection.readReply(com.sun.jndi.ldap.LdapRequest) line: 418
	com.sun.jndi.ldap.LdapClient.processReply(com.sun.jndi.ldap.LdapRequest, com.sun.jndi.ldap.LdapResult, int) line: 857
	com.sun.jndi.ldap.LdapClient.add(com.sun.jndi.ldap.LdapEntry, javax.naming.ldap.Control[]) line: 1008
	com.sun.jndi.ldap.LdapCtx.c_bind(javax.naming.Name, java.lang.Object, javax.naming.directory.Attributes, com.sun.jndi.toolkit.ctx.Continuation) line: 375
	com.sun.jndi.ldap.LdapCtx(com.sun.jndi.toolkit.ctx.ComponentDirContext).p_bind(javax.naming.Name, java.lang.Object, javax.naming.directory.Attributes, com.sun.jndi.toolkit.ctx.Continuation) line: 277
	com.sun.jndi.ldap.LdapCtx(com.sun.jndi.toolkit.ctx.PartialCompositeDirContext).bind(javax.naming.Name, java.lang.Object, javax.naming.directory.Attributes) line: 197
	com.sun.jndi.ldap.LdapCtx(com.sun.jndi.toolkit.ctx.PartialCompositeDirContext).bind(java.lang.String, java.lang.Object, javax.naming.directory.Attributes) line: 186
	javax.naming.directory.InitialDirContext.bind(java.lang.String, java.lang.Object, javax.naming.directory.Attributes) line: 158
	com.levigo.tcat.test.directory.TestHang.create(java.lang.String) line: 77
	com.levigo.tcat.test.directory.TestHang.createAndDelete() line: 93
	com.levigo.tcat.test.directory.TestHang.access$0(com.levigo.tcat.test.directory.TestHang) line: 83
	com.levigo.tcat.test.directory.TestHang$MyRunner.run() line: 41

There is a corresponding com.sun.jndi.ldap.Connection worker thread that goes along with it - in this case the following one:

Thread [Thread-17] (Suspended)
	owns: java.io.BufferedInputStream  (id=36)
	java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) line: not available [native method]
	java.net.SocketInputStream.read(byte[], int, int) line: 129
	java.io.BufferedInputStream.fill() line: 218
	java.io.BufferedInputStream.read1(byte[], int, int) line: 256
	java.io.BufferedInputStream.read(byte[], int, int) line: 313
	com.sun.jndi.ldap.Connection.run() line: 780 [local variables unavailable]
	java.lang.Thread.run() line: 595

This worker thread looks normal - it is only used for reading, never for writing.

When a hang occurs, it is because a message has gone missing. The message was sent by com.sun.jndi.ldap.LdapClient prior to calling Connection.readReply().
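
To illustrate why a single lost message is enough to park the client thread, here is a minimal sketch of the wait/notify hand-off that the two traces suggest. It is not the com.sun.jndi.ldap code: PendingRequest, ReplyWaitSketch, deliver() and awaitReply() are hypothetical names, and the timeout is my own addition so that the demo terminates.

```java
class ReplyWaitSketch {
    /**
     * Runs one request/reply round trip. If deliverReply is false, nothing
     * ever calls deliver(), which models the lost message.
     */
    static String demo(boolean deliverReply, long timeoutMillis) {
        final PendingRequest request = new PendingRequest();
        if (deliverReply) {
            // Stands in for the Connection worker thread that reads the socket.
            Thread reader = new Thread(() -> request.deliver("reply"));
            reader.start();
        }
        try {
            return request.awaitReply(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("reply delivered: " + demo(true, 2000));
        System.out.println("reply lost:      " + demo(false, 200));
    }
}

/** Hypothetical stand-in for a pending request; not the JNDI class. */
class PendingRequest {
    private String reply; // guarded by "this"

    /** Called by the reader thread when the matching reply arrives. */
    synchronized void deliver(String message) {
        reply = message;
        notifyAll();
    }

    /** Waits up to timeoutMillis for a reply; returns null if none arrived. */
    synchronized String awaitReply(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (reply == null) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null; // without a deadline this loop would wait forever
            }
            wait(remaining);
        }
        return reply;
    }
}
```

In the sketch the waiter gives up after the timeout; a waiter without such a bound stays blocked for as long as the reader thread sees no data, which matches what Thread-10 and Thread-17 show.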

I added some debugging output directly in org.apache.mina.transport.socket.nio.process(Set), so I can see the exact port numbers for which data becomes available.
What I learned is that data never becomes available for the corresponding socket (except for the initial bind call). After some more debugging I found that the channel has, for some reason, gone missing from the SocketIoProcessor's Selector.
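
For anyone who wants to reproduce the check, here is a minimal sketch of how one can ask a java.nio Selector whether it still knows a channel. It uses only the plain JDK API (SelectableChannel.keyFor and SelectionKey.isValid), not MINA's internals; the class SelectorRegistrationCheck and its methods are hypothetical names, and the loopback connection exists only to have a channel to inspect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

class SelectorRegistrationCheck {

    /** Describes how the given channel is currently known to the selector. */
    static String describe(SelectableChannel channel, Selector selector) {
        SelectionKey key = channel.keyFor(selector);
        if (key == null) {
            return "not registered";
        }
        if (!key.isValid()) {
            return "registered, but key cancelled";
        }
        return "registered, interestOps=" + key.interestOps();
    }

    /** Registers a loopback channel, cancels its key, and records each state. */
    static List<String> runDemo() {
        List<String> states = new ArrayList<>();
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            int port = server.socket().getLocalPort();
            try (SocketChannel client =
                     SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                states.add(describe(accepted, selector)); // before registration

                SelectionKey key = accepted.register(selector, SelectionKey.OP_READ);
                states.add(describe(accepted, selector)); // while registered

                key.cancel();
                selector.selectNow(); // cancelled keys are removed during a select
                states.add(describe(accepted, selector)); // after deregistration
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return states;
    }

    public static void main(String[] args) {
        for (String state : runDemo()) {
            System.out.println(state);
        }
    }
}
```

Logging the result of describe() for the affected session's channel just before each select would show at which point the registration disappears, and whether the key was cancelled or the channel was never (re-)registered.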

Well, and now the big question is, of course: why does the channel vanish? Maybe this rings a bell with one of you?

> Reliable hang of DS during query
> --------------------------------
>
>                 Key: DIRSERVER-586
>                 URL: http://issues.apache.org/jira/browse/DIRSERVER-586
>             Project: Directory ApacheDS
>          Issue Type: Bug
>         Environment: DS 0.9.3, Windows, JDK 1.5
>            Reporter: Jörg Henne
>         Assigned To: Alex Karasulu
>         Attachments: bugreport.zip, TestHang.java
>
>
> When running the attached test, the directory server hangs after executing a slew of operations when searching for objects.
> First of all, some background on the test case:
> The attached test case (in the form of an exported Eclipse project) is, unfortunately, based on quite a few classes. They are part of a project I am currently working on: an object-to-LDAP mapper with an approach similar to Castor for XML or Hibernate for RDBMS, albeit a lot more modest in complexity (I'll, hopefully, one day be able to open-source it - for now it is still much too immature). I have supplied all that stuff mainly for your reference.
> To run the test case, please make sure that the constant "URL" in LDAPDirectoryTest points to a valid directory. The URL the context points to must exist. It will, however, subsequently create lots of nodes below it.
> The hang seems to be related to some kind of deadlock, since it doesn't occur once the whole test is run via a single context only. To achieve this, set the constant "ONE_CONTEXT" to true (each LDAPDirectory uses its own set of contexts).
> If you have any problems running the test, please don't hesitate to contact me.
