Posted to user@manifoldcf.apache.org by c....@gmx.de on 2011/02/18 12:31:47 UTC

"Connection already in use" when crawling Windows Share repositories

I'm running MCF rev. 1071535 with Apache Tomcat 7. Solr and MCF both use HTTPS. When I crawl a Windows Share connection with an Active Directory authority, the job aborts after some time with the status "Error: Couldn't set up SSL connection to ingestion API: Address already in use: connect".

In the MCF log file I get:

[2011-02-18 11:24:33,695]ERROR Exception tossed: Couldn't set up SSL connection to ingestion API: Address already in use: connect
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Couldn't set up SSL connection to ingestion API: Address already in use: connect
	at org.apache.manifoldcf.agents.output.solr.HttpPoster.createSocket(HttpPoster.java:736)
	at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:1006)
Caused by: java.net.BindException: Address already in use: connect
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at com.sun.net.ssl.internal.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:550)
	at com.sun.net.ssl.internal.ssl.SSLSocketImpl.<init>(SSLSocketImpl.java:353)
	at com.sun.net.ssl.internal.ssl.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:71)
	at org.apache.manifoldcf.agents.output.solr.HttpPoster.createSocket(HttpPoster.java:728)
	... 1 more

I use the same LDAP user
- to authenticate at MCF
- as login credentials for the Solr Connection
- as credentials for the authority connection
- and to connect to our LDAP server itself (in the Tomcat JNDI Realm config).

I suspect the connection to LDAP is the problem. I tried to configure connection pooling for Tomcat with the following JVM options (I'm afraid I'm not very familiar with that):
"-Dcom.sun.jndi.ldap.connect.pool=true -Dcom.jndi.ldap.connect.pool.protocol=ssl"

Did I miss something in the config? Or how do I have to configure MCF?

Thanks in advance.

Carina

Re: "Connection already in use" when crawling Windows Share repositories

Posted by Karl Wright <da...@gmail.com>.
I should also point out that setting up and tearing down connections
at high rates is what leads to a problem of lots of sockets in
TIME_WAIT.  The most elegant solution is to use a connection pool
rather than create and tear down connections on demand.  That way,
sockets are only really closed infrequently.
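
For the LDAP side specifically, the JDK's built-in LDAP provider can pool connections itself. The sketch below shows one way to request that. It assumes the code creating the context is under your control; the class name, method name, URL, and credentials are placeholders, not anything from ManifoldCF or Tomcat. In the JDK documentation, "com.sun.jndi.ldap.connect.pool" is an environment property passed when the context is created, while "com.sun.jndi.ldap.connect.pool.protocol" is a system property whose default is "plain", so SSL connections are only pooled if "ssl" is listed. Note that the option quoted in the original post is spelled "com.jndi.ldap.connect.pool.protocol", without "sun".

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper: builds a JNDI environment that asks the JDK LDAP
// provider to take the connection from its pool instead of opening a new one.
class LdapPoolSketch {
    static Hashtable<String, Object> pooledEnv(String url, String bindDn, String password) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);               // e.g. an ldaps:// URL (placeholder)
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, bindDn);      // placeholder bind DN
        env.put(Context.SECURITY_CREDENTIALS, password);  // placeholder password
        // Per-context request for pooling (an environment property).
        env.put("com.sun.jndi.ldap.connect.pool", "true");
        return env;
    }
}
```

A caller would typically start the JVM with -Dcom.sun.jndi.ldap.connect.pool.protocol="plain ssl" and then pass the result to new InitialDirContext(env). Whether Tomcat's JNDI Realm passes the environment property through is not something this sketch covers.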

Karl


On Tue, Feb 22, 2011 at 6:09 AM, Karl Wright <da...@gmail.com> wrote:
> I've seen this before, when I was using HTTP and setting a connection
> up and tearing it down for each communication.  This is why HTTP
> keep-alive was invented in fact.
>
> TIME_WAIT is a side effect of the way TCP closes connections.
> The side that closes a connection first keeps the socket in this state
> for a while afterwards, so that stray packets from the old connection
> can expire.  Read here:
> http://stackoverflow.com/questions/813790/too-many-time-wait-connections
>
> If you can figure out what socket is being left dangling, then the
> right thing to do is create the socket using NO_WAIT, provided you
> have access to that code.  It's one of the socket parameters that you
> can use only at socket create time.  It tells the socket to not stick
> around after close.
>
> If this is in the ManifoldCF code, or in JCifs, I'd love to know about it! ;-)
>
> Thanks,
> Karl

Re: "Connection already in use" when crawling Windows Share repositories

Posted by Karl Wright <da...@gmail.com>.
I've seen this before, when I was using HTTP and setting a connection
up and tearing it down for each communication.  This is why HTTP
keep-alive was invented in fact.

TIME_WAIT is a side effect of the way TCP closes connections.
The side that closes a connection first keeps the socket in this state
for a while afterwards, so that stray packets from the old connection
can expire.  Read here:
http://stackoverflow.com/questions/813790/too-many-time-wait-connections

If you can figure out what socket is being left dangling, then the
right thing to do is create the socket using NO_WAIT, provided you
have access to that code.  It's one of the socket parameters that you
can use only at socket create time.  It tells the socket to not stick
around after close.
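
Java's socket API has no option literally named NO_WAIT. The nearest equivalent is probably SO_LINGER with a zero timeout, and the sketch below assumes that is what is meant. With that setting, close() on most TCP stacks aborts the connection with a reset instead of the normal shutdown, so the local socket does not pass through TIME_WAIT; the price is that any unsent data is discarded. The class and method names are hypothetical.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper: makes close() drop the connection immediately
// instead of leaving the local socket in TIME_WAIT.
class LingerSketch {
    static void disableLinger(Socket socket) throws SocketException {
        // SO_LINGER enabled with a 0-second timeout: close() resets the connection.
        socket.setSoLinger(true, 0);
    }
}
```

Since SSLSocket extends Socket, the same call applies to sockets returned by an SSLSocketFactory. A connection pool remains the gentler fix, because resetting connections can lose data that is still in flight.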

If this is in the ManifoldCF code, or in JCifs, I'd love to know about it! ;-)

Thanks,
Karl


On Tue, Feb 22, 2011 at 4:40 AM,  <c....@gmx.de> wrote:
> Hi Karl,
>
> Your guess was right. There has been a problem with too many connections in status "TIME_WAIT".
> Even with max. connections per JVM set to 1 for the Repository and Authority Connections, I had up to 3500 connections waiting.
> Investigating further, I found that an unbind response from the LDAP server was missing.
>
> The error occurs only sometimes; at other times the crawl works fine. It's a bit weird, and I wasn't able to figure out where it comes from.
>
> Thanks for your hint.
>
> Carina

Re: "Connection already in use" when crawling Windows Share repositories

Posted by c....@gmx.de.
Hi Karl,

Your guess was right. There has been a problem with too many connections in status "TIME_WAIT".
Even with max. connections per JVM set to 1 for the Repository and Authority Connections, I had up to 3500 connections waiting.
Investigating further, I found that an unbind response from the LDAP server was missing.

The error occurs only sometimes; at other times the crawl works fine. It's a bit weird, and I wasn't able to figure out where it comes from.

Thanks for your hint.

Carina


-------- Original Message --------
> Date: Fri, 18 Feb 2011 06:50:36 -0500
> From: Karl Wright <da...@gmail.com>
> To: connectors-user@incubator.apache.org
> Subject: Re: "Connection already in use" when crawling Windows Share repositories

> I don't think this is directly a ManifoldCF issue.  My guess is that
> you are running out of file handles.  If you are on Windows, you might
> want to track the number of handles you are using via netstat.  If on
> linux, then lsof is the right tool.
> 
> If you decide that this is indeed what the issue is, then on linux you
> need to either rebuild the kernel, or do stuff with ulimit to set the
> number of handles.  On Windows, I don't know how to do it.  Limiting
> the number of handles used by ManifoldCF is done by setting the number
> of connections under the "Throttling" tab of the various connection
> definitions you create, but that's not going to help with Tomcat.
> 
> It may be worth checking if you aren't closing something.  That could
> easily cause this problem, eventually.  Since it is happening at
> crawling time, the Active Directory Authority is not actually
> communicating with the LDAP server yet.  JCIFS probably is, though.
> 
> Let us know what you find.
> 
> Thanks,
> Karl

Re: "Connection already in use" when crawling Windows Share repositories

Posted by Karl Wright <da...@gmail.com>.
I don't think this is directly a ManifoldCF issue.  My guess is that
you are running out of file handles.  If you are on Windows, you might
want to track the number of handles you are using via netstat.  If on
linux, then lsof is the right tool.
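
As a rough illustration of the netstat check, the sketch below counts how many lines of "netstat -an" output show a given TCP state such as TIME_WAIT or ESTABLISHED. It assumes only that the state appears as its own whitespace-separated column, which holds for the usual Windows and Linux output; the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical helper: counts netstat output lines that report a given TCP state.
class NetstatStateCount {
    static long countState(List<String> netstatLines, String state) {
        long count = 0;
        for (String line : netstatLines) {
            // Split the line into columns and look for an exact match on the state.
            for (String field : line.trim().split("\\s+")) {
                if (field.equals(state)) {
                    count++;
                    break; // count each line at most once
                }
            }
        }
        return count;
    }
}
```

Feeding it the lines of a netstat run every few seconds while the job is crawling should show whether the number of sockets in one state keeps growing.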

If you decide that this is indeed what the issue is, then on linux you
need to either rebuild the kernel, or do stuff with ulimit to set the
number of handles.  On Windows, I don't know how to do it.  Limiting
the number of handles used by ManifoldCF is done by setting the number
of connections under the "Throttling" tab of the various connection
definitions you create, but that's not going to help with Tomcat.

It may be worth checking if you aren't closing something.  That could
easily cause this problem, eventually.  Since it is happening at
crawling time, the Active Directory Authority is not actually
communicating with the LDAP server yet.  JCIFS probably is, though.

Let us know what you find.

Thanks,
Karl


On Fri, Feb 18, 2011 at 6:31 AM,  <c....@gmx.de> wrote:
> I'm running MCF rev. 1071535 with Apache Tomcat 7. Solr and MCF are using HTTPS. When crawling data from a Windows Share Connection with Active Directory Authority after some time of crawling my job aborts with the status "Error: Couldn't set up SSL connection to ingestion API: Address already in use: connect".
>
> In the mcf logfile I get:
>
> [2011-02-18 11:24:33,695]ERROR Exception tossed: Couldn't set up SSL connection to ingestion API: Address already in use: connect
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Couldn't set up SSL connection to ingestion API: Address already in use: connect
>        at org.apache.manifoldcf.agents.output.solr.HttpPoster.createSocket(HttpPoster.java:736)
>        at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:1006)
> Caused by: java.net.BindException: Address already in use: connect
>        at java.net.PlainSocketImpl.socketConnect(Native Method)
>        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>        at java.net.Socket.connect(Socket.java:529)
>        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:550)
>        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.<init>(SSLSocketImpl.java:353)
>        at com.sun.net.ssl.internal.ssl.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:71)
>        at org.apache.manifoldcf.agents.output.solr.HttpPoster.createSocket(HttpPoster.java:728)
>        ... 1 more
>
> I use the same LDAP user
> - to authenticate at MCF
> - login credentials for the Solr Connection
> - credentials for authority connection
> - and for connecting to our LDAP server itself (in tomcat JNDI Realm config).
>
> It seems to me that the connection to LDAP might be a problem. I tried to configure connection pooling for tomcat. (I'm afraid I'm not so familiar with that.)
> "-Dcom.sun.jndi.ldap.connect.pool=true -Dcom.jndi.ldap.connect.pool.protocol=ssl"
>
> Did I miss sth in the config? Or how do I have to configure MCF?
>
> Thanks in advance.
>
> Carina