Posted to httpclient-users@hc.apache.org by Ingo Meyer <dj...@gmx.net> on 2005/11/14 09:52:26 UTC

proxy multithreaded problem

Hi,

I'm new here and this is my first question...

I want to use HttpClient to download a large number of pages, each about 80k,
from several of our company's servers around the world.
This works great with HttpClient.
But now I have been advised to use proxies, and that works very badly. I need
about 300 simultaneous connections, and the speed drops sharply until it
nearly stops.
The difference between with and without proxies is 50-100x: >100k without
proxies and <5k with proxies.

I have a static instance of MultiThreadedHttpConnectionManager and HttpClient.
My guess is that the MTCM manages connections per proxy, not per
real destination. In my case this is very bad because I change the proxy
often, and this may be the reason why my speed is too slow.
Increasing the number of connections to get more overall speed doesn't help.
It seems that the sockets are limited by Java (I tested on XP AND Linux)?!
The more connections I allow the MTCM in its configuration, the worse the
speed gets!?
Please tell me if you need a code fragment, but I have the standard
implementation.


Any suggestions will be appreciated...


Greetings
Ingo Meyer


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org


Re: AW: proxy multithreaded problem

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2005-11-14 at 11:14 +0100, Ingo Meyer wrote:
> > -----Original Message-----
> > From: Roland Weber [mailto:ROLWEBER@de.ibm.com] 
> > Sent: Monday, 14 November 2005 10:56
> > To: HttpClient User Discussion
> > Subject: Re: proxy multithreaded problem
> > 
> > Hi Ingo,
> > 
> > there are two limits to the MTCM: number of connections and 
> > number of connections per host. The latter is set to a rather 
> > low value, I think 2. Make sure to increase both.
> > 
> > hope that helps,
> >   Roland
> > 
> 
> Sorry, this is what I meant by increasing the connections.
> E.g. when I want to download from two hosts using, let's say, 200 proxies
> and 50 threads, I give the MTCM 100 as the number of connections (2x50) and 
> 50 as the number of connections per host. The proxies are shared through
> a pool so that a proxy is only used once, because I think there are
> proxies that limit the connections for a given IP.
> 

Ingo,

The MTCM pools connections based on the host / port / local address / proxy /
proxy port combination. So the max connections per host setting does
not apply indiscriminately to all proxy connections.
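
To make the keying concrete, here is a minimal sketch in plain Java. It is a simplified model and not the MTCM source: RouteKey, distinctPools and the example.invalid host names are made up for illustration. It only shows that one target reached through 200 different proxies produces 200 distinct pool keys, which, going by the description above, are pooled separately.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class PoolKeyModel {

    /** Hypothetical stand-in for the combination the pool is keyed on. */
    static final class RouteKey {
        final String host;
        final int port;
        final String localAddress; // null when no local address is bound
        final String proxyHost;    // null would mean a direct connection
        final int proxyPort;

        RouteKey(String host, int port, String localAddress,
                 String proxyHost, int proxyPort) {
            this.host = host;
            this.port = port;
            this.localAddress = localAddress;
            this.proxyHost = proxyHost;
            this.proxyPort = proxyPort;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RouteKey)) {
                return false;
            }
            RouteKey k = (RouteKey) o;
            return port == k.port
                && proxyPort == k.proxyPort
                && host.equals(k.host)
                && Objects.equals(localAddress, k.localAddress)
                && Objects.equals(proxyHost, k.proxyHost);
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, port, localAddress, proxyHost, proxyPort);
        }
    }

    /** Counts the distinct keys one target occupies when rotated over proxyCount proxies. */
    static int distinctPools(String targetHost, int targetPort, int proxyCount) {
        Set<RouteKey> keys = new HashSet<>();
        for (int i = 0; i < proxyCount; i++) {
            // Made-up proxy names; only the fact that they differ matters here.
            keys.add(new RouteKey(targetHost, targetPort, null,
                                  "proxy" + i + ".example.invalid", 8080));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        // Same target, 200 different proxies: prints 200
        System.out.println(distinctPools("www.example.invalid", 80, 200));
    }
}
```

In this model a connection left open to one proxy cannot be reused for a request sent through another proxy, because the two requests map to different keys.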

Can you try to compare the throughput of a single direct connection
with that of a single proxied connection to the same host, in order to
eliminate the possibility of the proxy being the bottleneck rather than
the connection pool?
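
A rough way to run that comparison is sketched below. To stay self-contained it uses the JDK's own HttpURLConnection instead of HttpClient, so it measures the proxy and the network path only, not the connection pool. The class name, the helper methods and the command-line arguments (target URL, proxy host, proxy port) are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;

public class ThroughputProbe {

    /** Reads the stream to the end and returns the number of bytes read. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Converts a byte count and an elapsed time in nanoseconds to bytes per second. */
    static double bytesPerSecond(long bytes, long elapsedNanos) {
        return elapsedNanos <= 0 ? 0.0 : bytes * 1_000_000_000.0 / elapsedNanos;
    }

    /** Fetches the URL once through the given proxy and returns the observed rate. */
    static double measure(URL url, Proxy proxy) throws IOException {
        long start = System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) url.openConnection(proxy);
        try (InputStream in = conn.getInputStream()) {
            long bytes = drain(in);
            return bytesPerSecond(bytes, System.nanoTime() - start);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): java ThroughputProbe <url> <proxyHost> <proxyPort>
        URL url = URI.create(args[0]).toURL();
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                new InetSocketAddress(args[1], Integer.parseInt(args[2])));
        System.out.println("direct : " + measure(url, Proxy.NO_PROXY) + " B/s");
        System.out.println("proxied: " + measure(url, proxy) + " B/s");
    }
}
```

A single run is noisy, so repeating each measurement a few times and comparing the typical values gives a fairer picture. If the proxied rate is already far below the direct one with a single connection, the proxy is the more likely bottleneck.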

Oleg





AW: proxy multithreaded problem

Posted by Ingo Meyer <dj...@gmx.net>.
> -----Original Message-----
> From: Roland Weber [mailto:ROLWEBER@de.ibm.com] 
> Sent: Monday, 14 November 2005 10:56
> To: HttpClient User Discussion
> Subject: Re: proxy multithreaded problem
> 
> Hi Ingo,
> 
> there are two limits to the MTCM: number of connections and 
> number of connections per host. The latter is set to a rather 
> low value, I think 2. Make sure to increase both.
> 
> hope that helps,
>   Roland
> 

Sorry, this is what I meant by increasing the connections.
E.g. when I want to download from two hosts using, let's say, 200 proxies
and 50 threads, I give the MTCM 100 as the number of connections (2x50) and 
50 as the number of connections per host. The proxies are shared through
a pool so that a proxy is only used once, because I think there are
proxies that limit the connections for a given IP.

This doesn't work!
When I download without proxies it works fine. That's why I suppose the 
MTCM tries to keep a connection open for every proxy, not for the
real destination, so when I change the proxy it keeps the old connection
open, and because I'm using 200 proxies there are many unused open connections.

What I've done now is to override the MTCM's releaseConnection() so that
it always closes the connection. There was a thread here some days ago about
this topic...

But how to do this? My first try:

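<the class>...
package org.ac.net.http;

import org.apache.commons.httpclient.HttpConnection;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;

/**
 * Connection manager that closes every connection on release, so that no
 * connection to a previously used proxy is kept open in the pool.
 */
public class AlwaysClosingMultiThreadedHttpConnectionManager
	extends MultiThreadedHttpConnectionManager
{
	public void releaseConnection (HttpConnection conn)
	{
		// Close the socket first, then hand the closed connection back to the pool.
		conn.close ();
		super.releaseConnection (conn);
	}
}
<end>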

Is this ok or do I need some more code?


Thanks for your help

Ingo Meyer






Re: proxy multithreaded problem

Posted by Roland Weber <RO...@de.ibm.com>.
Hi Ingo,

there are two limits to the MTCM: number of connections
and number of connections per host. The latter is set to a
rather low value, I think 2. Make sure to increase both.
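
The interplay of the two limits can be pictured with a small model in plain Java. This is only a sketch built on java.util.concurrent.Semaphore and not the MTCM implementation: the class, its methods, the host name and the numbers are illustrative assumptions, and the model refuses a lease immediately where a real pool would normally make the caller wait.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

public class TwoLimitModel {
    private final Semaphore total;        // cap on all connections together
    private final int maxPerHost;         // cap applied to each host separately
    private final Map<String, Semaphore> perHost = new ConcurrentHashMap<>();

    TwoLimitModel(int maxTotal, int maxPerHost) {
        this.total = new Semaphore(maxTotal);
        this.maxPerHost = maxPerHost;
    }

    /** Tries to take a connection slot for the host without blocking. */
    boolean tryLease(String host) {
        Semaphore hostSlots =
            perHost.computeIfAbsent(host, h -> new Semaphore(maxPerHost));
        if (!hostSlots.tryAcquire()) {
            return false;                 // per-host cap reached
        }
        if (!total.tryAcquire()) {
            hostSlots.release();          // overall cap reached, undo the host slot
            return false;
        }
        return true;
    }

    /** Gives back a slot previously obtained with tryLease. */
    void release(String host) {
        Semaphore hostSlots = perHost.get(host);
        if (hostSlots != null) {
            hostSlots.release();
            total.release();
        }
    }

    /** How many of `threads` simultaneous requests to one host get a slot. */
    static int granted(int maxTotal, int maxPerHost, int threads) {
        TwoLimitModel pool = new TwoLimitModel(maxTotal, maxPerHost);
        int ok = 0;
        for (int i = 0; i < threads; i++) {
            if (pool.tryLease("host-a.example.invalid")) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(granted(20, 2, 50));   // prints 2: per-host cap dominates
        System.out.println(granted(20, 50, 50));  // prints 20: total cap dominates
        System.out.println(granted(100, 50, 50)); // prints 50: both caps raised
    }
}
```

Raising only one of the two values leaves the other as the effective ceiling, which is why both need to be increased.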

hope that helps,
  Roland