Posted to dev@httpd.apache.org by Roman Gavrilov <ro...@aduva.com> on 2004/10/20 17:08:10 UTC
mod_proxy reverse proxy optimization/performance question
I am using a reverse proxy to cache a remote site. The files are mostly
RPMs of varying sizes: 3-30 MB or more.
If several requests arrive for the same file before it has been cached
locally, every one of them downloads the file from the remote site. Each
download slows down, because the throughput of the line is split among
all the processes.
So when many processes download the same RPM from a remote site, each
request can take a long time to complete.
This can bring Apache to a state where it cannot serve other requests,
because all available processes are already busy.
In my opinion it would be more efficient to let one process complete the
request (using the maximum line throughput) and return some busy code to
other identical, simultaneous requests until the file is cached locally.
Has anyone run into a similar situation? What solution did you find?
I have created a solution, as I did not find anything already existing.
I would like to discuss it here and get your opinions.
1. When the proxy accepts a request for a file that is not yet in the
local cache, it creates a temporary lock file (named after the proxy's
pathname of the file, with directory slashes changed to underscores).
2. Other processes requesting the same file first check for the lock
file. If it exists, they return a busy code (e.g. 408 Request Timeout),
and the client should repeat the request until it succeeds.
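The two steps above could be sketched roughly as follows. This is Python
for illustration only, not the Apache 1.3 C code; the names lock_path,
try_acquire, handle_request and the fetch callable are hypothetical, and
the sketch assumes the caller has already established a cache miss. It
creates the lock with O_CREAT|O_EXCL so that "check for the lock" and
"create the lock" are a single atomic step rather than two.

```python
import os
import tempfile

# Busy code suggested in the post; 503 with a Retry-After header would be
# another option (an assumption, not something the post specifies).
BUSY_STATUS = 408


def lock_path(lock_dir, proxy_path):
    """Map a proxied path to a flat lock-file name (slashes become underscores)."""
    flat = proxy_path.strip("/").replace("/", "_")
    return os.path.join(lock_dir, flat + ".lock")


def try_acquire(lock_dir, proxy_path):
    """Atomically create the lock file; return its path, or None if it exists."""
    path = lock_path(lock_dir, proxy_path)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return None
    os.close(fd)
    return path


def handle_request(lock_dir, proxy_path, fetch):
    """Return (status, body); `fetch` is a hypothetical download-and-cache callable."""
    lock = try_acquire(lock_dir, proxy_path)
    if lock is None:
        # Another process is already downloading this file.
        return BUSY_STATUS, b""
    try:
        return 200, fetch(proxy_path)
    finally:
        # Always release the lock, even if the download fails.
        os.unlink(lock)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(handle_request(demo_dir, "/foo/bar.rpm", lambda p: b"rpm bytes"))
```

Two caveats with this sketch: flattening slashes to underscores maps
different paths such as /a/b_c and /a_b/c to the same lock name, and a
process that is killed mid-download leaves a stale lock behind, so some
timeout on the lock's age would probably be needed.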
Please let me know what you think of this approach, especially if you
have done or seen something similar.
Apache version 1.3.x
Thank you
Roman
--
-------------------------------------------------------------
I am root. If you see me laughing... You better have a backup!
Re: mod_proxy reverse proxy optimization/performance question
Posted by Igor Sysoev <is...@rambler-co.ru>.
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> No. When an https request reaches the server (Apache), it is decrypted
> first and then passed through the Apache routines, so by the time it
> reaches the proxy part the URI is already decrypted. The proxy in turn
> issues a request to the backend https server and returns the answer to
> the client, after caching it of course.
Well, that is the same as what I described.
No, mod_accel cannot connect to a backend using https.
> Roman
>
> Igor Sysoev wrote:
>
> >On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> >
> >
> >
> >>I don't see any problem with using it; actually, I am doing it. I am
> >>not talking about proxying between http and https.
> >>It is mostly used for mirroring (both frontend and backend use https
> >>only); no redirections on the backend though :)
> >>
> >>
> >>ProxyPass /foo/bar https://mydomain/foobar/
> >>ProxyPassReverse /foo/bar https://mydomain/foobar/
> >>
> >>I'll be more than glad to discuss it with you.
> >>
> >>
> >
> >So the proxy should decrypt the stream, find the URI, then encrypt it
> >again and pass it encrypted to the backend?
Igor Sysoev
http://sysoev.ru/en/
Re: mod_proxy reverse proxy optimization/performance question
Posted by Roman Gavrilov <ro...@aduva.com>.
No. When an https request reaches the server (Apache), it is decrypted
first and then passed through the Apache routines, so by the time it
reaches the proxy part the URI is already decrypted. The proxy in turn
issues a request to the backend https server and returns the answer to
the client, after caching it of course.
Roman
Igor Sysoev wrote:
>On Thu, 21 Oct 2004, Roman Gavrilov wrote:
>
>
>
>>I don't see any problem with using it; actually, I am doing it. I am
>>not talking about proxying between http and https.
>>It is mostly used for mirroring (both frontend and backend use https
>>only); no redirections on the backend though :)
>>
>>
>>ProxyPass /foo/bar https://mydomain/foobar/
>>ProxyPassReverse /foo/bar https://mydomain/foobar/
>>
>>I'll be more than glad to discuss it with you.
>>
>>
>
>So the proxy should decrypt the stream, find the URI, then encrypt it
>again and pass it encrypted to the backend?
>
>
>Igor Sysoev
>http://sysoev.ru/en/
>
>
>
>
Re: mod_proxy reverse proxy optimization/performance question
Posted by Igor Sysoev <is...@rambler-co.ru>.
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> I don't see any problem with using it; actually, I am doing it. I am
> not talking about proxying between http and https.
> It is mostly used for mirroring (both frontend and backend use https
> only); no redirections on the backend though :)
>
>
> ProxyPass /foo/bar https://mydomain/foobar/
> ProxyPassReverse /foo/bar https://mydomain/foobar/
>
> I'll be more than glad to discuss it with you.
So the proxy should decrypt the stream, find the URI, then encrypt it
again and pass it encrypted to the backend?
Igor Sysoev
http://sysoev.ru/en/
Re: mod_proxy reverse proxy optimization/performance question
Posted by Roman Gavrilov <ro...@aduva.com>.
I don't see any problem with using it; actually, I am doing it. I am not
talking about proxying between http and https.
It is mostly used for mirroring (both frontend and backend use https
only); no redirections on the backend though :)
ProxyPass /foo/bar https://mydomain/foobar/
ProxyPassReverse /foo/bar https://mydomain/foobar/
I'll be more than glad to discuss it with you.
Regards
Roman
Igor Sysoev wrote:
>On Thu, 21 Oct 2004, Roman Gavrilov wrote:
>
>
>
>>After checking mod_accel I found that it works only with http;
>>we need the cache and proxy to work with both http and https.
>>What was the reason for disabling https proxying and caching?
>>
>>
>
>How do you intend to do https reverse proxying?
>
>
>Igor Sysoev
>http://sysoev.ru/en/
>
>
>
>
Re: mod_proxy reverse proxy optimization/performance question
Posted by Igor Sysoev <is...@rambler-co.ru>.
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> After checking mod_accel I found that it works only with http;
> we need the cache and proxy to work with both http and https.
> What was the reason for disabling https proxying and caching?
How do you intend to do https reverse proxying?
Igor Sysoev
http://sysoev.ru/en/
Re: mod_proxy reverse proxy optimization/performance question
Posted by Roman Gavrilov <ro...@aduva.com>.
After checking mod_accel I found that it works only with http;
we need the cache and proxy to work with both http and https.
What was the reason for disabling https proxying and caching?
Regards,
Roman
Igor Sysoev wrote:
>On Thu, 21 Oct 2004, Roman Gavrilov wrote:
>
>
>
>>So what would you suggest I do?
>>Implement it myself?
>>
>>
>
>No, just look at http://sysoev.ru/mod_accel/
>It is an Apache 1.3 module, which is what you need.
>
>Igor Sysoev
>http://sysoev.ru/en/
>
>
>
>
>>Bill Stoddard wrote:
>>
>>
>>
>>>Graham Leggett wrote:
>>>
>>>
>>>
>>>>Roman Gavrilov wrote:
>>>>
>>>>
>>>>
>>>>>In my opinion it would be more efficient to let one process complete
>>>>>the request (using maximum line throughput) and return some busy
>>>>>code to other identical, simultaneous requests until the file is
>>>>>cached locally.
>>>>>Has anyone run into a similar situation? What solution did you find?
>>>>>
>>>>>
>>>>In the original design for mod_cache, the second and subsequent
>>>>connections to a file that was still in the process of being
>>>>downloaded into the cache would shadow the cached file - in other
>>>>words it would serve content from the cached file as and when it was
>>>>received by the original request.
>>>>
>>>>The file in the cache was to be marked as "still busy downloading",
>>>>which meant threads/processes serving from the cached file would know
>>>>to keep trying to serve the cached file until the "still busy
>>>>downloading" status was cleared by the initial request. Timeouts
>>>>would sanity check the process.
>>>>
>>>>This prevents the "load spike" that occurs just after a file is
>>>>downloaded anew, but before that download is done.
>>>>
>>>>Whether this was implemented fully I am not sure - anyone?
>>>>
>>>>
>>>It was never implemented.
>>>
>>>
>
>
>
>
Re: mod_proxy reverse proxy optimization/performance question
Posted by Igor Sysoev <is...@rambler-co.ru>.
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> So what would you suggest I do?
> Implement it myself?
No, just look at http://sysoev.ru/mod_accel/
It is an Apache 1.3 module, which is what you need.
Igor Sysoev
http://sysoev.ru/en/
> Bill Stoddard wrote:
>
> > Graham Leggett wrote:
> >
> >> Roman Gavrilov wrote:
> >>
> >>> In my opinion it would be more efficient to let one process complete
> >>> the request (using maximum line throughput) and return some busy
> >>> code to other identical, simultaneous requests until the file is
> >>> cached locally.
> >>> Has anyone run into a similar situation? What solution did you find?
> >>
> >> In the original design for mod_cache, the second and subsequent
> >> connections to a file that was still in the process of being
> >> downloaded into the cache would shadow the cached file - in other
> >> words it would serve content from the cached file as and when it was
> >> received by the original request.
> >>
> >> The file in the cache was to be marked as "still busy downloading",
> >> which meant threads/processes serving from the cached file would know
> >> to keep trying to serve the cached file until the "still busy
> >> downloading" status was cleared by the initial request. Timeouts
> >> would sanity check the process.
> >>
> >> This prevents the "load spike" that occurs just after a file is
> >> downloaded anew, but before that download is done.
> >>
> >> Whether this was implemented fully I am not sure - anyone?
> >
> >
> > It was never implemented.
Re: mod_proxy reverse proxy optimization/performance question
Posted by Graham Leggett <mi...@sharp.fm>.
Roman Gavrilov wrote:
> So what would you suggest I do?
> Implement it myself?
At the moment that's probably your best option.
Is this for Apache v1.3 or v2.0?
Regards,
Graham
Re: mod_proxy reverse proxy optimization/performance question
Posted by Roman Gavrilov <ro...@aduva.com>.
So what would you suggest I do?
Implement it myself?
Bill Stoddard wrote:
> Graham Leggett wrote:
>
>> Roman Gavrilov wrote:
>>
>>> In my opinion it would be more efficient to let one process complete
>>> the request (using maximum line throughput) and return some busy
>>> code to other identical, simultaneous requests until the file is
>>> cached locally.
>>> Has anyone run into a similar situation? What solution did you find?
>>
>>
>>
>> In the original design for mod_cache, the second and subsequent
>> connections to a file that was still in the process of being
>> downloaded into the cache would shadow the cached file - in other
>> words it would serve content from the cached file as and when it was
>> received by the original request.
>>
>> The file in the cache was to be marked as "still busy downloading",
>> which meant threads/processes serving from the cached file would know
>> to keep trying to serve the cached file until the "still busy
>> downloading" status was cleared by the initial request. Timeouts
>> would sanity check the process.
>>
>> This prevents the "load spike" that occurs just after a file is
>> downloaded anew, but before that download is done.
>>
>> Whether this was implemented fully I am not sure - anyone?
>
>
> It was never implemented.
>
> Bill
>
>
Re: mod_proxy reverse proxy optimization/performance question
Posted by Bill Stoddard <bi...@wstoddard.com>.
Graham Leggett wrote:
> Roman Gavrilov wrote:
>
>> In my opinion it would be more efficient to let one process complete
>> the request (using maximum line throughput) and return some busy code
>> to other identical, simultaneous requests until the file is cached
>> locally.
>> Has anyone run into a similar situation? What solution did you find?
>
>
> In the original design for mod_cache, the second and subsequent
> connections to a file that was still in the process of being downloaded
> into the cache would shadow the cached file - in other words it would
> serve content from the cached file as and when it was received by the
> original request.
>
> The file in the cache was to be marked as "still busy downloading",
> which meant threads/processes serving from the cached file would know to
> keep trying to serve the cached file until the "still busy downloading"
> status was cleared by the initial request. Timeouts would sanity check
> the process.
>
> This prevents the "load spike" that occurs just after a file is
> downloaded anew, but before that download is done.
>
> Whether this was implemented fully I am not sure - anyone?
It was never implemented.
Bill
Re: mod_proxy reverse proxy optimization/performance question
Posted by Graham Leggett <mi...@sharp.fm>.
Roman Gavrilov wrote:
> In my opinion it would be more efficient to let one process complete the
> request (using maximum line throughput) and return some busy code to
> other identical, simultaneous requests until the file is cached locally.
> Has anyone run into a similar situation? What solution did you find?
In the original design for mod_cache, the second and subsequent
connections to a file that was still in the process of being downloaded
into the cache would shadow the cached file - in other words it would
serve content from the cached file as and when it was received by the
original request.
The file in the cache was to be marked as "still busy downloading",
which meant threads/processes serving from the cached file would know to
keep trying to serve the cached file until the "still busy downloading"
status was cleared by the initial request. Timeouts would sanity check
the process.
This prevents the "load spike" that occurs just after a file is
downloaded anew, but before that download is done.
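As a rough illustration of the "shadowing" idea described above, here is
a minimal Python sketch, not mod_cache code (the design was reportedly
never implemented). It assumes the first request writes the body to
cache_path and keeps a separate marker file in place for as long as the
download is running; follow_cached_file, the marker file and the demo
helper are all hypothetical names. Later requests read whatever has
arrived so far, wait for more while the marker exists, and give up if
nothing new arrives within the timeout.

```python
import os
import tempfile
import time


def follow_cached_file(cache_path, busy_marker, chunk_size=65536,
                       poll=0.05, timeout=30.0):
    """Yield chunks from cache_path as they arrive, until busy_marker is gone.

    Raises TimeoutError if no new data shows up for `timeout` seconds
    while the marker is still present.
    """
    last_progress = time.monotonic()
    with open(cache_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if chunk:
                last_progress = time.monotonic()
                yield chunk
                continue
            if not os.path.exists(busy_marker):
                # Download finished: drain anything written just before
                # the marker was cleared, then stop.
                rest = f.read()
                if rest:
                    yield rest
                return
            if time.monotonic() - last_progress > timeout:
                raise TimeoutError("origin download stalled")
            time.sleep(poll)


def demo():
    """Simulate one downloader and one shadowing reader in a single thread."""
    d = tempfile.mkdtemp()
    cache = os.path.join(d, "pkg.rpm")
    marker = cache + ".busy"
    open(marker, "w").close()            # mark "still busy downloading"
    with open(cache, "wb") as w:
        w.write(b"part1")
        w.flush()
        reader = follow_cached_file(cache, marker)
        got = next(reader)               # served while the download runs
        w.write(b"part2")
    os.unlink(marker)                    # first request clears the status
    return got + b"".join(reader)
```

The timeout here plays the role of the sanity check mentioned above: a
reader does not wait forever if the original download dies without
clearing the busy status.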
Whether this was implemented fully I am not sure - anyone?
Regards,
Graham