Posted to modperl@perl.apache.org by allan juul <al...@muly.dk> on 2005/04/20 21:18:01 UTC

advice needed: mod_perl reverse proxy

hi

i need advice before i waste too much time on the bleeding obvious.

we have a setup where we will reverse proxy content both to our own 
backend-servers (which run on IIS) and to other external servers whose 
content we don't control. one of the reasons we proxy is 
speed/performance

we have an Apache 2.0.54 up front on port 80 and the backend is on the 
same machine which is running windows 2003


we need to fix broken img src, absolute links and that sort of thing 
coming from the external servers

i have fiddled with mod_proxy_html to rewrite stuff and that works ok, 
but it has some features that don't mix well with our solution (the 
content-type is encoded as utf-8 where we proxy to iso-8859-1, for 
instance, or some html tags are stripped, etc.). also caching seems to 
become slower because of this output filter (i guess because of the 
unknown content-length)

it seems way overkill to have a mod_perl enabled frontend, but i'm 
pretty confident we could write a mod_perl filter to do the content 
rewrites we need.

so, is a mod_perl-enabled Apache acting as a proxy just a sick idea? it 
will proxy content and the filter will have to scan all response content

hope someone can wipe this down right away ;)

./allan

Re: compile mod_perl with Apache::DBI support

Posted by Michael Peters <mp...@plusthree.com>.
jiesheng zhang wrote:

> With this configuration, without the "use Apache::DBI ();" and with
> "$Apache::DBI::DEBUG=1" in the startup.pl, I could not see the debug
> output in the error log.

Looks like you hit it on the head. If you don't have
$Apache::DBI::DEBUG=1 in your startup, then you won't see anything in the
error log.

from the doc:
"To enable debugging the variable $Apache::DBI::DEBUG must be set. This
can either be done in startup.pl or in the user script."

-- 
Michael Peters
Developer
Plus Three, LP


Re: compile mod_perl with Apache::DBI support

Posted by jiesheng zhang <ji...@bioteam.net>.

Perrin Harkins wrote:

>On Tue, 2005-05-03 at 00:21 +0800, jiesheng zhang wrote:
>  
>
>>I saw debug information in the apache log file after I added the
>>
>>use Apache::DBI to the startup.pl. However, this is not mentioned in the Apache::DBI documentation. The documentation only mentioned that I should add
>>PerlModule Apache::DBI to the http.conf.
>>    
>>
>
>Either way works.  If you were doing it with PerlModule before, you
>should have seen the debug information.
>  
>
PerlModule did not seem to work on my system. My system is SuSE 9.1, 
apache-2.0.49, mod_perl 1.99 release 12.
Here is my mod_perl.conf, which is included from httpd.conf:

<IfModule mod_perl.c>
    PerlModule  Apache::DBI
#    PerlTrace  all
    PerlRequire "/etc/apache2/mod_perl-startup.pl"
#    PerlOptions +OpenLogs +Log
    ScriptAlias /perl/ "/srv/www/cgi-bin/"
    <Location /perl/>
        # mod_perl mode
        SetHandler perl-script
        PerlResponseHandler ModPerl::Registry
        PerlOptions +ParseHeaders
        Options +ExecCGI
    </Location>

    ScriptAlias /cgi-perl/ "/srv/www/cgi-bin/"
    <Location /cgi-perl>
        # perl cgi mode
        SetHandler  perl-script
        PerlResponseHandler ModPerl::PerlRun
        PerlOptions +ParseHeaders
        Options +ExecCGI
    </Location>

    # The /cgi-bin/ ScriptAlias is already set up in httpd.conf

</IfModule>

With this configuration, without the "use Apache::DBI ();" and with 
"$Apache::DBI::DEBUG=1" in the startup.pl, I could not see the debug 
output in the error log.

>- Perrin
>  
>

Re: compile mod_perl with Apache::DBI support

Posted by Perrin Harkins <pe...@elem.com>.
On Tue, 2005-05-03 at 00:21 +0800, jiesheng zhang wrote:
> I saw debug information in the apache log file after I added the
> 
> use Apache::DBI to the startup.pl. However, this is not mentioned in the Apache::DBI documentation. The documentation only mentioned that I should add
> PerlModule Apache::DBI to the http.conf.

Either way works.  If you were doing it with PerlModule before, you
should have seen the debug information.

- Perrin


Re: compile mod_perl with Apache::DBI support

Posted by jiesheng zhang <ji...@bioteam.net>.

Perrin Harkins wrote:

>On Mon, 2005-05-02 at 20:10 +0800, jiesheng zhang wrote:
>  
>
>>I indeed set the $Apache::DBI::DEBUG=2 in the 
>>/etc/apache2/mod_perl-startup.pl. However I did not see any debug output 
>>in the apache error log file
>>    
>>
>
>Did you "use Apache::DBI" in your startup.pl or httpd.conf?
>
>  
>
I saw debug information in the apache log file after I added
"use Apache::DBI" to the startup.pl. However, this is not mentioned in the
Apache::DBI documentation. The documentation only mentions that I should add
"PerlModule Apache::DBI" to httpd.conf.



>I think you're not understanding that Apache::DBI is a pure Perl module
>which you load in the normal way, not a part of mod_perl.  You do not
>need to recompile mod_perl in order to use Apache::DBI.
>
>- Perrin
>
>  
>

Thanks

Re: compile mod_perl with Apache::DBI support

Posted by Perrin Harkins <pe...@elem.com>.
On Mon, 2005-05-02 at 20:10 +0800, jiesheng zhang wrote:
> I indeed set the $Apache::DBI::DEBUG=2 in the 
> /etc/apache2/mod_perl-startup.pl. However I did not see any debug output 
> in the apache error log file

Did you "use Apache::DBI" in your startup.pl or httpd.conf?

I think you're not understanding that Apache::DBI is a pure Perl module
which you load in the normal way, not a part of mod_perl.  You do not
need to recompile mod_perl in order to use Apache::DBI.

- Perrin
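
Loading Apache::DBI "in the normal way" might look like the startup file
below. This is only a sketch: the file path is the one mentioned in this
thread, the debug levels are the ones described in the Apache::DBI manpage
quoted here, and it assumes Apache::DBI and DBI are installed for the Perl
that mod_perl uses.

```perl
# /etc/apache2/mod_perl-startup.pl (path as mentioned in the thread)
use strict;
use warnings;

# Apache::DBI is a plain Perl module: load it like any other module,
# before DBI or anything else that loads DBI.
use Apache::DBI ();

# 1 reports new connects only, 2 enables full debug output
# (per the Apache::DBI manpage quoted in this thread).
$Apache::DBI::DEBUG = 2;

1;
```

No rebuild of mod_perl is involved; the file only needs to be pulled in by
the existing PerlRequire line.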



Re: compile mod_perl with Apache::DBI support

Posted by jiesheng zhang <ji...@bioteam.net>.
>> I am using the SuSE 9.1 which has apache 2.0 and mod_perl 1.99_12.
>> I did not see any Apache::DBI debug information in the apache error log. 
>
>
> jiesheng, please read the module's manpage:
> http://search.cpan.org/dist/Apache-DBI/DBI.pm
>
>   To enable debugging the variable $Apache::DBI::DEBUG must be set.
>   This can either be done in startup.pl or in the user script. Setting
>   the variable to 1, just reports about a new connect. Setting the
>   variable to 2 enables full debug output.

I indeed set the $Apache::DBI::DEBUG=2 in the 
/etc/apache2/mod_perl-startup.pl. However I did not see any debug output 
in the apache error log file

Re: compile mod_perl with Apache::DBI support

Posted by Stas Bekman <st...@stason.org>.
jiesheng zhang wrote:
> I am using the SuSE 9.1 which has apache 2.0 and mod_perl 1.99_12.
> I did not see any Apache::DBI debug information in the apache error log. 

jiesheng, please read the module's manpage:
http://search.cpan.org/dist/Apache-DBI/DBI.pm

   To enable debugging the variable $Apache::DBI::DEBUG must be set.
   This can either be done in startup.pl or in the user script. Setting
   the variable to 1, just reports about a new connect. Setting the
   variable to 2 enables full debug output.

> I guessED the the mod_perl is not complied with the
> EVERYTHING=1 option. I then tried to compile the mod_perl to support 
> Apache::DBI
> The perl configuration command is like
> 
> perl Makefile.PL MP_APXS=/usr/sbin/apxs2 MP_CCOPTS="-O2 -march=i586 
> -mcpu=i686 -fmessage-length=0 -Wall -fPIC -Wall -fno-strict-aliasing 
> -D_LARGEFILE_SOURCE" EVERYTHING=1
> 
> However, I got this error
> ------------------------------------------------------
> Reading Makefile.PL args from @ARGV
>   MP_APXS = /usr/sbin/apxs2
>   MP_CCOPTS = -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC 
> -Wall -fno-strict-aliasing -D_LARGEFILE_SOURCE
> Configuring Apache/2.0.49 mod_perl/1.99_13-dev Perl/v5.8.3
> 'EVERYTHING' is not a known MakeMaker parameter name.
> ----------------------------------------------------------------------------- 
> 
> I also checked the mod_perl build instruction for mod_perl 2.0. It does 
> not mention anything about the option "EVERYTHING".

Because it doesn't exist and isn't needed in mp2. Why were you trying to do 
that?

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

Re: compile mod_perl with Apache::DBI support

Posted by Jie Gao <J....@isu.usyd.edu.au>.


On Mon, 2 May 2005, jiesheng zhang wrote:

> Date: Mon, 02 May 2005 13:07:18 +0800
> From: jiesheng zhang <ji...@bioteam.net>
> To: modperl@perl.apache.org
> Subject: compile mod_perl with Apache::DBI support
>
> I am using the SuSE 9.1 which has apache 2.0 and mod_perl 1.99_12.
> I did not see any Apache::DBI debug information in the apache error log.
> I guessED the the mod_perl is not complied with the
> EVERYTHING=1 option. I then tried to compile the mod_perl to support
> Apache::DBI
> The perl configuration command is like
>
> perl Makefile.PL MP_APXS=/usr/sbin/apxs2 MP_CCOPTS="-O2 -march=i586
> -mcpu=i686 -fmessage-length=0 -Wall -fPIC -Wall -fno-strict-aliasing
> -D_LARGEFILE_SOURCE" EVERYTHING=1
>
> However, I got this error
> ------------------------------------------------------
> Reading Makefile.PL args from @ARGV
>    MP_APXS = /usr/sbin/apxs2
>    MP_CCOPTS = -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC
> -Wall -fno-strict-aliasing -D_LARGEFILE_SOURCE
> Configuring Apache/2.0.49 mod_perl/1.99_13-dev Perl/v5.8.3
> 'EVERYTHING' is not a known MakeMaker parameter name.
> -----------------------------------------------------------------------------
> I also checked the mod_perl build instruction for mod_perl 2.0. It does
> not mention anything about the option "EVERYTHING".

Shouldn't you be using MP_DEBUG=1 and MP_TRACE=1?

Regards,


Jie

compile mod_perl with Apache::DBI support

Posted by jiesheng zhang <ji...@bioteam.net>.
I am using SuSE 9.1, which has apache 2.0 and mod_perl 1.99_12.
I did not see any Apache::DBI debug information in the apache error log. 
I guessed that mod_perl was not compiled with the
EVERYTHING=1 option. I then tried to compile mod_perl to support 
Apache::DBI.
The perl configuration command is like

perl Makefile.PL MP_APXS=/usr/sbin/apxs2 MP_CCOPTS="-O2 -march=i586 
-mcpu=i686 -fmessage-length=0 -Wall -fPIC -Wall -fno-strict-aliasing 
-D_LARGEFILE_SOURCE" EVERYTHING=1

However, I got this error
------------------------------------------------------
Reading Makefile.PL args from @ARGV
   MP_APXS = /usr/sbin/apxs2
   MP_CCOPTS = -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC 
-Wall -fno-strict-aliasing -D_LARGEFILE_SOURCE
Configuring Apache/2.0.49 mod_perl/1.99_13-dev Perl/v5.8.3
'EVERYTHING' is not a known MakeMaker parameter name.
-----------------------------------------------------------------------------
I also checked the mod_perl build instruction for mod_perl 2.0. It does 
not mention anything about the option "EVERYTHING".

Any suggestion?

Thanks


jason

Re: advice needed: mod_perl reverse proxy

Posted by Jeff Ambrosino <jb...@gmail.com>.
I've had some similar issues, but got around them by using
err_headers_out.  This worked for me, so you might give it a try:

        $f->r->headers_out->unset('Content-Length');
        $f->r->err_headers_out->set('Content-Length' => length($NewBody));

Obviously "$NewBody" is whatever you're using to buffer the content. 
FYI this is used with an HTTP output filter on mp2 (RC4) within a
reverse mod_proxy.

Jeff


On 5/3/05, allan juul <al...@muly.dk> wrote:
> but about collecting data in a buffer variable. it seems i can actually
> $f->print that buffer, but not actually calculate the length of it. or
> rather: i can calculate the length but when i set any header the value
> is 0 (whether i set it before the $f->print statement or after).
> it seems i must admit that i don't quite get what is going on and when.
> 
> can anyone supply a simple example i then can check on our reverse proxy ?

Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan@muly.dk wrote:
[...]
>>>>> can anyone supply a simple example i then can check on our reverse 
>>>>> proxy ?
>>>>
>>>>
>>>>
>>>> Try: t/response/TestApache/content_length_header.pm
>>>> Though I haven't tried to call it from the filter, so may be Jeff's 
>>>> suggestion will work.
>>>
>>>
>>>
>>>
>>> Jeff's suggestion does indeed work. oddly enough ;
>>
>>
>> It just happens to work in certain conditions. The correct solution is 
>> to use a bucket brigade-based filter, which gives you a complete 
>> control. What's happening is that streaming filter API passes FLUSH 
>> buckets through, and you can't control that. We have to do that to 
>> conform to the FLUSH requests, to be a well-behaved filter by default.
>> I've revealed that using 2 debug filters plugged before and after the 
>> filter in question. This is a very useful debugging tool:
>> http://search.cpan.org/dist/Apache-DebugFilter/
> 
> 
> well its not available yet on win32.

what do you mean? It's a pure perl module.

>> So the following solution works just fine, it'll be added shortly to 
>> the mp2 test suite as a pair of tests: t/filter/out_str_buffer.t 
>> t/filter/TestFilter/out_str_buffer.pm
> 
> 
> 
> thanks for the example andf replies. however, i get this error:
> 
>  Can't locate object method "first" via package "APR::Brigade" i have 
> prior to that just uninstalled mod_perl [RC5] and installed mod_perl 
> [RC6] via ppm (randys repository i beleve)

sorry, I didn't paste the part that loads the modules, please see:
http://perl.apache.org/docs/2.0/user/handlers/filters.html#Setting_the_Content_Length_Header_in_Request_Output_Filters


Re: advice needed: mod_perl reverse proxy

Posted by al...@muly.dk.
Quoting Stas Bekman <st...@stason.org>:

> allan@muly.dk wrote:
>> Quoting Stas Bekman <st...@stason.org>:
>>
>>> allan juul wrote:
>>> [...]
>>>
>>>>> You must use $r->set_content_length(). See the mp2 test suite for 
>>>>> examples.
>>>>
>>>>
>>>>
>>>> (i don't have that method available in my mod_perl2)
>>>
>>>
>>> You sure do :)
>>>
>>> % lookup set_content_length
>>> To use method 'set_content_length' add:
>>>         use Apache2::Response ();
>>> http://perl.apache.org/docs/2.0/api/Apache2/Response.html#C_set_content_length_
>>
>>
>> ok maybe i have a screwed installiation or a missing use/namespace . 
>> "method not found" message in my error.log.
>>
>>>> but about collecting data in a buffer variable. it seems i can 
>>>> actually $f->print that buffer, but not actually calculate the 
>>>> length of it. or rather: i can calculate the length but when i set 
>>>> any header the value is 0 (whether i set it before the $f->print 
>>>> statement or after).
>>>> it seems i must admit that i don't quite get what is going on and when.
>>>>
>>>> can anyone supply a simple example i then can check on our reverse proxy ?
>>>
>>>
>>> Try: t/response/TestApache/content_length_header.pm
>>> Though I haven't tried to call it from the filter, so may be Jeff's 
>>> suggestion will work.
>>
>>
>>
>> Jeff's suggestion does indeed work. oddly enough ;
>
> It just happens to work in certain conditions. The correct solution 
> is to use a bucket brigade-based filter, which gives you a complete 
> control. What's happening is that streaming filter API passes FLUSH 
> buckets through, and you can't control that. We have to do that to 
> conform to the FLUSH requests, to be a well-behaved filter by default.
> I've revealed that using 2 debug filters plugged before and after the 
> filter in question. This is a very useful debugging tool:
> http://search.cpan.org/dist/Apache-DebugFilter/

well its not available yet on win32.

> So the following solution works just fine, it'll be added shortly to 
> the mp2 test suite as a pair of tests: t/filter/out_str_buffer.t 
> t/filter/TestFilter/out_str_buffer.pm


thanks for the example and replies. however, i get this error:

  Can't locate object method "first" via package "APR::Brigade"

prior to that i had just uninstalled mod_perl [RC5] and installed mod_perl 
[RC6] via ppm (randy's repository i believe)


./allan



> and I'll document it too.
>
> sub flatten_bb {
>     my ($bb) = shift;
>
>     my $seen_eos = 0;
>
>     my @data;
>     for (my $b = $bb->first; $b; $b = $bb->next($b)) {
>         $seen_eos++, last if $b->is_eos;
>         $b->read(my $bdata);
>         push @data, $bdata;
>     }
>     return (join('', @data), $seen_eos);
> }
>
> sub handler {
>     my($filter, $bb) = @_;
>
>     my $ctx = $filter->ctx;
>
>     # no need to unset the C-L header, since this filter makes sure to
>     # correct it before any headers go out.
>     #unless ($ctx) {
>     #    $filter->r->headers_out->unset('Content-Length');
>     #}
>
>     my $data = exists $ctx->{data} ? $ctx->{data} : '';
>     $ctx->{invoked}++;
>     my($bdata, $seen_eos) = flatten_bb($bb);
>     $bdata =~ s/-//g;
>     $data .= $bdata if $bdata;
>
>     if ($seen_eos) {
>         my $len = length $data;
>         $filter->r->headers_out->set('Content-Length', $len);
>         $filter->print($data) if $data;
>     }
>     else {
>         # store context for all but the last invocation
>         $ctx->{data} = $data;
>         $filter->ctx($ctx);
>     }
>
>     return Apache2::Const::OK;
> }
>
>




Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan@muly.dk wrote:
> Quoting Stas Bekman <st...@stason.org>:
> 
>> allan juul wrote:
>> [...]
>>
>>>> You must use $r->set_content_length(). See the mp2 test suite for 
>>>> examples.
>>>
>>>
>>>
>>> (i don't have that method available in my mod_perl2)
>>
>>
>> You sure do :)
>>
>> % lookup set_content_length
>> To use method 'set_content_length' add:
>>         use Apache2::Response ();
>> http://perl.apache.org/docs/2.0/api/Apache2/Response.html#C_set_content_length_ 
>>
> 
> 
> ok maybe i have a screwed installiation or a missing use/namespace . 
> "method not found" message in my error.log.
> 
>>> but about collecting data in a buffer variable. it seems i can 
>>> actually $f->print that buffer, but not actually calculate the length 
>>> of it. or rather: i can calculate the length but when i set any 
>>> header the value is 0 (whether i set it before the $f->print 
>>> statement or after).
>>> it seems i must admit that i don't quite get what is going on and when.
>>>
>>> can anyone supply a simple example i then can check on our reverse 
>>> proxy ?
>>
>>
>> Try: t/response/TestApache/content_length_header.pm
>> Though I haven't tried to call it from the filter, so may be Jeff's 
>> suggestion will work.
> 
> 
> 
> Jeff's suggestion does indeed work. oddly enough ;

It just happens to work in certain conditions. The correct solution is to 
use a bucket brigade-based filter, which gives you complete control. 
What's happening is that the streaming filter API passes FLUSH buckets 
through, and you can't control that. We have to do that to conform to the 
FLUSH requests, to be a well-behaved filter by default.
I found that out by plugging 2 debug filters in before and after the 
filter in question. This is a very useful debugging tool:
http://search.cpan.org/dist/Apache-DebugFilter/

So the following solution works just fine, it'll be added shortly to the 
mp2 test suite as a pair of tests: t/filter/out_str_buffer.t 
t/filter/TestFilter/out_str_buffer.pm

and I'll document it too.

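use strict;
use warnings;

use Apache2::Filter ();      # $filter->ctx, $filter->print, $filter->r
use Apache2::RequestRec ();  # $r->headers_out
use APR::Table ();           # headers_out->set
use APR::Brigade ();         # $bb->first, $bb->next
use APR::Bucket ();          # $b->is_eos, $b->read

use Apache2::Const -compile => qw(OK);

sub flatten_bb {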
     my ($bb) = shift;

     my $seen_eos = 0;

     my @data;
     for (my $b = $bb->first; $b; $b = $bb->next($b)) {
         $seen_eos++, last if $b->is_eos;
         $b->read(my $bdata);
         push @data, $bdata;
     }
     return (join('', @data), $seen_eos);
}

sub handler {
     my($filter, $bb) = @_;

     my $ctx = $filter->ctx;

     # no need to unset the C-L header, since this filter makes sure to
     # correct it before any headers go out.
     #unless ($ctx) {
     #    $filter->r->headers_out->unset('Content-Length');
     #}

     my $data = exists $ctx->{data} ? $ctx->{data} : '';
     $ctx->{invoked}++;
     my($bdata, $seen_eos) = flatten_bb($bb);
     $bdata =~ s/-//g;
     $data .= $bdata if $bdata;

     if ($seen_eos) {
         my $len = length $data;
         $filter->r->headers_out->set('Content-Length', $len);
         $filter->print($data) if $data;
     }
     else {
         # store context for all but the last invocation
         $ctx->{data} = $data;
         $filter->ctx($ctx);
     }

     return Apache2::Const::OK;
}



Re: advice needed: mod_perl reverse proxy

Posted by al...@muly.dk.
Quoting Stas Bekman <st...@stason.org>:

> allan juul wrote:
> [...]
>>> You must use $r->set_content_length(). See the mp2 test suite for examples.
>>
>>
>> (i don't have that method available in my mod_perl2)
>
> You sure do :)
>
> % lookup set_content_length
> To use method 'set_content_length' add:
>         use Apache2::Response ();
> http://perl.apache.org/docs/2.0/api/Apache2/Response.html#C_set_content_length_

ok maybe i have a screwed installation or a missing use/namespace. i get a 
"method not found" message in my error.log.

>> but about collecting data in a buffer variable. it seems i can 
>> actually $f->print that buffer, but not actually calculate the 
>> length of it. or rather: i can calculate the length but when i set 
>> any header the value is 0 (whether i set it before the $f->print 
>> statement or after).
>> it seems i must admit that i don't quite get what is going on and when.
>>
>> can anyone supply a simple example i then can check on our reverse proxy ?
>
> Try: t/response/TestApache/content_length_header.pm
> Though I haven't tried to call it from the filter, so may be Jeff's 
> suggestion will work.


Jeff's suggestion does indeed work. oddly enough ;


./a




Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan@muly.dk wrote:
> hi i don't get it. the below filter does output the content alright it 
> seems, but the setting of the header *value* is incorrect. (?)
> so the $f->print statement prints correct output
> but the calcualtion length(output) is incorrect (since it evaluates 
> length of this exact string "<html><head></head><body></body></html>\n" )
> why is that and how to fix this ??

Allan, I suggest that you spend time reading this document:
http://perl.apache.org/docs/2.0/user/handlers/filters.html

Clearly you don't realize that filters are invoked multiple times.

See my other reply for an example of working code. If after reading the 
doc you have questions, I'd be glad to answer those.

In particular notice that as soon as something is sent from the request 
handler out, apache will generate its HTTP response headers and you can't 
change those afterwards, since they were already sent to the client.
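
The effect of multiple invocations can be illustrated outside Apache with a
small pure-Perl simulation. This is only a sketch, not mod_perl code:
invoke_filter, the %ctx hash and the chunk list are hypothetical stand-ins
for the filter handler, $filter->ctx and the data Apache passes in on each
call.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one filter invocation: $ctx persists between
# calls (like $filter->ctx), $chunk is the data seen in this call, and
# $is_last marks the invocation that carries end-of-stream.
sub invoke_filter {
    my ($ctx, $chunk, $is_last) = @_;

    $ctx->{data} .= $chunk;

    # Nothing is sent before the last call, so the length stays open.
    return unless $is_last;

    my $body = $ctx->{data};
    return { content_length => length($body), body => $body };
}

my %ctx    = (data => '');
my @chunks = ("<html><head></head>", "<body>hello ", "world</body></html>\n");

my $result;
for my $i (0 .. $#chunks) {
    $result = invoke_filter(\%ctx, $chunks[$i], $i == $#chunks);
}

print "Content-Length: $result->{content_length}\n";
```

A length computed inside any single call would only cover that call's chunk,
which is the kind of mismatch described in the question.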


Re: advice needed: mod_perl reverse proxy

Posted by al...@muly.dk.
hi i don't get it. the filter below does output the content alright it 
seems, but the *value* of the header it sets is incorrect. (?)
so the $f->print statement prints correct output
but the calculation length(output) is incorrect (since it evaluates the 
length of this exact string "<html><head></head><body></body></html>\n")
why is that and how to fix this ??

many thanks
./allan

# Apache/2.0.54 (Win32) mod_ssl/2.0.53 OpenSSL/0.9.7f proxy_html/2.4 
# mod_perl/1.999.22-dev Perl/v5.8.6 configured

# this is the actual 11 bytes of content on URL:
# hello world

# $ get localhost
# hello world


# $ head localhost
# ...
# Server: Microsoft-IIS/6.0
# Content-Length: 40
# Content-Type: text/html; charset=utf-8
# ...


# excerpt of code
my $context;
unless ($f->ctx) {
	$r->headers_out->unset( "Content-Length" );
}

$context ||= $f->ctx;
my $content_length = 0;
my $str = "";

while ($f->read(my $buffer, 1024)) {
	$buffer = $context->{extra} . $buffer if $context->{extra};
	if (($context->{extra}) = $buffer =~ m/(<[^>]*)$/) {
		$buffer = substr($buffer, 0, - length($context->{extra}));
	}
	$str .= $buffer;
}

if ($f->seen_eos) {
	if ( $context->{extra} ) {
		$str .= $context->{extra};
	}
}

else {
	$f->ctx($context);
}



$str = Rewrite::replace_links ( $str );
# at this point $str equals: <html><head></head><body></body></html>

$content_length += length( $str );

# set header
$r->err_headers_out->set('Content-Length', $content_length );

$f->print( $str );
# at this point $str equals: all 42KB of correct html
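
The {extra} handling in the excerpt above holds back an unfinished tag at
the end of a chunk so that a rewrite never sees half a tag. That one step
can be tried on its own in plain Perl. This is a sketch under assumptions:
split_partial_tag and the sample chunks are hypothetical, and only the
regex is taken from the code in this thread.

```perl
use strict;
use warnings;

# Split a chunk into the part that is safe to rewrite now and a trailing
# incomplete tag (text from the last "<" with no closing ">"), which is
# held back and prepended to the next chunk.
sub split_partial_tag {
    my ($carry, $chunk) = @_;

    my $buffer = $carry . $chunk;
    my $extra  = '';
    if ($buffer =~ m/(<[^>]*)$/) {
        $extra  = $1;
        $buffer = substr($buffer, 0, length($buffer) - length($extra));
    }
    return ($buffer, $extra);
}

my ($out, $carry) = ('', '');
for my $chunk ('<p>see <a hre', 'f="/x">this</a> pa', 'ge</p>') {
    (my $ready, $carry) = split_partial_tag($carry, $chunk);
    $out .= $ready;   # a link rewrite would be applied to $ready here
}
$out .= $carry;       # at end of stream, flush whatever is still held back

print "$out\n";
```

The held-back text has to be counted as well when the final length is
computed, otherwise the header comes up short by the size of the last
fragment.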



Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan juul wrote:
[...]
>> You must use $r->set_content_length(). See the mp2 test suite for 
>> examples.
> 
> 
> (i don't have that method available in my mod_perl2)

You sure do :)

% lookup set_content_length
To use method 'set_content_length' add:
         use Apache2::Response ();
http://perl.apache.org/docs/2.0/api/Apache2/Response.html#C_set_content_length_

> but about collecting data in a buffer variable. it seems i can actually 
> $f->print that buffer, but not actually calculate the length of it. or 
> rather: i can calculate the length but when i set any header the value 
> is 0 (whether i set it before the $f->print statement or after).
> it seems i must admit that i don't quite get what is going on and when.
> 
> can anyone supply a simple example i then can check on our reverse proxy ?

Try: t/response/TestApache/content_length_header.pm
Though I haven't tried to call it from the filter, so maybe Jeff's 
suggestion will work.


Re: advice needed: mod_perl reverse proxy

Posted by allan juul <al...@muly.dk>.
Stas Bekman wrote:
> allan juul wrote:
> 
>> hi stas
>>
>> Stas Bekman wrote:
>>
>>> allan juul wrote:
>>> [...]
>>
>>
>>
>>>>> But if you use a mod_perl filter you will still hit the issue of 
>>>>> unknown content-length header.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> yes, of course that's true.
>>>> there goes caching (:
>>>
>>>
>>>
>>>
>>> Not really. Nothing prevents you from buffering up the response, 
>>> process it, set the content-length header and make the document 
>>> cache-able.
>>
>>
>>
>> ok, eh how do i that. you mean instead of printing to STDOUT, collect 
>> data in a buffer, then set the calculated Content-Length, then print 
>> data?
> 
> 
> That's right.
> 
>> anyway, it's pretty strange. it seems i'm able to set the 
>> Content-Length when i use the mod_perl_filter and do *not* reverse 
>> proxy. see both headers below. the strange things is that i'm not 
>> allowed at all to set the standard Content-Length, but indeed allowed 
>> to set a custom one called Content-Length2. and even stranger is that 
>> this custom header presents a correct value when *not* proxying but 
>> "0" when proxying. i use the exact same mod_perl code, also supplied 
>> below. the actual filtering of data content works in both cases.
> 
> 
> You must use $r->set_content_length(). See the mp2 test suite for examples.

(i don't have that method available in my mod_perl2)


but about collecting data in a buffer variable. it seems i can actually 
$f->print that buffer, but not actually calculate the length of it. or 
rather: i can calculate the length but when i set any header the value 
is 0 (whether i set it before the $f->print statement or after).
it seems i must admit that i don't quite get what is going on and when.

can anyone supply a simple example i then can check on our reverse proxy ?

thanks
./allan







Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan juul wrote:
> hi stas
> 
> Stas Bekman wrote:
> 
>> allan juul wrote:
>> [...]
> 
> 
>>>> But if you use a mod_perl filter you will still hit the issue of 
>>>> unknown content-length header.
>>>
>>>
>>>
>>>
>>> yes, of course that's true.
>>> there goes caching (:
>>
>>
>>
>> Not really. Nothing prevents you from buffering up the response, 
>> process it, set the content-length header and make the document 
>> cache-able.
> 
> 
> ok, eh how do i that. you mean instead of printing to STDOUT, collect 
> data in a buffer, then set the calculated Content-Length, then print data?

That's right.

> anyway, it's pretty strange. it seems i'm able to set the Content-Length 
> when i use the mod_perl_filter and do *not* reverse proxy. see both 
> headers below. the strange things is that i'm not allowed at all to set 
> the standard Content-Length, but indeed allowed to set a custom one 
> called Content-Length2. and even stranger is that this custom header 
> presents a correct value when *not* proxying but "0" when proxying. i 
> use the exact same mod_perl code, also supplied below. the actual 
> filtering of data content works in both cases.

You must use $r->set_content_length(). See the mp2 test suite for examples.




Re: advice needed: mod_perl reverse proxy

Posted by allan juul <al...@muly.dk>.
hi stas

Stas Bekman wrote:
> allan juul wrote:
> [...]

>>> But if you use a mod_perl filter you will still hit the issue of 
>>> unknown content-length header.
>>
>>
>>
>> yes, of course that's true.
>> there goes caching (:
> 
> 
> Not really. Nothing prevents you from buffering up the response, process 
> it, set the content-length header and make the document cache-able.

ok, eh how do i do that? you mean instead of printing to STDOUT, collect 
data in a buffer, then set the calculated Content-Length, then print data?

anyway, it's pretty strange. it seems i'm able to set the Content-Length 
when i use the mod_perl filter and do *not* reverse proxy. see both 
headers below. the strange thing is that i'm not allowed at all to set 
the standard Content-Length, but indeed allowed to set a custom one 
called Content-Length2. and even stranger is that this custom header 
presents a correct value when *not* proxying but "0" when proxying. i 
use the exact same mod_perl code, also supplied below. the actual 
filtering of data content works in both cases.

all this is on windows.


./allan


######### CODE ##############
# this is heavily based on
# http://search.cpan.org/~geoff/Apache-Clean-2.00_7/

package Apache::Clean;
use 5.008;
use Apache2::Filter ();
use Apache2::RequestRec ();
use Apache2::RequestUtil ();
use Apache2::Log ();
use APR::Table ();
use Apache2::Const -compile => qw(OK DECLINED);
use strict;
sub handler {
     my $f   = shift;
     my $r   = $f->r;
     my $log = $r->server->log;
     unless ($r->content_type =~ m!text/html!i) {
         $log->info('skipping request to ', $r->uri, ' (not an HTML document)');
         return Apache2::DECLINED;
     }
     my $context;
     unless ($f->ctx) {
         $r->headers_out->unset('Content-Length');
     }
     $context ||= $f->ctx;
     my $content_length = 0;
     while ($f->read(my $buffer, 1024)) {
         $buffer = $context->{extra} . $buffer if $context->{extra};
         if (($context->{extra}) = $buffer =~ m/(<[^>]*)$/) {
             $buffer = substr($buffer, 0, - length($context->{extra}));
         }
         my $str = $buffer;
         $str =~  s,OLD,NEW,igs;
         $content_length += length( $str );
         $f->print( ${str} );
     }
     if ($f->seen_eos) {
         $f->print($context->{extra}) if $context->{extra};
         $content_length += length( $context->{extra} );
     }
     else {
         $f->ctx($context);
     }
     $r->headers_out->set('Content-Length', $content_length);
     $r->headers_out->set('Content-Length2', $content_length);
     return Apache2::OK;
}
1;
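
The carry-over of an unfinished tag between reads (the $context->{extra} handling above) can be tried outside Apache. What follows is only a sketch in plain Perl: rewrite_in_chunks is a made-up helper name, substr() stands in for $f->read, and OLD/NEW is the same placeholder substitution. It assumes the goal is just to avoid rewriting a tag that was cut in half by a read boundary.

```perl
use strict;
use warnings;

# Rewrite a document that arrives in fixed-size pieces, holding back any
# unfinished "<..." tail so a tag is never rewritten in two halves.
# (Hypothetical helper; substr() plays the role of $f->read.)
sub rewrite_in_chunks {
    my ($html, $size) = @_;
    my ($extra, $out) = ('', '');

    for (my $pos = 0; $pos < length $html; $pos += $size) {
        # Prepend whatever was held back from the previous piece.
        my $buffer = $extra . substr($html, $pos, $size);
        $extra = '';

        # An opening "<" with no closing ">" before the end of the buffer
        # means the tag continues in the next piece: keep it for later.
        if ($buffer =~ m/(<[^>]*)$/) {
            $extra  = $1;
            $buffer = substr($buffer, 0, length($buffer) - length $extra);
        }

        $buffer =~ s/OLD/NEW/ig;
        $out .= $buffer;
    }

    # End of stream: flush the held-back tail unchanged.
    return $out . $extra;
}

my $html = '<p>see <a href="/OLD/page">OLD</a></p>';
my $out  = rewrite_in_chunks($html, 16);
print "$out\n";
print length($out), "\n";
```

Note that this only protects text inside a tag. A match that is split across two reads in the plain text between tags would still be missed unless the whole body is buffered first.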






######### HEADERS ##############

# no rev proxy
$ head localhost
200 OK
Connection: close
Date: Sat, 30 Apr 2005 20:02:17 GMT
Accept-Ranges: bytes
Server: Apache/2.0.54 (Win32) mod_ssl/2.0.53 OpenSSL/0.9.7f proxy_html/2.4 mod_perl/1.999.22-dev Perl/v5.8.6
Vary: negotiate,accept-language,accept-charset
Content-Language: en
Content-Length: 1773
Content-Location: index.html.en
Content-Type: text/html
Last-Modified: Sun, 21 Nov 2004 05:35:22 GMT
Client-Date: Sat, 30 Apr 2005 20:02:17 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
Content-Length2: 1773




# with rev proxy
$ head localhost
200 OK
Cache-Control: no-cache, no-store
Connection: close
Date: Sat, 30 Apr 2005 20:04:24 GMT
Pragma: no-cache
Server: Microsoft-IIS/6.0
Content-Type: text/html; charset=utf-8
Expires: -1
Client-Date: Sat, 30 Apr 2005 20:04:25 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
Content-Length2: 0
Set-Cookie: ASP.NET_SessionId=4nddnr453tjk0355flblp3fg; path=/
X-AspNet-Version: 1.1.4322
X-Powered-By: ASP.NET


Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan juul wrote:
[...]
>>> i have fiddled with mod_proxy_html to rewrite stuff and that works 
>>> ok, but have some features that doesn't mix well with our solution 
>>> (content -type is encoded utf-8, where we proxy to iso-8859-1 for 
>>> instance. or some html tags are stripped etc.) also caching becomes 
>>> slower because of this output filter it seems (i guess because of 
>>> unknown content-length)
>>
>>
>>
>> But if you use a mod_perl filter you will still hit the issue of 
>> unknown content-length header.
> 
> 
> yes, of course that's true.
> there goes caching (:

Not really. Nothing prevents you from buffering up the response, processing 
it, setting the Content-Length header and making the document cacheable.
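
A minimal sketch of that buffering approach with the mod_perl 2 streaming filter API, not a tested production filter. The package name My::BufferedRewrite and the OLD/NEW substitution are placeholders. It assumes the responses are small enough to hold in memory, since nothing is sent downstream until the end of the stream has been seen. The local OK constant has the same value (0) as Apache2::Const::OK, and the eval around the requires is only there so the file also compiles outside Apache.

```perl
package My::BufferedRewrite;    # hypothetical package name
use strict;
use warnings;

# Load the mod_perl 2 classes that provide the methods used below.
# The eval lets the file also compile outside Apache, e.g. for a unit
# test that passes in stub objects.
BEGIN {
    eval {
        require Apache2::Filter;
        require Apache2::RequestRec;
        require APR::Table;
    };
}

use constant OK => 0;    # numerically the same as Apache2::Const::OK

sub handler {
    my $f = shift;

    # The context survives between filter invocations for one response.
    my $ctx = $f->ctx || { data => '' };

    # Collect what this invocation offers; send nothing downstream yet.
    while ($f->read(my $chunk, 8192)) {
        $ctx->{data} .= $chunk;
    }

    if ($f->seen_eos) {
        # The whole body is in memory: rewrite it, measure it, and only
        # then set the header and pass the data on.
        (my $out = $ctx->{data}) =~ s/OLD/NEW/ig;
        $f->r->headers_out->set('Content-Length' => length $out);
        $f->print($out);
    }
    else {
        # More data to come: keep the buffer for the next invocation.
        $f->ctx($ctx);
    }

    return OK;
}

1;
```

It would be hooked in with something like "PerlOutputFilterHandler My::BufferedRewrite" in the relevant Location or Proxy section. The trade-off is memory: every response in flight is held completely in the frontend process before a single byte goes out.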




Re: advice needed: mod_perl reverse proxy

Posted by allan juul <al...@muly.dk>.
Stas Bekman wrote:
> allan juul wrote:
> 
>> hi
>>
>> i need advice before i waste too much time on the bleeding obvious.
>>
>> we have a setup where we will reverse proxy content both to our own 
>> backend-servers (which run on IIS) and other external servers which 
>> content we dont control. one of the reasons we proxy is because of 
>> speed/performance
>>
>> we have an Apache 2.054 up front on port 80 and the backend is on the 
>> same machine which is running windows 2004
>>
>>
>> we need to fix broken img src, and absolute links and that sort of 
>> thing coming from the external servers
>>
>> i have fiddled with mod_proxy_html to rewrite stuff and that works ok, 
>> but have some features that doesn't mix well with our solution 
>> (content -type is encoded utf-8, where we proxy to iso-8859-1 for 
>> instance. or some html tags are stripped etc.) also caching becomes 
>> slower because of this output filter it seems (i guess because of 
>> unknown content-length)
> 
> 
> But if you use a mod_perl filter you will still hit the issue of unknown 
> content-length header.

yes, of course that's true.
there goes caching (:

>> it seems way overkill to have a mod_perl enabled frontend, but i'm 
>> pretty confident we could write a mod_perl filter to do the content 
>> rewrites we need.
> 
>> so, is a mod_perl-enabled Apache acting as a proxy just a sick idea. 
>> it will proxy content and the filter will have to scan all response 
>> content
> 
> 
> It shouldn't be too hard to write a quick prototype, run benchmarks and 
> see whether it scales or not. It's really hard to give an answer when 
> you don't know what kind of type/size of code base you deal with, since 
> if your code base is very small and you don't load tons of other things, 
> you may get away with a quite efficient setup.
> 

i'm not sure what you mean, but the backend code is compiled DLLs and 
the usual images/js/css/xsl files, all running in a .NET environment. 
it's not a gigantic code base but pretty big. the load on the servers 
is not so bad currently because it's spread over several identical machine 
setups (load balanced). it's a public web portal.

ok, i'll give it a mod_perl shot. i guess my main worries are just the 
memory the "light" frontend would begin to eat.

btw. (and OT): is it in fact correct that i should see the HTTP headers 
for the proxied server (IIS) and not the frontend (apache) ?



thanks
./allan

Re: advice needed: mod_perl reverse proxy

Posted by Stas Bekman <st...@stason.org>.
allan juul wrote:
> hi
> 
> i need advice before i waste too much time on the bleeding obvious.
> 
> we have a setup where we will reverse proxy content both to our own 
> backend-servers (which run on IIS) and other external servers which 
> content we dont control. one of the reasons we proxy is because of 
> speed/performance
> 
> we have an Apache 2.054 up front on port 80 and the backend is on the 
> same machine which is running windows 2004
> 
> 
> we need to fix broken img src, and absolute links and that sort of thing 
> coming from the external servers
> 
> i have fiddled with mod_proxy_html to rewrite stuff and that works ok, 
> but have some features that doesn't mix well with our solution (content 
> -type is encoded utf-8, where we proxy to iso-8859-1 for instance. or 
> some html tags are stripped etc.) also caching becomes slower because of 
> this output filter it seems (i guess because of unknown content-length)

But if you use a mod_perl filter you will still hit the issue of unknown 
content-length header.

> it seems way overkill to have a mod_perl enabled frontend, but i'm 
> pretty confident we could write a mod_perl filter to do the content 
> rewrites we need.
 >
> so, is a mod_perl-enabled Apache acting as a proxy just a sick idea. it 
> will proxy content and the filter will have to scan all response content

It shouldn't be too hard to write a quick prototype, run benchmarks and 
see whether it scales or not. It's really hard to give an answer without 
knowing the type and size of the code base you are dealing with: if your 
code base is very small and you don't load tons of other things, you may 
get away with quite an efficient setup.


Re: advice needed: mod_perl reverse proxy

Posted by Alex Greg <al...@gmail.com>.
On 4/21/05, Dominique Quatravaux <do...@idealx.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> allan juul wrote:
> 
> |
> | so, is a mod_perl-enabled Apache acting as a proxy just a sick
> | idea. it will proxy content and the filter will have to scan all
> | response content
> 
> A reverse-proxy in mod_perl is something I do for a living. When
> scaling up it quickly needs loads of RAM (2 Gb are cheap these days)
> but it is incredibly efficient and flexible for complex scenarios
> (e.g. taking over authentication dialogs, URL rewriting as you need
> them). We are busy porting that to Apache 2. So it's definitely feasible.
> 
> I have no code to offer, sorry (we're a GPL shop but we do not
> redistribute at large, a bit like what MySQL.com does), but anyway
> it's a custom-built thing targeted to WSSO, and you'd be probably
> better off starting from scratch or from HTTP::Proxy by Philippe Bruhat.

Not sure if this is of any use, but it's what LiveJournal.com use as a
reverse HTTP proxy:

http://www.danga.com/perlbal/

-- Alex

Re: advice needed: mod_perl reverse proxy

Posted by Dominique Quatravaux <do...@idealx.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

allan juul wrote:

|
| so, is a mod_perl-enabled Apache acting as a proxy just a sick
| idea. it will proxy content and the filter will have to scan all
| response content

A reverse-proxy in mod_perl is something I do for a living. When
scaling up it quickly needs loads of RAM (2 Gb are cheap these days)
but it is incredibly efficient and flexible for complex scenarios
(e.g. taking over authentication dialogs, URL rewriting as you need
them). We are busy porting that to Apache 2. So it's definitely feasible.

I have no code to offer, sorry (we're a GPL shop but we do not
redistribute at large, a bit like what MySQL.com does), but anyway
it's a custom-built thing targeted to WSSO, and you'd be probably
better off starting from scratch or from HTTP::Proxy by Philippe Bruhat.

Regards,

- --
Dominique QUATRAVAUX                           Ingénieur senior
01 44 42 00 08                                 IDEALX

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCZ2Z8MJAKAU3mjcsRAta6AJwKrr6GJU+CTrf8YrkAufgfUmGMkACeLxvM
s84jc41LABhfj9R/oUSxfhQ=
=Cdrj
-----END PGP SIGNATURE-----