Posted to modperl@perl.apache.org by Esteban Fernandez Stafford <es...@webtogo.de> on 2003/09/04 10:25:23 UTC
Content encoding when filtering proxyed pages
Hello all,
I have a machine acting as a proxy, running mod_perl 1.99_09 with
Apache 2.0.46. This proxy is supposed to filter all HTML content. So
far I have achieved most of my project's goals, but there is one issue
I can't get straight: what to do when the proxy gets a page that is
encoded (as with www.google.com). My first attempt was to DECLINE
filtering such pages, but $filter->r()->content_encoding() always
gives me 'undef'. Is this something that is not yet implemented, or am
I doing something wrong? (See code below.) Then I tried looking at
$filter->r()->headers_out()->{'Content-Encoding'} and everything went
just fine!
On the other hand, could I put mod_deflate before my filter, so that
my filter receives the content already decompressed and ready to
parse?
Thanks a lot in advance
I would like to thank the mod_perl community for mod_perl, it has made
the development of this project fun! And it has kept me from having to
go back to C programming. It has been a long time since I last did that.
package WTG::HtmlFilter;

use strict;
use warnings; # FATAL => 'all';

use Apache::RequestRec ();
use Apache::RequestIO ();
use APR::Brigade ();
use APR::Bucket ();

use base qw(Apache::Filter);

# DECLINED is returned below, so it has to be compiled in as well
use Apache::Const -compile => qw(OK DECLINED M_POST);
use APR::Const -compile => ':common';

use constant READ_SIZE => 1024;

use HTML::Parser ();

sub handler : FilterRequestHandler
{
    my $filter = shift;
    my $parser;

    # Initialize parser if not already done
    unless ($parser = $filter->ctx)
    {
        # This is the first call of the filter for a particular request
        # Can we filter this request?
        my $type = $filter->r()->content_type();
        if (! defined $type || $type !~ /^text\/html\b/)
        {
            $filter->remove();
            return Apache::DECLINED;
        }

        # This line gives me undefined
        print STDERR $filter->r()->content_encoding(), "\n";

blah... blah... blah...
E s t e b a n!
:wq
--
Reporting bugs: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
Re: Content encoding when filtering proxyed pages
Posted by Stas Bekman <st...@stason.org>.
Esteban Fernandez Stafford wrote:
>
> Hello all,
>
> I have a machine acting as a proxy, running mod_perl 1.99_09 with
> Apache 2.0.46. This proxy is supposed to filter all HTML content. So
> far I have achieved most of my project's goals, but there is one issue
> I can't get straight: what to do when the proxy gets a page that is
> encoded (as with www.google.com). My first attempt was to DECLINE
> filtering such pages, but $filter->r()->content_encoding() always
> gives me 'undef'. Is this something that is not yet implemented, or am
> I doing something wrong? (See code below.) Then I tried looking at
> $filter->r()->headers_out()->{'Content-Encoding'} and everything went
> just fine!
It looks like the accessor is autogenerated but probably not working, since
the corresponding call in the test is commented out:
t/response/TestAPI/request_rec.pm: #content_encoding
I need to check why that is.
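Until content_encoding() is sorted out, the header lookup that worked in the original message can be wrapped in a small decision helper. The following is only a sketch: should_filter is a hypothetical name, not part of mod_perl, and treating a missing or "identity" Content-Encoding as safe to filter is an assumption about what the filter wants. It takes plain strings, so it can be tried outside Apache.

```perl
use strict;
use warnings;

# Hypothetical helper (not a mod_perl API): decide from plain header
# values whether an HTML filter should touch a response body.
sub should_filter {
    my ($content_type, $content_encoding) = @_;

    # Only text/html responses are candidates for filtering.
    return 0 unless defined $content_type
        && $content_type =~ m{^text/html\b}i;

    # Assumption: no Content-Encoding header (or "identity") means the
    # body is not compressed and can be parsed as-is.
    return 1 unless defined $content_encoding && length $content_encoding;
    return lc($content_encoding) eq 'identity' ? 1 : 0;
}

# Small demonstration with literal header values.
for my $case (
    [ 'text/html; charset=UTF-8', undef  ],
    [ 'text/html',                'gzip' ],
    [ 'image/png',                undef  ],
) {
    my ($type, $enc) = @$case;
    printf "%-26s %-5s => %s\n",
        $type,
        defined $enc ? $enc : '-',
        should_filter($type, $enc) ? 'filter' : 'skip';
}
```

Inside the handler, the two arguments would presumably be $filter->r()->content_type() and $filter->r()->headers_out()->{'Content-Encoding'}, and a false result would lead to the same $filter->remove() and DECLINED path already used for non-HTML responses.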
> On the other hand, could I put mod_deflate before my filter, so that
> my filter receives the content already decompressed and ready to
> parse?
Sure, you can do that. The mp2 test suite has examples of how to do it
(besides the normal Apache docs).
> Thanks a lot in advance
>
> I would like to thank the mod_perl community for mod_perl, it has made
> the development of this project fun! And it has kept me from having to
> go back to C programming. It has been a long time since I last did that.
Thanks for the kind words ;)
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com