Posted to modperl@perl.apache.org by Esteban Fernandez Stafford <es...@webtogo.de> on 2003/09/04 10:25:23 UTC

Content encoding when filtering proxyed pages


Hello all,

I have a machine acting as a proxy using mod_perl-1.99_09 with apache
2.0.46. This proxy is supposed to filter all html content. So far I
have achieved most of my project's goals. But there is one issue I
can't get straight: handling pages that reach the proxy already
content-encoded (as www.google.com does). My first attempt was to DECLINE
filtering such pages, but the $filter->r()->content_encoding() always
gives me 'undef'. Is this something that is not yet implemented or am
I doing something wrong? (See code below) Then I tried looking at
$filter->r()->headers_out()->{'Content-Encoding'} and everything went
just fine!
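
As a rough illustration of the header check described above, here is a
minimal sketch. body_is_encoded is a hypothetical helper, not part of
mod_perl, and it assumes the headers can be read like a hash reference,
as the post does with headers_out. Treating "identity" as unencoded is
an assumption too.

```perl
use strict;
use warnings;

# Hypothetical helper (not a mod_perl API): returns 1 when the headers
# announce a body encoding other than "identity", 0 otherwise.
# $headers is anything readable like a hash reference, for example the
# table returned by $filter->r->headers_out.
sub body_is_encoded {
    my ($headers) = @_;
    my $encoding = $headers->{'Content-Encoding'};

    # No header at all: the body is not encoded.
    return 0 unless defined $encoding;

    # An empty value or "identity" also means no encoding was applied.
    return 0 if $encoding =~ /^\s*(?:identity)?\s*$/i;

    return 1;
}
```

Inside the filter this could guard the early exit, for instance by
calling body_is_encoded($filter->r->headers_out) before setting up the
parser and declining when it returns true. A plain Perl hash is
case-sensitive, so the exact spelling 'Content-Encoding' matters if you
try the helper outside Apache.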

On the other hand, is it possible that I could put mod_deflate before
my filter to get the content already decompressed for my filter to
parse?

   Thanks a lot in advance

I would like to thank the mod_perl community for mod_perl, it has made
the development of this project fun! And it has kept me from having to
go back to C programming. It has been a long time since I last did that.


package WTG::HtmlFilter;

use strict;
use warnings;# FATAL => 'all';

use Apache::RequestRec ();
use Apache::RequestIO ();

use APR::Brigade ();
use APR::Bucket ();

use base qw(Apache::Filter);

use Apache::Const -compile => qw(OK DECLINED M_POST);
use APR::Const -compile => ':common';

use constant READ_SIZE  => 1024;

use HTML::Parser ();

sub handler : FilterRequestHandler
{
   my $filter = shift;
   my $parser;

   # Initialize parser if not already done
   unless ($parser = $filter->ctx)
   {
      # This is the first call of the filter for a particular request
      # Can we filter this request?
      my $type = $filter->r()->content_type();
      if(! defined $type || $type !~ /^text\/html\b/)
      {
         $filter->remove();
         return Apache::DECLINED;
      }
      # This call returns undef
      print STDERR $filter->r()->content_encoding(), "\n";

blah... blah... blah...


                           E             s          t      e    b  a n!


:wq



-- 
Reporting bugs: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html


Re: Content encoding when filtering proxyed pages

Posted by Stas Bekman <st...@stason.org>.
Esteban Fernandez Stafford wrote:
> 
> Hello all,
> 
> I have a machine acting as a proxy using mod_perl-1.99_09 with apache
> 2.0.46. This proxy is supposed to filter all html content. So far I
> have achieved most of my project's goals. But there is one issue I
> can't get straight, this is when the proxy gets a page that is
> encoded (like in www.google.com). My first attempt was to DECLINE
> filtering such pages, but the $filter->r()->content_encoding() always
> gives me 'undef'. Is this something that is not yet implemented or am
> I doing something wrong? (See code below) Then I tried looking at
> $filter->r()->headers_out()->{'Content-Encoding'} and everything went
> just fine!

Looks like it's autogenerated, but probably not working, as I see that the 
corresponding call in the test is commented out:

t/response/TestAPI/request_rec.pm:    #content_encoding

Need to check why that is.

> On the other hand, is it possible that I could put mod_deflate before
> my filter to get the content already decompressed for my filter to
> parse?

Sure, you can do that. The mp2 test suite has examples of how to do that 
(besides the normal Apache docs).
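
For illustration only, a configuration along these lines might
decompress proxied responses before the Perl filter sees them. It
assumes that the mod_deflate in your httpd build provides an INFLATE
output filter. Later httpd releases document one, but whether 2.0.46
does is something to verify. The <Proxy *> scope is also just an
assumption about how the proxy is set up.

```apache
# Sketch only: assumes mod_deflate in this build registers an INFLATE
# output filter; check the mod_deflate documentation for your version.
<Proxy *>
    # Decompress gzip-encoded responses coming back from the origin server
    SetOutputFilter INFLATE

    # Then hand the plain HTML to the filter from the original post
    PerlOutputFilterHandler WTG::HtmlFilter
</Proxy>
```

The order in which the two filters actually run is decided by Apache's
filter types, not only by the order of these lines, so it is worth
confirming that the Perl filter really receives decompressed data.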

>    Thanks a lot in advance
> 
> I would like to thank the mod_perl community for mod_perl, it has made
> the development of this project fun! And it has kept me from having to
> go back to C programming. It was a long time since I last did that.

Thanks for the kind words ;)

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com