Posted to modperl@perl.apache.org by Alexander Charbonnet <al...@charbonnet.com> on 2005/08/26 10:10:14 UTC

Custom deflation gzip filter [mp2]

Hi,

I'm in a situation where I need the ability to flush the output of my CGI 
script to the client, so that it can display a partial page, and I also need 
to use gzip compression (only if the client supports it, of course).  This 
could be done in Apache 1 by using Dynagzip.

I couldn't make the built-in Deflate work, because it ignores flush commands 
and only writes output when its buffer is full.

So I wrote a new filter that generates gzipped output, and pushes data to the 
client upon a flush.  The filter is active only when it detects (and 
disables) the built-in Deflate filter, the idea being that the conditions for 
gzipping could be "borrowed" from existing Deflate options.  It turned out to 
be a lot smaller and easier than I thought it would be.  Even so, it's 
probably too big to post here; see it at: 
http://www.charbonnet.com/ChunkGzip.pm
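
For readers who want to see the flushing idea in isolation, here is a minimal standalone sketch. It is not taken from ChunkGzip.pm and leaves out the Apache filter plumbing entirely; it only assumes Compress::Zlib is installed. The helper name compress_and_flush is hypothetical. Each chunk is deflated and then followed by a Z_SYNC_FLUSH, so the bytes produced so far can be decoded by the receiver without waiting for the end of the stream.

```perl
use strict;
use warnings;
use Compress::Zlib;   # exports deflateInit, inflateInit, MAX_WBITS, Z_SYNC_FLUSH, Z_OK

# Raw deflate stream (negative WindowBits suppresses the zlib header).
my ($deflater, $dstatus) = deflateInit(-WindowBits => -MAX_WBITS);
die "deflateInit failed" unless $dstatus == Z_OK;

# A second object plays the role of the browser decoding the stream.
my ($inflater, $istatus) = inflateInit(-WindowBits => -MAX_WBITS);
die "inflateInit failed" unless $istatus == Z_OK;

# Hypothetical helper: compress one chunk, then force out everything
# the compressor has buffered so far.
sub compress_and_flush {
    my ($chunk) = @_;
    my ($out, $status) = $deflater->deflate($chunk);
    die "deflate failed" unless $status == Z_OK;
    my ($rest, $fstatus) = $deflater->flush(Z_SYNC_FLUSH);
    die "flush failed" unless $fstatus == Z_OK;
    return $out . $rest;
}

# First chunk: what would be sent to the client at the first flush.
my $wire1 = compress_and_flush("<p>starting</p>");
my $copy1 = $wire1;                      # inflate() consumes its argument
my ($seen1) = $inflater->inflate($copy1);

# Second chunk, sent later on the same stream.
my $wire2 = compress_and_flush("<p>finished</p>");
my $copy2 = $wire2;
my ($seen2) = $inflater->inflate($copy2);

print "after first flush:  $seen1\n";
print "after second flush: $seen2\n";
```

The cost of flushing this way is a slightly worse compression ratio, since every sync flush adds a few bytes to the stream.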

It appears to work flawlessly with Gecko browsers.  The problem I'm having is 
with Internet Explorer 6.x.  Everything appears to be working normally (it 
displays the data up to each flush) until the page actually finishes 
downloading, at which time IE reloads the page and starts the script all over 
again, for a reason I don't understand.

Here's an extremely simple example of a script that might take advantage of 
this functionality which, only when viewed in IE, displays the odd reloading 
behavior:

use CGI qw(:standard);
my $r = shift;
print
    header,
    start_html,
    p('starting'),
  ;
$r->rflush;
sleep 3;    #"doing work"
print
    p('finished'),
    end_html,
  ;

I'm running the Debian Sarge versions of everything: Apache 2.0.54-4, mod_perl 
1.999.21-1.

This is my first attempt at an Apache filter, so there may simply be a glaring 
mistake or omission that's causing this.  I'd appreciate any advice you have!

Thanks,
Alex
alexander@charbonnet.com

Re: Custom deflation gzip filter [mp2]

Posted by Alexander Charbonnet <al...@charbonnet.com>.
Wow, I hadn't realized there was such a big change that didn't make it to 
Sarge.  What a mess.  Fortunately, there appears to be a backport; that'll 
make things easy.  Thanks for letting me know; that would have been a nasty 
surprise.

Unfortunately, it doesn't seem like that would cause my IE problem, but I guess 
it's possible.  I'll give it a try tomorrow.


On Friday 26 August 2005 03:42 am, Philip M. Gollucci wrote:
> Alexander Charbonnet wrote:
> > I'm running the Debian Sarge versions of everything: Apache 2.0.54-4,
> > mod_perl 1.999.21-1.
>
> I'd update to something after RC5 so that you don't use an unsupported API
> of mod_perl2.
>
> see:
> http://perl.apache.org/docs/2.0/rename.html

Re: Custom deflation gzip filter [mp2] [SOLVED]

Posted by Alexander Charbonnet <al...@charbonnet.com>.
In case anyone cares, I finally realized what the difference is between these 
two code snippets: context.  In list context, flush() returns a list 
containing the output as well as the status code.  In scalar context, flush() 
returns just the output.  The arguments to join (and to print), of course, are 
evaluated in list context, so the status code ended up in the output.  
Kind of a "duh" moment.
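
The effect is easy to reproduce without zlib at all. The sketch below uses a hypothetical stand-in, fake_flush, that mimics a context-sensitive flush(): it returns (output, status) in list context and the output alone in scalar context. The strings "DATA" and "TRAILER" and the status value 0 are made up for illustration; how a real status value stringifies depends on the library.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a context-sensitive flush():
# (output, status) in list context, output alone in scalar context.
sub fake_flush {
    my ($output, $status) = ("DATA", 0);
    return wantarray ? ($output, $status) : $output;
}

# Called directly inside join's argument list: list context, so the
# status value is concatenated into the result as well.
my $broken = join '', fake_flush(), "TRAILER";

# Assigned to a scalar first: scalar context, output only.
my $final_output = fake_flush();
my $working = join '', $final_output, "TRAILER";

print "$broken\n";     # DATA0TRAILER
print "$working\n";    # DATATRAILER
```

Wrapping the call as scalar(fake_flush()) inside the join would force scalar context too, without the temporary variable.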


On Saturday 27 August 2005 01:18 pm, Malcolm J Harwood wrote:
> On Friday 26 August 2005 06:25 pm, Alexander Charbonnet wrote:
> > It apparently works after I played with the code for the final flush. 
> > I'm not sure why, though.  There was only one change (below).  Anybody
> > see a significant difference?
> >
> >
> > In any case, I'll take it, since it works now.  :-)
> >
> >
> > ------Original (broken) code------------
> >         $f->print(join '',
> >                   $state_ref->{'handle'}->flush(),
> >                   pack("V V",
> >                            crc32($state_ref->{'body'}),
> >                            length($state_ref->{'body'})),
> >                 );
> > -------------------------------------------
>
> My guess is that the above evaluates the pack before the flush in order to
> pass the results to join(), so it's sending out the wrong length/crc.
>
> > ------Working code----------------------
> >         $final_output = $state_ref->{'handle'}->flush();
> >
> >         $f->print(join '',
> >                   $final_output,
> >                   pack("V V",
> >                            crc32($state_ref->{'body'}),
> >                            length($state_ref->{'body'})),
> >                 );
> > ------------------------------------------
>
> This forces the flush to be evaluated first.

Re: Custom deflation gzip filter [mp2]

Posted by Alexander Charbonnet <al...@charbonnet.com>.
Sounds plausible, except that the crc and length are computed based on the 
uncompressed copy of the original text (in $state_ref->{'body'}), which is 
unchanged by the flush.
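
To make the point about the trailer concrete, here is a rough standalone sketch of assembling a complete gzip member by hand with Compress::Zlib. It is not the ChunkGzip.pm code: the $body text and the minimal 10-byte header are assumptions chosen for illustration. The trailer is the CRC-32 and the length of the uncompressed data, packed as two 32-bit little-endian values, so it does not depend on what the compressor has or has not flushed.

```perl
use strict;
use warnings;
use Compress::Zlib;   # exports deflateInit, crc32, MAX_WBITS, Z_OK

my $body = "<p>starting</p><p>finished</p>";   # uncompressed text

# Minimal gzip header: magic bytes, CM=8 (deflate), no flags,
# mtime 0, XFL 0, OS 255 (unknown).
my $header = pack("C4 V C2", 0x1f, 0x8b, 8, 0, 0, 0, 255);

# Raw deflate stream; gzip supplies its own header and trailer.
my ($d, $status) = deflateInit(-WindowBits => -MAX_WBITS);
die "deflateInit failed" unless $status == Z_OK;

# List assignment to a single variable keeps the output, drops the status.
my ($compressed) = $d->deflate($body);
my ($tail)       = $d->flush();

# Trailer: CRC-32 and length of the *uncompressed* data.
my $trailer = pack("V V", crc32($body), length($body));

my $gzip = join '', $header, $compressed, $tail, $trailer;

# Round-trip check; pass a copy in case the buffer is modified.
my $copy      = $gzip;
my $roundtrip = Compress::Zlib::memGunzip($copy);
print defined $roundtrip && $roundtrip eq $body ? "ok\n" : "mismatch\n";
```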


On Saturday 27 August 2005 01:18 pm, Malcolm J Harwood wrote:
> On Friday 26 August 2005 06:25 pm, Alexander Charbonnet wrote:
> > It apparently works after I played with the code for the final flush. 
> > I'm not sure why, though.  There was only one change (below).  Anybody
> > see a significant difference?
> >
> >
> > In any case, I'll take it, since it works now.  :-)
> >
> >
> > ------Original (broken) code------------
> >         $f->print(join '',
> >                   $state_ref->{'handle'}->flush(),
> >                   pack("V V",
> >                            crc32($state_ref->{'body'}),
> >                            length($state_ref->{'body'})),
> >                 );
> > -------------------------------------------
>
> My guess is that the above evaluates the pack before the flush in order to
> pass the results to join(), so it's sending out the wrong length/crc.
>
> > ------Working code----------------------
> >         $final_output = $state_ref->{'handle'}->flush();
> >
> >         $f->print(join '',
> >                   $final_output,
> >                   pack("V V",
> >                            crc32($state_ref->{'body'}),
> >                            length($state_ref->{'body'})),
> >                 );
> > ------------------------------------------
>
> This forces the flush to be evaluated first.

Re: Custom deflation gzip filter [mp2]

Posted by Malcolm J Harwood <mj...@liminalflux.net>.
On Friday 26 August 2005 06:25 pm, Alexander Charbonnet wrote:

> It apparently works after I played with the code for the final flush.  I'm
> not sure why, though.  There was only one change (below).  Anybody see a
> significant difference?


> In any case, I'll take it, since it works now.  :-)
>
>
> ------Original (broken) code------------
>         $f->print(join '',
>                   $state_ref->{'handle'}->flush(),
>                   pack("V V",
>                            crc32($state_ref->{'body'}),
>                            length($state_ref->{'body'})),
>                 );
> -------------------------------------------

My guess is that the above evaluates the pack before the flush in order to 
pass the results to join(), so it's sending out the wrong length/crc. 

> ------Working code----------------------
>         $final_output = $state_ref->{'handle'}->flush();
>
>         $f->print(join '',
>                   $final_output,
>                   pack("V V",
>                            crc32($state_ref->{'body'}),
>                            length($state_ref->{'body'})),
>                 );
> ------------------------------------------

This forces the flush to be evaluated first.


-- 
Crowley [the demon] had been extremely impressed with the warranties
offered by the computer industry, and had in fact sent a bundle Below
to the department that drew up the Immortal Soul agreements, with a
yellow memo form attached just saying: "Learn, guys."
- Good Omens, Terry Pratchett & Neil Gaiman

Re: Custom deflation gzip filter [mp2]

Posted by Alexander Charbonnet <al...@charbonnet.com>.
It apparently works after I played with the code for the final flush.  I'm not 
sure why, though.  There was only one change (below).  Anybody see a 
significant difference?

In any case, I'll take it, since it works now.  :-)


------Original (broken) code------------
        $f->print(join '',
                  $state_ref->{'handle'}->flush(),
                  pack("V V",
                           crc32($state_ref->{'body'}),
                           length($state_ref->{'body'})),
                );
-------------------------------------------


------Working code----------------------
        $final_output = $state_ref->{'handle'}->flush();

        $f->print(join '',
                  $final_output,
                  pack("V V",
                           crc32($state_ref->{'body'}),
                           length($state_ref->{'body'})),
                );
------------------------------------------


On Friday 26 August 2005 03:11 pm, Alexander Charbonnet wrote:
> Okay, I've tried the new filter on a Gentoo system: Apache 2.0.54-r13,
> mod_perl 2.0.1-r2.  It helped me to port the filter to the correct API
> (which only required a couple of changes, thankfully), but it didn't fix
> the weird IE problem.
>
> I know I'm not alone in wanting this feature (being able to flush
> compressed output), since wherever CGI and compression are discussed, this
> is the main disadvantage.  If we could get this filter working it would
> give mod_perl a big advantage over PHP for applications where this is
> needed.
>
> The updated filter resides at: http://www.charbonnet.com/ChunkGzip.pm
>
> I modified my test script (below) to print the time, as well as to do an
> additional flush just to be sure.
>
> Here's what happens in my IE window:
>
> prints "starting 1125086461"
> pauses 3 seconds
> prints "finished 1125086464"; this is where everything should stop
> status displays "Web site found, waiting for reply"
> screen clears
> prints "starting 1125086464"
> done
>
> The browser got the finished output, and everything was fine for a split
> second, until it decided to reload the page and only print part of the
> output the second time.
>
> Is there something I'm not setting?  Is there a special IE keepalive or
> something that could fix this?  It does still work perfectly in Gecko and
> KHTML.
>
> Thanks again for your help,
> Alex
>
>
> Test script:
> ---------------
> use CGI qw(:standard);
> my $r = shift;
> print
>     header,
>     start_html,
>     p('starting '.time),
>   ;
> $r->rflush;
> sleep 3;    #"doing work"
> print
>     p('finished '.time),
>     end_html,
>   ;
> $r->rflush;
>
> On Friday 26 August 2005 03:42 am, you wrote:
> > Alexander Charbonnet wrote:
> > > I'm running the Debian Sarge versions of everything: Apache 2.0.54-4,
> > > mod_perl 1.999.21-1.
> >
> > I'd update to something after RC5 so that you don't use an unsupported
> > API of mod_perl2.
> >
> > see:
> > http://perl.apache.org/docs/2.0/rename.html

Re: Custom deflation gzip filter [mp2]

Posted by Alexander Charbonnet <al...@charbonnet.com>.
Okay, I've tried the new filter on a Gentoo system: Apache 2.0.54-r13, 
mod_perl 2.0.1-r2.  It helped me to port the filter to the correct API (which 
only required a couple of changes, thankfully), but it didn't fix the weird 
IE problem.

I know I'm not alone in wanting this feature (being able to flush compressed 
output), since wherever CGI and compression are discussed, this is the main 
disadvantage.  If we could get this filter working it would give mod_perl a 
big advantage over PHP for applications where this is needed.

The updated filter resides at: http://www.charbonnet.com/ChunkGzip.pm

I modified my test script (below) to print the time, as well as to do an 
additional flush just to be sure.

Here's what happens in my IE window:

prints "starting 1125086461"
pauses 3 seconds
prints "finished 1125086464"; this is where everything should stop
status displays "Web site found, waiting for reply"
screen clears
prints "starting 1125086464"
done

The browser got the finished output, and everything was fine for a split 
second, until it decided to reload the page and only print part of the output 
the second time.

Is there something I'm not setting?  Is there a special IE keepalive or 
something that could fix this?  It does still work perfectly in Gecko and 
KHTML.

Thanks again for your help,
Alex


Test script:
---------------
use CGI qw(:standard);
my $r = shift;
print
    header,
    start_html,
    p('starting '.time),
  ;
$r->rflush;
sleep 3;    #"doing work"
print
    p('finished '.time),
    end_html,
  ;
$r->rflush;


On Friday 26 August 2005 03:42 am, you wrote:
> Alexander Charbonnet wrote:
> > I'm running the Debian Sarge versions of everything: Apache 2.0.54-4,
> > mod_perl 1.999.21-1.
>
> I'd update to something after RC5 so that you don't use an unsupported API
> of mod_perl2.
>
> see:
> http://perl.apache.org/docs/2.0/rename.html

Re: Custom deflation gzip filter [mp2]

Posted by "Philip M. Gollucci" <pg...@p6m7g8.com>.
Alexander Charbonnet wrote:
> I'm running the Debian Sarge versions of everything: Apache 2.0.54-4, mod_perl 
> 1.999.21-1.
I'd update to something after RC5 so that you don't use an unsupported API of 
mod_perl2.

see:
http://perl.apache.org/docs/2.0/rename.html

