Posted to users@httpd.apache.org by Bo Berglund <bo...@telia.com> on 2008/06/14 13:02:22 UTC

[users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

On Sat, 14 Jun 2008 09:32:26 +0200, André Warnier <aw...@ice-sa.com>
wrote:

>> 
>> HTTP/1.x 200 OK
>> Date: Sat, 14 Jun 2008 06:33:12 GMT
>> Server: Apache/2.0.53 (Fedora)
>> Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
>> Etag: "14fc-b9387f40"
>> Accept-Ranges: bytes
>> Content-Length: 5372
>> Cache-Control: no-transform
>> Keep-Alive: timeout=15, max=100
>> Connection: Keep-Alive
>> Content-Type: application/xml
>> Content-Encoding: gzip
>> 
>> ------------------ my test server  ------------------------
>> 
>> HTTP/1.x 200 OK
>> Date: Sat, 14 Jun 2008 06:34:38 GMT
>> Server: Apache/2.0.54 (Win32) PHP/4.4.7
>> Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
>> Etag: "55084-14fc-91160693"
>> Accept-Ranges: bytes
>> Content-Length: 5372
>> Keep-Alive: timeout=15, max=100
>> Connection: Keep-Alive
>> Content-Type: application/x-gzip
>> Content-Encoding: gzip
>> ------------------------------------------------------------------------------------
>> 
>> 
>> In the server responses I see these differences:
>> 
>> Cache-Control: no-transform  (not existing in test server)
>> Content-Type: application/xml
>> 
>> (test server has this instead:)
>> Content-Type: application/x-gzip
>> 
>> How is the tag "Content-Type" set in Apache?
>
>Exactly.  Because in the second case, the browser gets 
>"application/x-gzip" as the content-type, it thinks that what it has 
>received is ok as is, and does not unzip it.
>While in the first case, because it gets "application/xml", it "knows" 
>that the content is really xml, and that it must unzip it first.
>
>So now we must find what, in the first server, sets the content-type 
>that way.
>One more question : on the first server, is the original file on disk 
>already gzipped, or is it in xml (unzipped) format on the disk ?
>
>Since I don't have the configuration of the first server, I'm trying to 
>guess what it exactly does before it sends out the response.  It could 
>be taking an xml file, and gzipping it on-the-fly, before it sends it in 
>the response.
>Or else, it could be "cheating", taking the already gzipped file from 
>disk, and sending it as is, but "falsifying the headers" to tell the 
>browser to unzip it.
>It may be as simple as adding (or replacing) some line
>AddType application/xml .xml.gz
>

I changed httpd.conf like this:

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddType application/xml .xml.gz
    AddEncoding gzip .gz
    AddType text/xml .xml
    AddType text/html .shtml
</Directory>


But FireFox still offers to save the file rather than decompressing
and showing the xml like it does from the original server:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:39:58 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:19:12 GMT
Etag: "5b091-13b0-8d04e669"
Accept-Ranges: bytes
Content-Length: 5040
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
----------------------------------------------------------

With this change:
<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddType application/xml .xml.gz
    AddType text/xml .xml
    AddType text/html .shtml
</Directory>


I get this instead:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:41:43 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:19:30 GMT
Etag: "5b225-1277-8e1f670e"
Accept-Ranges: bytes
Content-Length: 4727
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
----------------------------------------------------------

With this in place I started looking elsewhere in httpd.conf and found
this line, which I commented out:

AddType application/x-gzip .gz .tgz


What happened now is that FireFox displays an error message:

XML Parsing Error: not well-formed
Location: http://polaris/xmltv/svt1.svt.se_2008-06-15.xml.gz
Line Number 1, Column 1

and the headers now are:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:48:07 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:18:36 GMT
Etag: "5ae5a-169d-8aea1e6f"
Accept-Ranges: bytes
Content-Length: 5789
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml
----------------------------------------------------------

Probably now FireFox does not realize that the data are gzipped
anymore and tries to parse the binary compressed stream, which
obviously fails...
Have to re-enable this directive...

Bo Berglund


---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by André Warnier <aw...@ice-sa.com>.
Bo Berglund wrote:
> 
> And now the headers become this when I access a xml.gz link:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 16:55:40 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:18:16 GMT
> Etag: "5ac36-159b-89b5a184"
> Accept-Ranges: bytes
> Content-Length: 5531
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: text/xml
> Content-Encoding: gzip
> 
> And FireFox displays the *contents* of the gz file rather than offer
> to save it!
> BINGO!
> 
> A *really* great THANK YOU! for helping me out!
> 
You're welcome.
My own satisfaction is that now you understand *why* it's happening.
So if something later doesn't work anymore, you can fix it.
For the full story, read : 
http://httpd.apache.org/docs/2.2/en/mod/mod_mime.html
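
As a rough annotated sketch of why the final setup in this thread works (the comments are one reading of mod_mime, not text from the original posts): mod_mime looks at each dot-separated extension of the file name separately, so .gz can supply the encoding while .xml supplies the type. Judging by the "Content-Type: text/xml" header reported in the thread, the "AddType application/xml .xml.gz" line does not seem to match anything, so it is left out here.

```apache
# General section: keep this commented out. With it active, .gz files are
# typed as application/x-gzip and the browser offers to save them.
# AddType application/x-gzip .gz .tgz

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    # .gz contributes the encoding; the thread shows "Content-Encoding: gzip"
    AddEncoding x-gzip .gz
    # .xml contributes the media type; the thread shows "Content-Type: text/xml"
    AddType text/xml .xml
</Directory>
```

With this in place a file such as svt1.svt.se_2008-06-15.xml.gz should be labelled as gzip-encoded XML instead of as a gzip archive.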




[users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by Bo Berglund <bo...@telia.com>.
On Sat, 14 Jun 2008 13:59:38 +0200, André Warnier <aw...@ice-sa.com>
wrote:

>
>Add the following directive to the above section :
>  AddEncoding x-gzip .gz
>
>and try again
>
>> 
...
>> 
>> Probably now FireFox does not realize that the data are gzipped
>> anymore and tries to parse the binary compressed stream, which
>> obviously fails...
>
>Yes.  Because the server tells Firefox that the document is "text/xml" 
>and Firefox believes it.  That is the right thing to do for Firefox, 
>according to the corresponding Internet RFCs.
>(Unfortunately, that's not what IE does, but that is a whole separate 
>story, which I hope we don't have to get into).
>
>> Have to re-enable this directive...
>
>No.  Leave this one commented out :
> > AddType application/x-gzip .gz .tgz
>
>But add what I indicated above to your Directory section :
>  AddEncoding x-gzip .gz
>
>Note : I am also "fishing" to find the right settings.
>But you have to do this systematically, without getting lost about what 
>you add/remove, otherwise we will not know anymore.
>The important part is what the server sends as headers with the HTTP 
>response.
>We must get to a situation where it sends :
>Content-Type: text/xml  (or application/xml ?)
>Content-Encoding: gzip  (or x-gzip ?)
>
>So that Firefox knows that it is XML, but that it is gzipped.
>

I think we/you got it now! :-)

I did this in the httpd.conf:

Commented out this in the general section:
# AddType application/x-gzip .gz .tgz

Made the directory look like this:

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddEncoding x-gzip .gz
    AddType application/xml .xml.gz
    AddType text/xml .xml
</Directory>



And now the headers become this when I access a xml.gz link:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 16:55:40 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:18:16 GMT
Etag: "5ac36-159b-89b5a184"
Accept-Ranges: bytes
Content-Length: 5531
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml
Content-Encoding: gzip

And FireFox displays the *contents* of the gz file rather than offer
to save it!
BINGO!

A *really* great THANK YOU! for helping me out!
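
If the goal is to match the original server's "Content-Type: application/xml" for the pre-compressed files only, one possible variation is to force the type for file names ending in .xml.gz. This is an untested assumption, not something tried in this thread; it relies on the core ForceType directive overriding the type that mod_mime picks from the .xml extension.

```apache
<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    # Encoding still comes from the .gz extension
    AddEncoding x-gzip .gz
    # Plain .xml files keep text/xml
    AddType text/xml .xml
    # Hypothetical addition: only names ending in .xml.gz get application/xml
    <FilesMatch "\.xml\.gz$">
        ForceType application/xml
    </FilesMatch>
</Directory>
```

Checking the response headers again after a restart would show whether the type really changes to application/xml.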

Bo Berglund




Re: [users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by André Warnier <aw...@ice-sa.com>.

Bo Berglund wrote:
> On Sat, 14 Jun 2008 09:32:26 +0200, André Warnier <aw...@ice-sa.com>
> wrote:
> 
>>> HTTP/1.x 200 OK
>>> Date: Sat, 14 Jun 2008 06:33:12 GMT
>>> Server: Apache/2.0.53 (Fedora)
>>> Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
>>> Etag: "14fc-b9387f40"
>>> Accept-Ranges: bytes
>>> Content-Length: 5372
>>> Cache-Control: no-transform
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: application/xml
>>> Content-Encoding: gzip
>>>
>>> ------------------ my test server  ------------------------
>>>
>>> HTTP/1.x 200 OK
>>> Date: Sat, 14 Jun 2008 06:34:38 GMT
>>> Server: Apache/2.0.54 (Win32) PHP/4.4.7
>>> Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
>>> Etag: "55084-14fc-91160693"
>>> Accept-Ranges: bytes
>>> Content-Length: 5372
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: application/x-gzip
>>> Content-Encoding: gzip
>>> ------------------------------------------------------------------------------------
>>>
>>>
>>> In the server responses I see these differences:
>>>
>>> Cache-Control: no-transform  (not existing in test server)
>>> Content-Type: application/xml
>>>
>>> (test server has this instead:)
>>> Content-Type: application/x-gzip
>>>
>>> How is the tag "Content-Type" set in Apache?
>> Exactly.  Because in the second case, the browser gets 
>> "application/x-gzip" as the content-type, it thinks that what it has 
>> received is ok as is, and does not unzip it.
>> While in the first case, because it gets "application/xml", it "knows" 
>> that the content is really xml, and that it must unzip it first.
>>
>> So now we must find what, in the first server, sets the content-type 
>> that way.
>> One more question : on the first server, is the original file on disk 
>> already gzipped, or is it in xml (unzipped) format on the disk ?
>>
>> Since I don't have the configuration of the first server, I'm trying to 
>> guess what it exactly does before it sends out the response.  It could 
>> be taking an xml file, and gzipping it on-the-fly, before it sends it in 
>> the response.
>> Or else, it could be "cheating", taking the already gzipped file from 
>> disk, and sending it as is, but "falsifying the headers" to tell the 
>> browser to unzip it.
>> It may be as simple as adding (or replacing) some line
>> AddType application/xml .xml.gz
>>
> 
> I changed httpd.conf like this:
> 
> <Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
>     Options Indexes MultiViews Includes
>     AllowOverride None
>     Order allow,deny
>     Allow from all
>     AddType application/xml .xml.gz
>     AddEncoding gzip .gz
>     AddType text/xml .xml
>     AddType text/html .shtml
> </Directory>
> 
> 
> But FireFox still offers to save the file rather than decompressing
> and showing the xml like it does from the original server:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:39:58 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:19:12 GMT
> Etag: "5b091-13b0-8d04e669"
> Accept-Ranges: bytes
> Content-Length: 5040
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> Content-Encoding: gzip
> ----------------------------------------------------------
> 
> With this change:
> <Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
>     Options Indexes MultiViews Includes
>     AllowOverride None
>     Order allow,deny
>     Allow from all
>     AddType application/xml .xml.gz
>     AddType text/xml .xml
>     AddType text/html .shtml
> </Directory>
> 

Add the following directive to the above section :
  AddEncoding x-gzip .gz

and try again

> 
> I get this instead:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:41:43 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:19:30 GMT
> Etag: "5b225-1277-8e1f670e"
> Accept-Ranges: bytes
> Content-Length: 4727
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> ----------------------------------------------------------
> 
> With this in place I started looking elsewhere in httpd.conf and found
> this line, which I commented out:
> 
> AddType application/x-gzip .gz .tgz
> 
> 
> What happened now is that FireFox displays an error message:
> 
> XML Parsing Error: not well-formed
> Location: http://polaris/xmltv/svt1.svt.se_2008-06-15.xml.gz
> Line Number 1, Column 1
> 
> and the headers now are:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:48:07 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:18:36 GMT
> Etag: "5ae5a-169d-8aea1e6f"
> Accept-Ranges: bytes
> Content-Length: 5789
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: text/xml
> ----------------------------------------------------------
> 
> Probably now FireFox does not realize that the data are gzipped
> anymore and tries to parse the binary compressed stream, which
> obviously fails...

Yes.  Because the server tells Firefox that the document is "text/xml" 
and Firefox believes it.  That is the right thing to do for Firefox, 
according to the corresponding Internet RFCs.
(Unfortunately, that's not what IE does, but that is a whole separate 
story, which I hope we don't have to get into).

> Have to re-enable this directive...

No.  Leave this one commented out :
 > AddType application/x-gzip .gz .tgz

But add what I indicated above to your Directory section :
  AddEncoding x-gzip .gz

Note : I am also "fishing" to find the right settings.
But you have to do this systematically, without getting lost about what 
you add/remove, otherwise we will not know anymore.
The important part is what the server sends as headers with the HTTP 
response.
We must get to a situation where it sends :
Content-Type: text/xml  (or application/xml ?)
Content-Encoding: gzip  (or x-gzip ?)

So that Firefox knows that it is XML, but that it is gzipped.
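
For the other possibility raised earlier in the thread, where the server keeps plain .xml on disk and gzips it on the fly, a minimal sketch would use mod_deflate. This assumes mod_deflate is available in the build; the LoadModule path is an assumption, and none of this was tested on the servers discussed here.

```apache
# Assumption: mod_deflate is shipped as a loadable module at this path.
LoadModule deflate_module modules/mod_deflate.so

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    # Uncompressed .xml files are typed as XML ...
    AddType text/xml .xml
    # ... and compressed while being sent, for clients that announce
    # gzip support in their Accept-Encoding request header.
    AddOutputFilterByType DEFLATE text/xml application/xml
</Directory>
```

In that setup the files on disk stay as readable XML, and the Content-Encoding header is only added when the response is actually compressed.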
