Posted to users@httpd.apache.org by Bo Berglund <bo...@telia.com> on 2008/06/14 08:40:49 UTC

[users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

On Sat, 14 Jun 2008 00:47:03 +0200, André Warnier <aw...@ice-sa.com>
wrote:

>
>The above (the response from the server) means that your browser will 
>serve the object from its cache, so it doesn't tell us much.
>
>Clear the browser cache, get the same URL from server1 again.
>(Or press SHIFT and click the reload/refresh button of the browser).
>Then clear the browser cache again, and get it again from server2.
>Re-post the results here.
>

After clearing the FireFox cache:

----------------- live site ----------------------------
http://xmltv.tvsajten.com/xmltv/svt1.svt.se_2008-06-19.xml.gz

GET /xmltv/svt1.svt.se_2008-06-19.xml.gz HTTP/1.1
Host: xmltv.tvsajten.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept:
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://polaris/xmltv/GetXMLTVFiles.html

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 06:33:12 GMT
Server: Apache/2.0.53 (Fedora)
Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
Etag: "14fc-b9387f40"
Accept-Ranges: bytes
Content-Length: 5372
Cache-Control: no-transform
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/xml
Content-Encoding: gzip

------------------ my test server  ------------------------
http://polaris/xmltv/svt1.svt.se_2008-06-19.xml.gz

GET /xmltv/svt1.svt.se_2008-06-19.xml.gz HTTP/1.1
Host: polaris
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept:
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://polaris/xmltv/index.html

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 06:34:38 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
Etag: "55084-14fc-91160693"
Accept-Ranges: bytes
Content-Length: 5372
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
------------------------------------------------------------------------------------


In the server responses I see these differences:

Cache-Control: no-transform  (not existing in test server)
Content-Type: application/xml

(test server has this instead:)
Content-Type: application/x-gzip
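
A comparison like this can also be scripted. The following is only a sketch in Python, not something used in this thread: the helper names parse_headers and diff_headers and the VOLATILE set are made up for illustration, and the header text is copied from the two responses above.

```python
# Headers that change per request or per file and say nothing about configuration.
VOLATILE = {"date", "last-modified", "etag", "server"}


def parse_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.strip().splitlines():
        name, sep, value = line.partition(":")
        if sep:  # the status line has no colon and is skipped
            headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(a, b):
    """Return {name: (value_in_a, value_in_b)} for headers that differ."""
    out = {}
    for name in sorted((a.keys() | b.keys()) - VOLATILE):
        if a.get(name) != b.get(name):
            out[name] = (a.get(name), b.get(name))
    return out


LIVE = """
HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 06:33:12 GMT
Server: Apache/2.0.53 (Fedora)
Accept-Ranges: bytes
Content-Length: 5372
Cache-Control: no-transform
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/xml
Content-Encoding: gzip
"""

TEST = """
HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 06:34:38 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Accept-Ranges: bytes
Content-Length: 5372
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
"""

DIFF = diff_headers(parse_headers(LIVE), parse_headers(TEST))
for name, (live, test) in DIFF.items():
    print(f"{name}: live={live!r} test={test!r}")
```

Only Cache-Control and Content-Type are reported, which matches the differences listed above.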

How is the tag "Content-Type" set in Apache?

Bo Berglund


---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by André Warnier <aw...@ice-sa.com>.
Bo Berglund wrote:
> 
> And now the headers become this when I access an .xml.gz link:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 16:55:40 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:18:16 GMT
> Etag: "5ac36-159b-89b5a184"
> Accept-Ranges: bytes
> Content-Length: 5531
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: text/xml
> Content-Encoding: gzip
> 
> And FireFox displays the *contents* of the gz file rather than offering
> to save it!
> BINGO!
> 
> A *really* great THANK YOU! for helping me out!
> 
You're welcome.
My own satisfaction is that now you understand *why* it's happening.
So if something later doesn't work anymore, you can fix it.
For the full story, read : 
http://httpd.apache.org/docs/2.2/en/mod/mod_mime.html




[users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by Bo Berglund <bo...@telia.com>.
On Sat, 14 Jun 2008 13:59:38 +0200, André Warnier <aw...@ice-sa.com>
wrote:

>
>Add the following directive to the above section :
>  AddEncoding x-gzip .gz
>
>and try again
>
>> 
...
>> 
>> Probably now FireFox does not realize that the data are gzipped
>> anymore and tries to parse the binary compressed stream, which
>> obviously fails...
>
>Yes.  Because the server tells Firefox that the document is "text/xml" 
>and Firefox believes it.  That is the right thing to do for Firefox, 
>according to the corresponding Internet RFCs.
>(Unfortunately, that's not what IE does, but that is a whole separate 
>story, which I hope we don't have to get into).
>
>> Have to re-enable this directive...
>
>No.  Leave this one commented out :
> > AddType application/x-gzip .gz .tgz
>
>But add what I indicated above to your Directory section :
>  AddEncoding x-gzip .gz
>
>Note : I am also "fishing" to find the right settings.
>But you have to do this systematically, without getting lost about what 
>you add/remove, otherwise we will not know anymore.
>The important part is what the server sends as headers with the HTTP 
>response.
>We must get to a situation where it sends :
>Content-Type: text/xml  (or application/xml ?)
>Content-Encoding: gzip  (or x-gzip ?)
>
>So that Firefox knows that it is XML, but that it is gzipped.
>

I think we/you got it now! :-)

I did this in the httpd.conf:

Commented out this in the general section:
# AddType application/x-gzip .gz .tgz

Made the directory look like this:

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddEncoding x-gzip .gz
    AddType application/xml .xml.gz
    AddType text/xml .xml
</Directory>



And now the headers become this when I access an .xml.gz link:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 16:55:40 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:18:16 GMT
Etag: "5ac36-159b-89b5a184"
Accept-Ranges: bytes
Content-Length: 5531
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml
Content-Encoding: gzip
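
Why these directives produce these headers can be modelled roughly as follows. This Python sketch is a simplified, assumed model of how mod_mime resolves extensions, not its actual code. It assumes that each dot-separated part of the file name is looked up on its own, that the rightmost matching extension wins for the type, and that an "x-" prefix on the encoding is dropped when the client asked for the bare form. The names build_maps, client_form and resolve are hypothetical.

```python
def build_maps(directives):
    """Turn (directive, value, ext, ...) tuples into two extension -> value maps."""
    types, encodings = {}, {}
    for name, value, *exts in directives:
        target = types if name == "AddType" else encodings
        for ext in exts:
            # Assumption: the leading dot is dropped and case is ignored.
            target[ext.lower().lstrip(".")] = value
    return types, encodings


def client_form(encoding, accept_encoding):
    """Prefer the spelling (gzip vs x-gzip) that the client sent, if any."""
    bare = encoding[2:] if encoding.startswith("x-") else encoding
    for token in accept_encoding.split(","):
        token = token.split(";")[0].strip().lower()
        if token in (bare, "x-" + bare):
            return token
    return encoding


def resolve(filename, directives, accept_encoding="gzip,deflate"):
    """Return (Content-Type, Content-Encoding) under the simplified model."""
    types, encodings = build_maps(directives)
    content_type, content_encoding = None, None
    # Everything after the first dot is treated as a list of single extensions.
    for ext in filename.lower().split(".")[1:]:
        if ext in types:
            content_type = types[ext]  # a match further right overrides
        if ext in encodings:
            content_encoding = client_form(encodings[ext], accept_encoding)
    return content_type, content_encoding


NAME = "svt1.svt.se_2008-06-19.xml.gz"
GLOBAL_GZ = [("AddType", "application/x-gzip", ".gz", ".tgz")]
DIRECTORY = [
    ("AddType", "application/xml", ".xml.gz"),
    ("AddType", "text/xml", ".xml"),
]
GZ_ENCODING = [("AddEncoding", "x-gzip", ".gz")]

# Global AddType for .gz still active, no AddEncoding.
print(resolve(NAME, GLOBAL_GZ + DIRECTORY))
# Global AddType commented out, still no AddEncoding.
print(resolve(NAME, DIRECTORY))
# Global AddType commented out, AddEncoding added (the configuration above).
print(resolve(NAME, DIRECTORY + GZ_ENCODING))
```

Under this model the two-part key ".xml.gz" never matches a single extension, which would be consistent with the Content-Type coming back as text/xml rather than application/xml. The three calls reproduce the header combinations reported in this thread for the corresponding configurations.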

And FireFox displays the *contents* of the gz file rather than offering
to save it!
BINGO!

A *really* great THANK YOU! for helping me out!

Bo Berglund




Re: [users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by André Warnier <aw...@ice-sa.com>.

Bo Berglund wrote:
> On Sat, 14 Jun 2008 09:32:26 +0200, André Warnier <aw...@ice-sa.com>
> wrote:
> 
>>> HTTP/1.x 200 OK
>>> Date: Sat, 14 Jun 2008 06:33:12 GMT
>>> Server: Apache/2.0.53 (Fedora)
>>> Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
>>> Etag: "14fc-b9387f40"
>>> Accept-Ranges: bytes
>>> Content-Length: 5372
>>> Cache-Control: no-transform
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: application/xml
>>> Content-Encoding: gzip
>>>
>>> ------------------ my test server  ------------------------
>>>
>>> HTTP/1.x 200 OK
>>> Date: Sat, 14 Jun 2008 06:34:38 GMT
>>> Server: Apache/2.0.54 (Win32) PHP/4.4.7
>>> Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
>>> Etag: "55084-14fc-91160693"
>>> Accept-Ranges: bytes
>>> Content-Length: 5372
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: application/x-gzip
>>> Content-Encoding: gzip
>>> ------------------------------------------------------------------------------------
>>>
>>>
>>> In the server responses I see these differences:
>>>
>>> Cache-Control: no-transform  (not existing in test server)
>>> Content-Type: application/xml
>>>
>>> (test server has this instead:)
>>> Content-Type: application/x-gzip
>>>
>>> How is the tag "Content-Type" set in Apache?
>> Exactly.  Because in the second case, the browser gets 
>> "application/x-gzip" as the content-type, it thinks that what it has 
>> received is ok as is, and does not unzip it.
>> While in the first case, because it gets "application/xml", it "knows" 
>> that the content is really xml, and that it must unzip it first.
>>
>> So now we must find what, in the first server, sets the content-type 
>> that way.
>> One more question : on the first server, is the original file on disk 
>> already gzipped, or is it in xml (unzipped) format on the disk ?
>>
>> Since I don't have the configuration of the first server, I'm trying to 
>> guess what it exactly does before it sends out the response.  It could 
>> be taking an xml file, and gzipping it on-the-fly, before it sends it in 
>> the response.
>> Or else, it could be "cheating", taking the already gzipped file from 
>> disk, and sending it as is, but "falsifying the headers" to tell the 
>> browser to unzip it.
>> It may be as simple as adding (or replacing) some line
>> AddType application/xml .xml.gz
>>
> 
> I changed httpd.conf like this:
> 
> <Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
>     Options Indexes MultiViews Includes
>     AllowOverride None
>     Order allow,deny
>     Allow from all
>     AddType application/xml .xml.gz
>     AddEncoding gzip .gz
>     AddType text/xml .xml
>     AddType text/html .shtml
> </Directory>
> 
> 
> But FireFox still offers to save the file rather than decompressing
> and showing the xml like it does from the original server:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:39:58 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:19:12 GMT
> Etag: "5b091-13b0-8d04e669"
> Accept-Ranges: bytes
> Content-Length: 5040
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> Content-Encoding: gzip
> ----------------------------------------------------------
> 
> With this change:
> <Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
>     Options Indexes MultiViews Includes
>     AllowOverride None
>     Order allow,deny
>     Allow from all
>     AddType application/xml .xml.gz
>     AddType text/xml .xml
>     AddType text/html .shtml
> </Directory>
> 

Add the following directive to the above section :
  AddEncoding x-gzip .gz

and try again

> 
> I get this instead:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:41:43 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:19:30 GMT
> Etag: "5b225-1277-8e1f670e"
> Accept-Ranges: bytes
> Content-Length: 4727
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> ----------------------------------------------------------
> 
> With this in place I started looking elsewhere in httpd.conf and found
> this line, which I commented out:
> 
> AddType application/x-gzip .gz .tgz
> 
> 
> What happened now is that FireFox displays an error message:
> 
> XML Parsing Error: not well-formed
> Location: http://polaris/xmltv/svt1.svt.se_2008-06-15.xml.gz
> Line Number 1, Column 1
> 
> and the headers now are:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:48:07 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:18:36 GMT
> Etag: "5ae5a-169d-8aea1e6f"
> Accept-Ranges: bytes
> Content-Length: 5789
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: text/xml
> ----------------------------------------------------------
> 
> Probably now FireFox does not realize that the data are gzipped
> anymore and tries to parse the binary compressed stream, which
> obviously fails...

Yes.  Because the server tells Firefox that the document is "text/xml" 
and Firefox believes it.  That is the right thing to do for Firefox, 
according to the corresponding Internet RFCs.
(Unfortunately, that's not what IE does, but that is a whole separate 
story, which I hope we don't have to get into).
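
To illustrate the point, here is a small Python sketch. It is an illustration only and not how Firefox is implemented: the function name parse_xml_response and the sample XML are invented. It undoes the gzip encoding only when Content-Encoding says so, and then parses the result as XML.

```python
import gzip
import xml.etree.ElementTree as ET


def parse_xml_response(headers, body):
    """Decode the body per Content-Encoding, then parse it as XML."""
    encoding = headers.get("Content-Encoding", "").lower()
    if encoding in ("gzip", "x-gzip"):
        body = gzip.decompress(body)
    # Without a Content-Encoding header the bytes are parsed exactly as received.
    return ET.fromstring(body)


XML = b'<?xml version="1.0" encoding="UTF-8"?><tv><programme/></tv>'
STORED = gzip.compress(XML)  # stands in for the .xml.gz file on disk

# Type and encoding both announced: the client can unzip first, then parse.
root = parse_xml_response(
    {"Content-Type": "text/xml", "Content-Encoding": "gzip"}, STORED
)
print(root.tag)

# Type announced but encoding missing: the compressed bytes go to the parser.
failed = False
try:
    parse_xml_response({"Content-Type": "text/xml"}, STORED)
except ET.ParseError as exc:
    failed = True
    print("parse failed:", exc)
```

The second call fails at the very start of the input, which is the same kind of failure as the "not well-formed" error at line 1, column 1 reported above.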

> Have to re-enable this directive...

No.  Leave this one commented out :
 > AddType application/x-gzip .gz .tgz

But add what I indicated above to your Directory section :
  AddEncoding x-gzip .gz

Note : I am also "fishing" to find the right settings.
But you have to do this systematically, without getting lost about what 
you add/remove, otherwise we will not know anymore.
The important part is what the server sends as headers with the HTTP 
response.
We must get to a situation where it sends :
Content-Type: text/xml  (or application/xml ?)
Content-Encoding: gzip  (or x-gzip ?)

So that Firefox knows that it is XML, but that it is gzipped.



[users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by Bo Berglund <bo...@telia.com>.
On Sat, 14 Jun 2008 09:32:26 +0200, André Warnier <aw...@ice-sa.com>
wrote:

>> 
>> HTTP/1.x 200 OK
>> Date: Sat, 14 Jun 2008 06:33:12 GMT
>> Server: Apache/2.0.53 (Fedora)
>> Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
>> Etag: "14fc-b9387f40"
>> Accept-Ranges: bytes
>> Content-Length: 5372
>> Cache-Control: no-transform
>> Keep-Alive: timeout=15, max=100
>> Connection: Keep-Alive
>> Content-Type: application/xml
>> Content-Encoding: gzip
>> 
>> ------------------ my test server  ------------------------
>> 
>> HTTP/1.x 200 OK
>> Date: Sat, 14 Jun 2008 06:34:38 GMT
>> Server: Apache/2.0.54 (Win32) PHP/4.4.7
>> Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
>> Etag: "55084-14fc-91160693"
>> Accept-Ranges: bytes
>> Content-Length: 5372
>> Keep-Alive: timeout=15, max=100
>> Connection: Keep-Alive
>> Content-Type: application/x-gzip
>> Content-Encoding: gzip
>> ------------------------------------------------------------------------------------
>> 
>> 
>> In the server responses I see these differences:
>> 
>> Cache-Control: no-transform  (not existing in test server)
>> Content-Type: application/xml
>> 
>> (test server has this instead:)
>> Content-Type: application/x-gzip
>> 
>> How is the tag "Content-Type" set in Apache?
>
>Exactly.  Because in the second case, the browser gets 
>"application/x-gzip" as the content-type, it thinks that what it has 
>received is ok as is, and does not unzip it.
>While in the first case, because it gets "application/xml", it "knows" 
>that the content is really xml, and that it must unzip it first.
>
>So now we must find what, in the first server, sets the content-type 
>that way.
>One more question : on the first server, is the original file on disk 
>already gzipped, or is it in xml (unzipped) format on the disk ?
>
>Since I don't have the configuration of the first server, I'm trying to 
>guess what it exactly does before it sends out the response.  It could 
>be taking an xml file, and gzipping it on-the-fly, before it sends it in 
>the response.
>Or else, it could be "cheating", taking the already gzipped file from 
>disk, and sending it as is, but "falsifying the headers" to tell the 
>browser to unzip it.
>It may be as simple as adding (or replacing) some line
>AddType application/xml .xml.gz
>

I changed httpd.conf like this:

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddType application/xml .xml.gz
    AddEncoding gzip .gz
    AddType text/xml .xml
    AddType text/html .shtml
</Directory>


But FireFox still offers to save the file rather than decompressing
and showing the xml like it does from the original server:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:39:58 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:19:12 GMT
Etag: "5b091-13b0-8d04e669"
Accept-Ranges: bytes
Content-Length: 5040
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
----------------------------------------------------------

With this change:
<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddType application/xml .xml.gz
    AddType text/xml .xml
    AddType text/html .shtml
</Directory>


I get this instead:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:41:43 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:19:30 GMT
Etag: "5b225-1277-8e1f670e"
Accept-Ranges: bytes
Content-Length: 4727
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
----------------------------------------------------------

With this in place I started looking elsewhere in httpd.conf and found
this line, which I commented out:

AddType application/x-gzip .gz .tgz


What happened now is that FireFox displays an error message:

XML Parsing Error: not well-formed
Location: http://polaris/xmltv/svt1.svt.se_2008-06-15.xml.gz
Line Number 1, Column 1

and the headers now are:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:48:07 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:18:36 GMT
Etag: "5ae5a-169d-8aea1e6f"
Accept-Ranges: bytes
Content-Length: 5789
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml
----------------------------------------------------------

Probably now FireFox does not realize that the data are gzipped
anymore and tries to parse the binary compressed stream, which
obviously fails...
Have to re-enable this directive...

Bo Berglund




Re: [users@httpd] Re: How to configure Apache 2 to compress xml files on serving?

Posted by André Warnier <aw...@ice-sa.com>.
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 06:33:12 GMT
> Server: Apache/2.0.53 (Fedora)
> Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
> Etag: "14fc-b9387f40"
> Accept-Ranges: bytes
> Content-Length: 5372
> Cache-Control: no-transform
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/xml
> Content-Encoding: gzip
> 
> ------------------ my test server  ------------------------
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 06:34:38 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
> Etag: "55084-14fc-91160693"
> Accept-Ranges: bytes
> Content-Length: 5372
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> Content-Encoding: gzip
> ------------------------------------------------------------------------------------
> 
> 
> In the server responses I see these differences:
> 
> Cache-Control: no-transform  (not existing in test server)
> Content-Type: application/xml
> 
> (test server has this instead:)
> Content-Type: application/x-gzip
> 
> How is the tag "Content-Type" set in Apache?

Exactly.  Because in the second case, the browser gets 
"application/x-gzip" as the content-type, it thinks that what it has 
received is ok as is, and does not unzip it.
While in the first case, because it gets "application/xml", it "knows" 
that the content is really xml, and that it must unzip it first.

So now we must find what, in the first server, sets the content-type 
that way.
One more question : on the first server, is the original file on disk 
already gzipped, or is it in xml (unzipped) format on the disk ?

Since I don't have the configuration of the first server, I'm trying to 
guess what it exactly does before it sends out the response.  It could 
be taking an xml file, and gzipping it on-the-fly, before it sends it in 
the response.
Or else, it could be "cheating", taking the already gzipped file from 
disk, and sending it as is, but "falsifying the headers" to tell the 
browser to unzip it.
It may be as simple as adding (or replacing) some line
AddType application/xml .xml.gz

André


> 
> Bo Berglund
> 
> 
