Posted to dev@httpd.apache.org by Devendra Singh <ds...@indiamart.com> on 2005/04/21 06:34:54 UTC

mod_cache caching the 301 Moved Permanently

Hi,

I am writing to the Developer List because I did not get any response on 
the Users List and thought that the topic might be relevant to the dev list.

If a request comes in for a directory without a trailing slash, the
response gets cached and subsequent requests see:

Moved Permanently
The document has moved here

when trying to access the URL:
http://www.beach-clothing.com/where-to-buy

The setup consists of two Apache 2.0.53 instances. The one running on port
80 has mod_disk_cache enabled and proxies requests to the one on port 8080,
with ProxyPassReverse configured.

Here is my Virtual Host Config:

Apache on Port 80:
<VirtualHost 66.235.181.69>
ServerAlias beach-clothing.com www.beach-clothing.com
ServerName www.beach-clothing.com
ServerAdmin webmaster@indiamart.com
DocumentRoot /var/www/public_html/beach-clothing
ServerPath /beach-clothing
RewriteEngine On
RewriteCond %{HTTP_HOST}   !^www\.beach-clothing\.com [NC]
RewriteCond %{HTTP_HOST}   !^$
RewriteRule ^/(.*)         http://www.beach-clothing.com/$1 [L,R=301]
#Enable Caching
CacheDisable /index.html
CacheEnable disk /
RewriteCond %{SERVER_PORT} ^25$
RewriteRule /* - [F]
RewriteRule ^(.*)\/$ $1/index.html [P]
RewriteRule \.(gif|jpg|png|txt|css|js|ico|swf)$ - [last]
RewriteRule ^/(.*)$ http://www.beach-clothing.com:8080/$1 [proxy]
ProxyPassReverse / http://www.beach-clothing.com:8080/
</VirtualHost>

Apache on port 8080:
<VirtualHost 66.235.181.69>
ServerAlias beach-clothing.com www.beach-clothing.com
ServerName www.beach-clothing.com
ServerAdmin webmaster@indiamart.com
DocumentRoot /var/www/public_html/beach-clothing
#Enable Caching for 5 Days (432000 seconds)
<Location /cgi>
         ExpiresActive off
</Location>
<FilesMatch "\.(mp|pl|cgi)$">
         ExpiresActive off
</FilesMatch>
<Location />
         ExpiresActive on
         ExpiresByType text/html A432000
</Location>
</VirtualHost>

UseCanonicalName is Off in the server config section of both Apache
instances.

How do I solve this problem? Does anyone have a clue or pointer?

Thanks.

Devendra Singh


______________________________________________________
Devendra Singh
IndiaMART InterMESH Limited
(Global Gateway to Indian Market Place)
B-1, Sector 8, Noida, UP - 201301, India
EPABX : +91-120-2424945, +91-120-3094634, +91-9810646342
Fax: +91-120-2424943
http://www.indiamart.com
http://www.indiangiftsportal.com
http://www.indiantravelportal.com
______________________________________________________ 


Re: mod_cache caching the 301 Moved Permanently

Posted by Devendra Singh <ds...@indiamart.com>.
At 22/04/2005 11:21 (), Olaf van der Spek wrote:
>On 4/22/05, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> > On Thu, Apr 21, 2005 at 10:04:54AM +0530, Devendra Singh wrote:
> > > Hi,
> > >
> > > I am writing to the Developer List because I did not get any response on
> > > the Users List and thought that the topic might be relevant to the dev list.
> > >
> > > If a request comes for a directory w/o trailing slash, it gets cached and
> > > the subsequent requests see:
> > >
> > > Moved Permanently
> > > The document has moved here
> > >
> > > for try to access the URL:
> > > http://www.beach-clothing.com/where-to-buy
> >
> > I don't get it.  What's your problem?  -- justin
>
>The 'here' link is to http://www.beach-clothing.com:8080/where-to-buy/
>while he wants it to be to http://www.beach-clothing.com/where-to-buy/

When the front-end Apache (running on port 80) gets a request for
http://www.beach-clothing.com/where-to-buy (note the missing trailing
slash), it issues a 301 Moved Permanently to
http://www.beach-clothing.com/where-to-buy/ (note the added trailing
slash). The front-end Apache proxies this request to the back-end Apache
running on port 8080, where the target becomes
http://www.beach-clothing.com:8080/where-to-buy/, and this response is
cached by the front-end Apache. Subsequent requests for this URL get
http://www.beach-clothing.com:8080/where-to-buy/ from the cache.

This is not what I expected. I therefore feel that 301 responses should
either not be proxied or not be cached at all.

Thanks.

Devendra Singh


Re: mod_cache caching the 301 Moved Permanently

Posted by r....@t-online.de.

Sander Striker wrote:
> r.pluem@t-online.de wrote:
> 
>> The problem seems to be, that the proxied backend server that is
>> cached via mod_disk_cache originally
>> delivers HTTP status 301 and the Location
>> http://www.beach-clothing.com/where-to-buy/, but once cached
>> mod_disk_cache delivers HTTP status 200 instead of 301 (but correctly
>> redelivering the Location header).
>> I have not proved this for myself so far, but this seems the problem
>> to me.
> 
> 
> This wouldn't surprise me one bit.  The 2.1 branch has seen quite a bit
> of churn in this
> area.
> 
> Any chance you could give 2.1 a go and see if that works correctly?

Sorry, I do not have enough time right now to check this on a live system. But I think
what is missing is something like the following (untested) patch (done against 2.0.54,
but the code in trunk is the same in this part):


--- mod_cache.c.orig    2005-04-11 17:47:03.000000000 +0200
+++ mod_cache.c 2005-04-22 23:43:19.000000000 +0200
@@ -220,6 +220,8 @@ static int cache_out_filter(ap_filter_t
     ap_log_error(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, r->server,
                  "cache: running CACHE_OUT filter");

+    /* restore status of cached response */
+    r->status = cache->handle->cache_obj->info.status;
     /* recall_headers() was called in cache_select_url() */
     cache->provider->recall_body(cache->handle, r->pool, bb);

I found no place in the code where the saved status code is restored to the response,
so I think this could be the right place to do so.
Could somebody who is more familiar with the code cross-check this? Sander? Justin?
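
The effect described above can be illustrated with a small standalone C model. This is
only a sketch and not httpd code: the structs and the functions `cache_store` and
`cache_replay` are hypothetical stand-ins, and the assumption that a replayed request
starts out with status 200 is part of the model. It shows how a replay that copies the
headers but not the saved status would turn a cached 301 into a 200 that still carries
a Location header.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the httpd structures; only the fields that
 * the discussion mentions are modelled. */
struct cached_info { int status; char location[128]; };
struct request     { int status; char location[128]; };

/* Save the response status and Location header in the cache entry. */
void cache_store(struct cached_info *info, const struct request *r)
{
    assert(info && r);
    info->status = r->status;
    strncpy(info->location, r->location, sizeof info->location - 1);
    info->location[sizeof info->location - 1] = '\0';
}

/* Replay a cached entry into a new request. The model assumes the new
 * request starts at 200; unless the saved status is copied back
 * (restore_status != 0), the original 301 is lost. */
void cache_replay(const struct cached_info *info, struct request *r,
                  int restore_status)
{
    assert(info && r);
    r->status = 200;
    strncpy(r->location, info->location, sizeof r->location - 1);
    r->location[sizeof r->location - 1] = '\0';
    if (restore_status)
        r->status = info->status;   /* the step the patch adds */
}
```

Calling `cache_replay` with `restore_status` set to 0 corresponds to the behaviour
suspected in the thread; setting it to 1 corresponds to the proposed one-line fix.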

> 
> Sander

Regards

Rüdiger





Re: mod_cache caching the 301 Moved Permanently

Posted by Sander Striker <st...@apache.org>.
r.pluem@t-online.de wrote:
> The problem seems to be, that the proxied backend server that is cached via mod_disk_cache originally
> delivers HTTP status 301 and the Location http://www.beach-clothing.com/where-to-buy/, but once cached
> mod_disk_cache delivers HTTP status 200 instead of 301 (but correctly redelivering the Location header).
> I have not proved this for myself so far, but this seems the problem to me.

This wouldn't surprise me one bit.  The 2.1 branch has seen quite a bit of churn in this
area.

Any chance you could give 2.1 a go and see if that works correctly?
 
Sander

Re: mod_cache caching the 301 Moved Permanently

Posted by r....@t-online.de.

Olaf van der Spek wrote:
> On 4/22/05, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> 

[..cut..]

>>
>>I don't get it.  What's your problem?  -- justin
> 
>  
> The 'here' link is to http://www.beach-clothing.com:8080/where-to-buy/
> while he wants it to be to http://www.beach-clothing.com/where-to-buy/
> 
> 

His problem is a different one. If you request http://www.beach-clothing.com/where-to-buy
(without the slash), you get the following headers (first my request, then the server's reply):

GET /where-to-buy HTTP/1.1
Host: www.beach-clothing.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050319
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: de,en;q=0.8,de-de;q=0.5,en-gb;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Date: Fri, 22 Apr 2005 20:39:58 GMT
Server: Apache
Content-Type: text/html; charset=iso-8859-1
Location: http://www.beach-clothing.com/where-to-buy/
Cache-Control: max-age=432000
Expires: Tue, 26 Apr 2005 04:58:38 GMT
Content-Length: 256
Age: 142880
Keep-Alive: timeout=3, max=20
Connection: Keep-Alive


The problem seems to be that the proxied backend server, whose responses are cached via
mod_disk_cache, originally delivers HTTP status 301 and the Location
http://www.beach-clothing.com/where-to-buy/, but once the response is cached,
mod_disk_cache delivers HTTP status 200 instead of 301 (while correctly redelivering the
Location header). I have not verified this myself so far, but this looks like the problem to me.

I think the difference between the 'here' link and the Location header can be explained by the
ProxyPassReverse directive, which rewrites the Location header but not the HTML code in the error page.
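
To make this concrete, here is a minimal C sketch of a prefix-based rewrite applied to a
Location value. It is not the mod_proxy implementation: `reverse_map` and its parameters
are hypothetical names chosen for this example. The idea is only that the mapping is
applied to the header value, so a URL embedded in the HTML body of the error page stays
as the backend wrote it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an httpd API): if `value` starts with the
 * backend prefix, write the front-end prefix plus the remainder into
 * `out`; otherwise copy `value` unchanged.
 * Returns 0 on success, -1 if `out` is too small.
 */
int reverse_map(const char *value, const char *backend,
                const char *frontend, char *out, size_t outlen)
{
    size_t blen;
    int n;

    assert(value && backend && frontend && out);
    blen = strlen(backend);
    if (strncmp(value, backend, blen) == 0)
        n = snprintf(out, outlen, "%s%s", frontend, value + blen);
    else
        n = snprintf(out, outlen, "%s", value);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Applied to a Location value of http://www.beach-clothing.com:8080/where-to-buy/ with the
backend prefix http://www.beach-clothing.com:8080/ and the front-end prefix
http://www.beach-clothing.com/, the sketch produces the port-less URL. The body text is
never passed through it, which matches the mismatch seen in the 'here' link.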

Regards

Rüdiger




Re: mod_cache caching the 301 Moved Permanently

Posted by Olaf van der Spek <ol...@gmail.com>.
On 4/22/05, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> On Thu, Apr 21, 2005 at 10:04:54AM +0530, Devendra Singh wrote:
> > Hi,
> >
> > I am writing to the Developer List because I did not get any response on
> > the Users List and thought that the topic might be relevant to the dev list.
> >
> > If a request comes for a directory w/o trailing slash, it gets cached and
> > the subsequent requests see:
> >
> > Moved Permanently
> > The document has moved here
> >
> > for try to access the URL:
> > http://www.beach-clothing.com/where-to-buy
> 
> I don't get it.  What's your problem?  -- justin
 
The 'here' link is to http://www.beach-clothing.com:8080/where-to-buy/
while he wants it to be to http://www.beach-clothing.com/where-to-buy/

Re: mod_cache caching the 301 Moved Permanently

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Thu, Apr 21, 2005 at 10:04:54AM +0530, Devendra Singh wrote:
> Hi,
> 
> I am writing to the Developer List because I did not get any response on 
> the Users List and thought that the topic might be relevant to the dev list.
> 
> If a request comes for a directory w/o trailing slash, it gets cached and 
> the subsequent requests see:
> 
> Moved Permanently
> The document has moved here
> 
> for try to access the URL:
> http://www.beach-clothing.com/where-to-buy

I don't get it.  What's your problem?  -- justin

Re: mod_cache caching the 301 Moved Permanently

Posted by Devendra Singh <ds...@indiamart.com>.
At 21/04/05 10:04 (), Devendra Singh wrote:
>Hi,
>
>I am writing to the Developer List because I did not get any response on 
>the Users List and thought that the topic might be relevant to the dev list.
>
>If a request comes for a directory w/o trailing slash, it gets cached and 
>the subsequent requests see:
>
>Moved Permanently
>The document has moved here
>
>for try to access the URL:
>http://www.beach-clothing.com/where-to-buy
>
>The configuration has two apache 2.0.53. The one running on port 80 has 
>mod_disk_cache which sends requests to 8080 via ProxyPassReverse.
>
>Here is my Virtual Host Config:
>
>Apache on Port 80:
><VirtualHost 66.235.181.69>
>ServerAlias beach-clothing.com www.beach-clothing.com
>ServerName www.beach-clothing.com
>ServerAdmin webmaster@indiamart.com
>DocumentRoot /var/www/public_html/beach-clothing
>ServerPath /beach-clothing
>RewriteEngine On
>RewriteCond %{HTTP_HOST}   !^www\.beach-clothing\.com [NC]
>RewriteCond %{HTTP_HOST}   !^$
>RewriteRule ^/(.*)         http://www.beach-clothing.com/$1 [L,R=301]
>#Enable Caching
>CacheDisable /index.html
>CacheEnable disk /
>RewriteCond %{SERVER_PORT} ^25$
>RewriteRule /* - [F]
>RewriteRule ^(.*)\/$ $1/index.html [P]
>RewriteRule \.(gif|jpg|png|txt|css|js|ico|swf)$ - [last]
>RewriteRule ^/(.*)$ http://www.beach-clothing.com:8080/$1 [proxy]
>ProxyPassReverse / http://www.beach-clothing.com:8080/
></VirtualHost>
>
>Apache on port 8080:
><VirtualHost 66.235.181.69>
>ServerAlias beach-clothing.com www.beach-clothing.com
>ServerName www.beach-clothing.com
>ServerAdmin webmaster@indiamart.com
>DocumentRoot /var/www/public_html/beach-clothing
>#Enable Caching for 5 Days (432000 seconds)
><Location /cgi>
>         ExpiresActive off
></Location>
><FilesMatch "\.(mp|pl|cgi)$">
>         ExpiresActive off
></FilesMatch>
><Location />
>         ExpiresActive on
>         ExpiresByType text/html A432000
></Location>
></VirtualHost>
>
>UseCanonicalName is off into the Server Config Section of both the Apache.
>
>How do I solve this problem? If anyone has any clue / pointer?

Hi,

Sorry for replying to my own post.

I modified "modules/experimental/mod_cache.c" (around line 374) to
stop HTTP_MOVED_PERMANENTLY responses from being cached, and the problem
was solved.

     /*
      * what responses should we not cache?
      *
      * At this point we decide based on the response headers whether it
      * is appropriate _NOT_ to cache the data from the server. There are
      * a whole lot of conditions that prevent us from caching this data.
      * They are tested here one by one to be clear and unambiguous.
      */
     if (r->status != HTTP_OK && r->status != HTTP_NON_AUTHORITATIVE
         && r->status != HTTP_MULTIPLE_CHOICES
         && r->status != HTTP_MOVED_PERMANENTLY  /* <-- remove this line */
         && r->status != HTTP_NOT_MODIFIED) {
         reason = apr_psprintf(p, "Response status %d", r->status);
     }

Please comment: is this the right approach?
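
For reference, the effect of the change can be sketched as a standalone C function. This
is only an illustration of the status check quoted above, not the mod_cache code itself:
`status_is_cacheable`, the `cache_301` switch, and the `STATUS_*` constants are
hypothetical names, and the many other conditions that mod_cache tests are left out.

```c
#include <assert.h>

/* Hypothetical constants mirroring the status codes named in the check. */
enum {
    STATUS_OK                = 200,
    STATUS_NON_AUTHORITATIVE = 203,
    STATUS_MULTIPLE_CHOICES  = 300,
    STATUS_MOVED_PERMANENTLY = 301,
    STATUS_NOT_MODIFIED      = 304
};

/*
 * Returns 1 if a response with this status passes the status check,
 * 0 if it would be rejected with "Response status %d".
 * cache_301 != 0 models the original check; cache_301 == 0 models the
 * check with the HTTP_MOVED_PERMANENTLY line removed.
 */
int status_is_cacheable(int status, int cache_301)
{
    assert(status >= 100 && status <= 599);
    if (status == STATUS_MOVED_PERMANENTLY)
        return cache_301 ? 1 : 0;
    return status == STATUS_OK
        || status == STATUS_NON_AUTHORITATIVE
        || status == STATUS_MULTIPLE_CHOICES
        || status == STATUS_NOT_MODIFIED;
}
```

With `cache_301` set to 0, a 301 response is rejected by the check while 200, 203, 300
and 304 are still accepted, which is the behaviour the modification is meant to produce.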

Thanks.

Devendra Singh