Posted to users@httpd.apache.org by "Zembower, Kevin" <kz...@jhuccp.org> on 2006/10/05 21:19:44 UTC
[users@httpd] Help with rewrite for errors?
I have a number of documents in HTML files like this:
www.popline.org/docs/0784/045796.html
www.popline.org/docs/0429/209471.html
www.popline.org/docs/0003/690206.html
In most of these records, the link is broken (as it is in these three
examples). This is the result of old URLs still being indexed by Google.
However, in these three cases, the original document can be found by
removing the four-digit directory and the '.html' suffix, like this:
www.popline.org/docs/045796
www.popline.org/docs/209471
www.popline.org/docs/690206
Because of the nature of our system, these resolve correctly.
Can anyone help me with a set of RewriteRules that will, whenever a 404
error is generated, transform the URL as indicated and resubmit it?
Here are the current Rewrite rules in my system:
RewriteEngine on
RewriteLog /var/www/popline/logs/rewrite.log
#Turn off rewritelog with level 0. 2 is useful/normal.
RewriteLogLevel 0
RewriteRule ^/docs$ /docs/index.html
RewriteRule ^/docs/$ /docs/index.html
RewriteRule ^/docs/index.* - [L] #If this matches, don't do any rewriting
RewriteRule ^/error/.* - [L] #If this matches, don't do any rewriting, so error pages come up correctly
RewriteRule ^/404.shtml - [L] #If this matches, don't do any rewriting, so error pages come up correctly
RewriteRule ^/docs/sitemap.* - [L] #If this matches, don't do any rewriting. For Google sitemap program
RewriteRule ^/docs/[0-9]{4}/[0-9]{6}\.html - [L] #If this matches, don't do any rewriting
#Note that in RewriteRule below, must use %3F for '?' after
#'icswppro.dll'. '?' has special meaning in Rewrite substitutions.
RewriteRule ^/docs/([0-9]{6})$ http://db.jhuccp.org/ics-wpd/exec/icswppro.dll?BU=http://db.jhuccp.org/ics-wpd/exec/icswppro.dll&QF0=DocNo&QI0=$1&TN=Popline&AC=QBE_QUERY&MR=30\%DL=1&&RL=1&&RF=LongRecordDisplay&DF=LongRecordDisplay [P]
RewriteRule ^/docs/[0-9]{4}.* - [L] #If this matches, don't do any rewriting
RewriteRule ^/.*$ http://db.jhuccp.org/ics-wpd/popweb/basic.html [R,L]
Here's an example from the current rewrite log of a 404 generation:
10.253.200.90 - - [05/Oct/2006:15:08:01 --0400]
[www.popline.org/sid#8275268][rid#82e1570/initial] (2) init rewrite
engine with requested uri /docs/0784/045796.html
10.253.200.90 - - [05/Oct/2006:15:08:01 --0400]
[www.popline.org/sid#8275268][rid#82e1570/initial] (1) pass through
/docs/0784/045796.html
10.253.200.90 - - [05/Oct/2006:15:08:01 --0400]
[www.popline.org/sid#8275268][rid#82e2d30/initial/redir#1] (2) init
rewrite engine with requested uri /404.shtml
10.253.200.90 - - [05/Oct/2006:15:08:01 --0400]
[www.popline.org/sid#8275268][rid#82e2d30/initial/redir#1] (1) pass
through /404.shtml
Here's an earlier excerpt from the rewrite log, before I filtered out
the 'HTTP_NOT_FOUND' information:
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e3760/initial] (2) init rewrite
engine with requested uri /docs/0211/772369.html
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e3760/initial] (1) pass through
/docs/0211/772369.html
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e5070/initial/redir#1] (2) init
rewrite engine with requested uri /error/HTTP_NOT_FOUND.html.var
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e5070/initial/redir#1] (2) rewrite
/error/HTTP_NOT_FOUND.html.var ->
http://db.jhuccp.org/ics-wpd/popweb/basic.html
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e5070/initial/redir#1] (2)
explicitly forcing redirect with
http://db.jhuccp.org/ics-wpd/popweb/basic.html
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e5070/initial/redir#1] (1) escaping
http://db.jhuccp.org/ics-wpd/popweb/basic.html for redirect
10.253.200.90 - - [04/Oct/2006:11:55:43 --0400]
[www.popline.org/sid#8270170][rid#82e5070/initial/redir#1] (1) redirect
to http://db.jhuccp.org/ics-wpd/popweb/basic.html [REDIRECT/302]
My question is not so much how to transform the submitted URL into the
one without the directory and '.html'. Instead, I don't understand how
to detect the 404 condition and then invoke the rewrite rule.
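One direction I have not tested: rather than reacting to the 404 after
the fact, a RewriteCond could check up front whether the requested file
exists, and redirect only when it does not. The sketch below assumes
that the files under /docs/ are served directly from the DocumentRoot
(no Alias), and that it is placed above my existing
'^/docs/[0-9]{4}/[0-9]{6}\.html - [L]' rule, which would otherwise stop
processing first. The R=301 flag is my own choice, in the hope that
Google eventually drops the old URLs.

```apache
# Untested sketch. Assumption: /docs/ maps straight onto DocumentRoot (no Alias).
# In server/virtual-host context the filename has not been resolved yet, so
# the path is built by hand and tested for being a regular file.
# If the file is missing, capture the six-digit document number and drop
# the four-digit directory and the '.html' suffix:
#   /docs/0784/045796.html  ->  /docs/045796
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^/docs/[0-9]{4}/([0-9]{6})\.html$ /docs/$1 [R=301,L]
```

If this works as I expect, a request for /docs/0784/045796.html whose
file is missing would be redirected to /docs/045796, which my existing
'^/docs/([0-9]{6})$' rule would then proxy to the database.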
Thanks in advance for all your help and suggestions.
-Kevin
Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland 21202
410-659-6139