Posted to bugs@httpd.apache.org by bu...@apache.org on 2005/05/27 10:59:21 UTC
DO NOT REPLY [Bug 35100] New: -
URL-parsing does not work for www.altavista.com
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=35100>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=35100
Summary: URL-parsing does not work for www.altavista.com
Product: Apache httpd-2.0
Version: 2.0.54
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: mod_proxy
AssignedTo: bugs@httpd.apache.org
ReportedBy: bjoern@cs.tu-berlin.de
It's not possible to use the relatively popular search engine
http://www.altavista.com/
with apache2's mod_proxy* modules.
You can easily reproduce the problem if you
a) type a search word into the search field at
http://www.altavista.com/
b) click on one of the links on the results page
The main problem is that apache-mod_proxy re-encodes the URL. After this
re-encoding, the URL path that is sent on differs from the original path.
Here is an example link from http://de.altavista.com/ (I changed it
slightly, because I do not know whether the URL contains private
information):
http://av.rds.yahoo.com/_ylt=A9ibyDZZCEq4AklmSLaMX;_ylu=X3oDBvNjNnZmYzBHBndANhdl93ZWJfaG9tZQRzZWMDdGFicw--/SIG=11nr22kc/EXP=111216420/**http%3a//de.altavista.com/dir/default
apache-mod_proxy transforms it to (sniffed with ethereal):
GET
/_ylt=A9ibyDZZCEq4AklmSLaMX;_ylu=X3oDBvNjNnZmYzBHBndANhdl93ZWJfaG9tZQRzZWMDdGFicw--/SIG=11nr22kc/EXP=111216420/**http://de.altavista.com/dir/default
HTTP/1.1
Do you see the difference? "http%3a//" is transformed to "http://".
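
The difference can be reproduced outside of Apache. The following is a minimal Python sketch, not mod_proxy's code: it assumes the rewrite behaves like a plain percent-decoding of the path, and the helper name decoded_path is made up for illustration.

```python
from urllib.parse import urlsplit, unquote

# Example link from the report above.
ORIGINAL = (
    "http://av.rds.yahoo.com/_ylt=A9ibyDZZCEq4AklmSLaMX;"
    "_ylu=X3oDBvNjNnZmYzBHBndANhdl93ZWJfaG9tZQRzZWMDdGFicw--"
    "/SIG=11nr22kc/EXP=111216420/**http%3a//de.altavista.com/dir/default"
)


def decoded_path(url):
    """Hypothetical helper: return the URL path with every %XX escape decoded."""
    return unquote(urlsplit(url).path)


original_path = urlsplit(ORIGINAL).path
rewritten_path = decoded_path(ORIGINAL)

# The escaped colon is lost, so the two paths are no longer the same string.
print(original_path == rewritten_path)
print(rewritten_path)
```

Because the colon is a reserved character, the decoded path is not guaranteed to mean the same thing to the origin server as the escaped one.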
The offline browser wwwoffle has the same problem. I wrote a patch for wwwoffle
that preserves "%3a" in URL paths instead of rewriting it to a colon (":").
I'm not familiar with apache2's mod_proxy* code, but preserving "%3a" would
probably also fix the problem in apache2.
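
One way to picture the idea of preserving "%3a" is a normalizer that only decodes escapes of unreserved characters (letters, digits, "-", ".", "_", "~") and leaves every other escape untouched. This is a rough Python sketch under that assumption; it is neither the wwwoffle patch nor Apache code, and the names UNRESERVED and normalize_path are hypothetical.

```python
import re
import string

# Characters that may be safely decoded without changing the URL's meaning.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_path(path):
    """Decode %XX only when it stands for an unreserved character."""

    def replace(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        # Reserved or other characters (":", "/", "?", ...) stay escaped as written.
        return match.group(0)

    return _ESCAPE.sub(replace, path)


print(normalize_path("/EXP=111216420/**http%3a//de.altavista.com/dir/default"))
print(normalize_path("/%7Euser/%41"))
```

With this approach the "%3a" in the example link survives, while harmless escapes such as "%7E" are still decoded.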
--
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org