You are viewing a plain text version of this content. The canonical link for it is here.
Posted to apache-bugdb@apache.org by Jennine Townsend <je...@netcom.com> on 1999/07/27 03:09:02 UTC

other/4776: PHP's parse_url regex fails. It's OK after updating to HS's current regex.

>Number:         4776
>Category:       other
>Synopsis:       PHP's parse_url regex fails.  It's OK after updating to HS's current regex.
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    apache
>State:          open
>Class:          sw-bug
>Submitter-Id:   apache
>Arrival-Date:   Mon Jul 26 18:10:01 PDT 1999
>Last-Modified:
>Originator:     jennine@netcom.com
>Organization:
apache
>Release:        1.3.6
>Environment:
Linux 2.2.4 Alpha, Apache 1.3.6, PHP 3.0.11, egcs-2.90.29 (1.0.3)
>Description:
PHP's parse_url was always failing on Linux Alpha.
This PHP code:
<?php
		$Url="http://www.sun.com/";
		$UrlElements = parse_url($Url);
		if( (empty($UrlElements)) or (!$UrlElements) )
		{
			$errmsg = "is_url: Parse error reading [$Url]";
			print "$errmsg\n";
		}

		$scheme		= $UrlElements[scheme];
		$HostName	= $UrlElements[host];

		if(empty($scheme))
		{
			$errmsg = "is_url: Missing protocol declaration";
			print "$errmsg\n";
		}


		if(empty($HostName))
		{
			$errmsg = "is_url: No hostname in URL";
                        print "$errmsg\n";
		}

		if (!eregi("^(ht|f)tp",$scheme))
		{
			$errmsg = "is_url: No http:// or ftp://";
                        print "$errmsg\n";
		}

		print "Scheme is $scheme, host is $HostName\n";
?>
was coming back with messages like this:
Warning: unable to parse url (http://www.sun.com/) in /usr/local/apache/htdocs/j/php/purl.php3 on line 12
is_url: Parse error reading [http://www.sun.com/] is_url: Missing protocol declaration is_url: No hostname in URL is_url: No http:// or ftp:// Scheme is , host is 

I didn't really decrypt the configure complexities of PHP<->Apache regex
libraries, and I haven't broken regexes in Apache per se, but it looks
like PHP is using the Apache one, and I assume mod_rewrite uses regex
so it's probably a good idea to update it.  BTW, I also filed a bug
against PHP.
>How-To-Repeat:
See above.
>Fix:
I copied down a more recent version of regex from Henry Spencer's ftp site
(ftp.zoo.utoronto.ca) and made the obvious tweaks to replace the 1.3.6 version
with that.  After rebuilding, everything's OK now.  It looks like his version
is years newer, and he mostly mentions portability in the changenotes between
your version and the current one, so it makes sense.  Also, his tests
(make re;make r) pass on Linux Alpha in the newer version.
>Audit-Trail:
>Unformatted:
[In order for any reply to be added to the PR database, you need]
[to include <ap...@Apache.Org> in the Cc line and make sure the]
[subject line starts with the report component and number, with ]
[or without any 'Re:' prefixes (such as "general/1098:" or      ]
["Re: general/1098:").  If the subject doesn't match this       ]
[pattern, your message will be misfiled and ignored.  The       ]
["apbugs" address is not added to the Cc line of messages from  ]
[the database automatically because of the potential for mail   ]
[loops.  If you do not include this Cc, your reply may be ig-   ]
[nored unless you are responding to an explicit request from a  ]
[developer.  Reply only with text; DO NOT SEND ATTACHMENTS!     ]