Posted to users@httpd.apache.org by Olivier Poitrey <ol...@pas-tres.net> on 2004/01/21 15:17:53 UTC
[users@httpd] Apache 1.3 mod_rewrite and regex backreferences
Hello dudes,
I want to use back-references in a RewriteRule regex part (think real
regex back-reference, not RewriteCond back-references like %1) but it
doesn't work. I thought that it was an apache regex engine
limitation, but by reading the source code of the regex engine, I saw
that it should be supported. So my question is: is it normal that the
following RewriteRule doesn't match the "/foo/foo" URI:
    RewriteRule ^/(foo)/\1 /something/...
                        ^^
Is it an Apache regex engine bug?
--
______________________________________________________________________
O l i v i e r P o i t r e y
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
" from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org
Re: [users@httpd] Apache 1.3 mod_rewrite and regex backreferences
Posted by Brian Dessent <br...@dessent.net>.
Olivier Poitrey wrote:
> > What you're referring to is only valid in the "Perl compatible"
> > regular expression flavor. This is not the same as egrep regex
> > flavor. Apache 1.3 uses egrep, 2.0 uses pcre, if I'm not mistaken.
> > So you should be able to do this in 2.0 but not 1.3.
>
> I can read the following in the POSIX regex documentation:
>
> "[...] Finally, there is one new type of atom, a back reference: `\'
> followed by a non-zero decimal digit d matches the same sequence of
> characters matched by the dth parenthesized subexpression (numbering
> subexpressions by the positions of their opening parentheses, left to
> right), so that (e.g.) `\([bc]\)\1' matches `bb' or `cc' but not
> `bc'."
>
> And, if you test my regex with egrep, it works perfectly:
>
> $ echo "/foo/foo"|egrep '^/(foo)/\1'
> /foo/foo
>
> Finally, I found in the Apache regex implementation source code (which
> isn't the same code as the egrep regex engine, as you said) special
> handling for this kind of backreference. I can't figure out why it
> doesn't work anyway.
Well, all I can say is that the POSIX specs for extended regexps seem to
be ambiguous. The FreeBSD manpage (which is what I was looking at)
specifies that egrep doesn't support backreferences (but old-style basic
regexps do.) If you consult
<http://httpd.apache.org/docs/mod/mod_rewrite.html#RewriteRule> it says:
Text:
  .             Any single character
  [chars]       Character class: One of chars
  [^chars]      Character class: None of chars
  text1|text2   Alternative: text1 or text2

Quantifiers:
  ?             0 or 1 of the preceding text
  *             0 or N of the preceding text (N > 0)
  +             1 or N of the preceding text (N > 1)

Grouping:
  (text)        Grouping of text
                (either to set the borders of an alternative or
                for making backreferences where the Nth group can
                be used on the RHS of a RewriteRule with $N)

Anchors:
  ^             Start of line anchor
  $             End of line anchor

Escaping:
  \char         escape that particular char
                (for instance to specify the chars ".[]()" etc.)
...the last part of which implies that \1 means the literal 1. So, I
don't really know what to make of it. Maybe one of the Apache
developers can give you a more specific response. It does seem odd that
backreferences aren't working, though.
Brian
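One way to tell the two readings of \1 apart is to compare what each would match. The sketch below uses Python's `re` module, which is a Perl-style engine and not the regex library Apache 1.3 links against, so it only models the two interpretations; the names `BACKREF`, `LITERAL` and `classify` are made up for illustration.

```python
import re

# Interpretation 1: \1 is a backreference to the first group.
BACKREF = re.compile(r"^/(foo)/\1")
# Interpretation 2: \1 is just an escaped literal "1", as the
# "Escaping" entry in the documentation excerpt would imply.
LITERAL = re.compile(r"^/(foo)/1")

def classify(uri):
    """Report which of the two interpretations matches the URI."""
    return {
        "backreference": BACKREF.search(uri) is not None,
        "literal_1": LITERAL.search(uri) is not None,
    }

for uri in ("/foo/foo", "/foo/1"):
    print(uri, classify(uri))
```

If the literal reading is what the server applies, a request for /foo/1 should trigger the rule while /foo/foo should not, which would be a quick experiment to run against a 1.3 install.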
Re: [users@httpd] Apache 1.3 mod_rewrite and regex backreferences
Posted by Olivier Poitrey <ol...@pas-tres.net>.
Brian Dessent <br...@dessent.net> writes:
> It's not a bug. Using backreferences in the "search" part of the
> regular expression is not valid POSIX (extended) regular expression
> syntax. Check re_format(7). A backslash followed by a number or
> letter is to match that number or letter, as if the backslash wasn't
> there. You can use backreferences only in the "replace" part, not
> the "search" part.
>
> What you're referring to is only valid in the "Perl compatible"
> regular expression flavor. This is not the same as egrep regex
> flavor. Apache 1.3 uses egrep, 2.0 uses pcre, if I'm not mistaken.
> So you should be able to do this in 2.0 but not 1.3.
I can read the following in the POSIX regex documentation:
"[...] Finally, there is one new type of atom, a back reference: `\'
followed by a non-zero decimal digit d matches the same sequence of
characters matched by the dth parenthesized subexpression (numbering
subexpressions by the positions of their opening parentheses, left to
right), so that (e.g.) `\([bc]\)\1' matches `bb' or `cc' but not
`bc'."
And, if you test my regex with egrep, it works perfectly:
$ echo "/foo/foo"|egrep '^/(foo)/\1'
/foo/foo
Finally, I found in the Apache regex implementation source code (which
isn't the same code as the egrep regex engine, as you said) special
handling for this kind of backreference. I can't figure out why it
doesn't work anyway.
Best regards,
--
______________________________________________________________________
O l i v i e r P o i t r e y
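For readers without egrep at hand, the behaviour described in the quoted POSIX text can be reproduced with a minimal Python sketch. Python's `re` module is a Perl-style engine rather than a POSIX one, so this only illustrates what a pattern backreference means; it uses the extended-style form `([bc])\1` in place of the basic-syntax `\([bc]\)\1` from the quote, and the helper name `matches` is hypothetical.

```python
import re

# Perl-style equivalent of the POSIX basic-regex example \([bc]\)\1:
# one of "b" or "c", followed by the same character again.
PATTERN = re.compile(r"([bc])\1")

def matches(text):
    """Return True when the whole text matches PATTERN."""
    return PATTERN.fullmatch(text) is not None

results = {text: matches(text) for text in ("bb", "cc", "bc")}
print(results)
```

The group captures whichever character was actually matched, so "bc" fails because the second character differs from the captured one.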
Re: [users@httpd] Apache 1.3 mod_rewrite and regex backreferences
Posted by Brian Dessent <br...@dessent.net>.
Olivier Poitrey wrote:
> I want to use back-references in a RewriteRule regex part (think real
> regex back-reference, not RewriteCond back-references like %1) but it
> doesn't work. I thought that it was an apache regex engine
> limitation, but by reading the source code of the regex engine, I saw
> that it should be supported. So my question is: is it normal that the
> following RewriteRule doesn't match the "/foo/foo" URI:
>
> RewriteRule ^/(foo)/\1 /something/...
It's not a bug. Using backreferences in the "search" part of the
regular expression is not valid POSIX (extended) regular expression
syntax. Check re_format(7). A backslash followed by a number or letter
is to match that number or letter, as if the backslash wasn't there.
You can use backreferences only in the "replace" part, not the "search"
part.
What you're referring to is only valid in the "Perl compatible" regular
expression flavor. This is not the same as egrep regex flavor. Apache
1.3 uses egrep, 2.0 uses pcre, if I'm not mistaken. So you should be
able to do this in 2.0 but not 1.3.
Brian
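The difference between a backreference in the "replace" part and one in the "search" part can be sketched in Python. This is an illustration only: Python's `re` is a Perl-style engine, `\1` in its replacement string stands in for `$1` on the right-hand side of a RewriteRule, and the function names and the `/something/\1/\2` target are hypothetical.

```python
import re

def rewrite(uri):
    """Mimic a rule that reuses captured groups in the substitution."""
    # \1 and \2 in the *replacement* play the role of $1 and $2 on the
    # right-hand side of a RewriteRule; the target path is made up.
    return re.sub(r"^/(foo)/(.*)", r"/something/\1/\2", uri)

def repeats_first_segment(uri):
    """True when the second path segment repeats the first one."""
    # Here \1 sits in the *pattern* itself, which is what the original
    # question asks for and what requires engine support.
    return re.search(r"^/([^/]+)/\1(/|$)", uri) is not None

print(rewrite("/foo/bar"))
print(repeats_first_segment("/foo/foo"))
print(repeats_first_segment("/foo/bar"))
```

Only the second function depends on the engine understanding backreferences while matching; the first works with any engine that records group positions.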