Posted to modperl@perl.apache.org by Michael Hanisch <ha...@informatik.uni-muenchen.de> on 2000/07/31 00:12:23 UTC

Re: [OT] & in URLs -- SUMMARY (was: Re: Templating System)

Hi *, 

thank you all for your comments on the plain-ampersand-in-URL-problem. 
I received many useful hints regarding this subject, so I thought I might
as well sum them up for those who have missed this thread. 

In short:
  If you have URLs in your HTML with more than one query-string arg
  (args separated by "&"), you _have_ to escape the ampersand character:
  I.e., write something like 
   ' <A HREF="/cgi-bin/baz.cgi?blah=blurb&amp;foo=bar"> ',
  instead of simply using
   ' <A HREF="/cgi-bin/baz.cgi?blah=blurb&foo=bar"> '. 
  (if &foo; were a defined entity, you'd be in trouble!).
  The "&amp;" entity will be expanded as in your regular HTML code.
 
  To improve readability, you might as well get rid of the ampersand
  completely and use a semicolon as the delimiter, provided the script
  that parses the query string accepts it.
  (Thanks to Alan J. Flavell <fl...@mail.cern.ch> for pointing this out).

  Example:  ' <A HREF="/cgi-bin/baz.cgi?blah=blurb;foo=bar"> '
  For more complete information, be sure to read
  http://ppewww.ph.gla.ac.uk/~flavell/www/formgetbyurl.html 
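
  The semicolon variant can be sketched as follows. This is an illustration
  only: build_query and parse_query are hypothetical helpers, written in
  Python, and the sketch assumes that both the page generator and the
  receiving script are under your control so that they agree on ";" as the
  delimiter.

```python
from urllib.parse import quote_plus, unquote_plus


def build_query(params):
    """Join name/value pairs with ";" instead of "&".

    quote_plus() percent-encodes any ";", "&" or "=" inside a name or
    value, so the delimiter stays unambiguous.
    """
    return ";".join(f"{quote_plus(k)}={quote_plus(v)}" for k, v in params)


def parse_query(query):
    """Split a ";"-delimited query string back into name/value pairs."""
    pairs = []
    for field in query.split(";"):
        if not field:
            continue  # skip empty fields such as a trailing ";"
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


if __name__ == "__main__":
    query = build_query([("blah", "blurb"), ("foo", "bar")])
    print("/cgi-bin/baz.cgi?" + query)
    print(parse_query(query))
```

  Since the resulting query string contains no "&" at all, it can be
  written into an HREF attribute without the "&amp;" escaping.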


For those of you with time to waste: 
;-)

  I never thought that there could be problems with the "&" character in
  HREF attributes, esp. since that attribute is declared as CDATA.
  However, I was wrong regarding the meaning of CDATA in attribute
  declarations:
  As Mark Doyle <do...@aps.org> has explained, entities in attribute values
  will be expanded nevertheless. In addition, entities can be terminated
  not only by semicolons (as in "&sect;"), but by other
  "non-word-characters" as well (as in "&sect=").
  That last point in particular might cause you trouble. It gets even more
  complicated, because some browsers appear to be broken: they regard
  "&section=" as an entity as well, even though the "&sect" is followed
  by a "word-character"!
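
  The effect can be reproduced with a small experiment. The sketch below
  uses Python's html.unescape as a stand-in for a lenient parser; this is an
  assumption made for illustration, since html.unescape follows the HTML5
  rules for text content and is neither an SGML parser nor any particular
  browser. It does, however, show both cases described above: the "&sect="
  case and the prefix match in "&section=".

```python
import html

# A URL with an unescaped ampersand followed by an argument named "sect".
plain = "/cgi-bin/baz.cgi?blah=blurb&sect=3"
# The same URL with the ampersand written as "&amp;".
escaped = "/cgi-bin/baz.cgi?blah=blurb&amp;sect=3"
# An argument name that merely starts with an entity name.
prefixed = "/cgi-bin/baz.cgi?blah=blurb&section=3"

decoded_plain = html.unescape(plain)        # "&sect" is replaced by the section sign
decoded_escaped = html.unescape(escaped)    # "&amp;" becomes "&", "sect" is left alone
decoded_prefixed = html.unescape(prefixed)  # the "&sect" prefix is replaced as well

if __name__ == "__main__":
    for label, value in [("plain", decoded_plain),
                         ("escaped", decoded_escaped),
                         ("prefixed", decoded_prefixed)]:
        print(f"{label:9s} {value!r}")
```

  Only the escaped variant survives decoding with its argument name intact.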
  

  Unfortunately, I was fortunate enough not to run into any trouble yet
  (sorry for the pun ;-),
  so this quirk of SGML/HTML syntax may have gone unnoticed (by me, at
  least) for a long time.

  I guess that I really learned something from this discussion that will
  help me make my code more robust. I am still wondering why nobody ever
  cared to explain this to me, though - maybe most people out there simply
  don't know...?
  
  Thanks to Jacob Davies & Randal L. Schwartz for bringing this up in the
  first place, and to Alan J. Flavell, Mark Doyle and various others for
  clarifying.

  Looks like this list helped to make my life less miserable once again.
  Thanks for your time.

  Regards,
	Michael.

________________________________________________________
Michael   |  email: hanisch@informatik.uni-muenchen.de 
Hanisch   |                                            
________________________________________________________