Posted to c-dev@xerces.apache.org by Da...@lotus.com on 2000/12/01 19:53:31 UTC

Re: XalanJ1 + 0D0A

Xerces-C++ definitely normalizes CR/LF or CR to LF.  It looks like Xerces-J
is also doing this.  See src.org.apache.readers.CharReader, for instance.

Dave



                                                                                                                   
From:    Scott_Boag@lotus.com
To:      xerces-j-dev@xml.apache.org
cc:      xalan-dev@xml.apache.org, (bcc: David N Bertoni/CAM/Lotus)
Sent:    12/01/2000 12:07 PM
Subject: Re: XalanJ1 + 0D0A
Please respond to xalan-dev
                                                                                                                   
                                                                                                                   




Can anyone on the Xerces team comment on the normalization of CRLF etc. in
an XML parser?  I don't believe I have seen this normalization take place
with Xerces.

-scott





From:    Joseph_Kesselman@lotus.com
To:      xalan-dev@xml.apache.org
cc:      (bcc: Scott Boag/CAM/Lotus)
Sent:    11/30/00 09:22 AM
Subject: Re: XalanJ1 + 0D0A
Please respond to xalan-dev








> If I pass a parameter with a 0D0A sequence (CRLF), the XSLT processor
> outputs two sequences: 0D0A 0D0A.

During normal XML parsing, CRLF gets turned into LF, as would CR alone.
Theoretically, this means that the CR character (0D) should never appear in
XML content. So one can certainly argue that converting your line-breaks
before passing them into XSLT is the Right Thing to do.

Xalan could check for this and apply the conversion for you, though there'd
be performance penalties due to having to scan and possibly recopy the
string. Of course that overhead should only happen once per stylesheet
invocation, so this may be acceptable. On the other hand, this is a case of
sweeping a problem under the carpet; the next XSLT processor you use may
not be so forgiving, so there's something to be said for teaching folks to
fix it before Xalan sees it.
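
As a rough illustration of "fixing it before Xalan sees it", here is a minimal
sketch of the caller-side conversion. It assumes the parameter value is held
in a std::string; normalizeLineBreaks is a hypothetical helper name, not part
of Xalan or Xerces. It applies the same rule the parser uses: a CR LF pair or
a lone CR becomes a single LF.

```cpp
#include <string>

// Hypothetical helper (not a Xalan or Xerces API): applies the XML
// end-of-line rule to a string before it is handed to the XSLT processor.
// CR LF pairs and lone CRs both become a single LF.
std::string normalizeLineBreaks(const std::string& in)
{
    std::string out;
    out.reserve(in.size());

    for (std::string::size_type i = 0; i < in.size(); ++i)
    {
        if (in[i] == '\r')
        {
            out += '\n';
            // Skip the LF of a CR LF pair so it is not emitted twice.
            if (i + 1 < in.size() && in[i + 1] == '\n')
                ++i;
        }
        else
        {
            out += in[i];
        }
    }
    return out;
}
```

The scan is a single pass and copies the string once, which is the overhead
mentioned above; a caller that knows its data is already LF-only can skip it.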

Question: Can one argue that line-break conversion of parameters should
_not_ occur when XSLT is outputting in Text mode? I hope not, but I'm not
110% sure... I consider this another good reason for Xalan to put the
conversion burden on the caller; by doing so we avoid having to demand an
official answer for this boundary case. <grin level=".5"/>










Re: XalanJ1 + 0D0A

Posted by Dean Roddey <dr...@charmedquark.com>.
That's for output. When outputting, it doesn't matter. Normalization happens
when the file is parsed back in. At that point, all CR/LF sequences are
converted to single LFs (unless appropriately escaped via a char reference.)
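
To illustrate the "escaped via a char reference" case: a serializer that wants
a CR to survive a later parse can write it as the character reference &#xD;
instead of the raw byte. The sketch below assumes the text is in a std::string;
escapeCarriageReturns is a hypothetical name and is not how DOMPrint or Xerces
actually writes output.

```cpp
#include <string>

// Hypothetical helper (not a Xerces API): replaces each raw CR with the
// character reference "&#xD;" so that end-of-line normalization on the
// next parse does not fold it into an LF. LF characters are left alone.
std::string escapeCarriageReturns(const std::string& text)
{
    std::string out;
    out.reserve(text.size());

    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        if (text[i] == '\r')
            out += "&#xD;";
        else
            out += text[i];
    }
    return out;
}
```

This only covers CR; a real serializer would also have to escape markup
characters such as & and <.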

--------------------------
Dean Roddey
The CIDLib C++ Frameworks
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"It takes two buttocks to make friction"
    - African Proverb


----- Original Message -----
From: "Miroslaw Dobrzanski-Neumann" <mn...@mosaic-ag.com>
To: <xe...@xml.apache.org>
Sent: Wednesday, December 06, 2000 12:03 AM
Subject: Re: XalanJ1 + 0D0A


> On Fri, Dec 01, 2000 at 01:53:31PM -0500, David_N_Bertoni@lotus.com wrote:
> >
> > Xerces-C++ definitely normalizes CR/LF or CR to LF.  It looks like Xerces-J
> > is also doing this.  See src.org.apache.readers.CharReader, for instance.
>
> I have found this in DOMPrint sample
>
> static const XMLCh  gXMLDecl4[] =
> {
>         chDoubleQuote, chQuestion, chCloseAngle
>     ,   chCR, chLF, chNull
> };
>
> At first glance, one could think chCR is really required for the XMLDecl.
> --
> Miroslaw Dobrzanski-Neumann
>
> MOSAIC SOFTWARE AG
> Base Development and Research
> Tel +49-2225-882-291
> Fax +49-2225-882-201
> E-mail: mne@mosaic-ag.com


Re: XalanJ1 + 0D0A

Posted by Miroslaw Dobrzanski-Neumann <mn...@mosaic-ag.com>.
On Fri, Dec 01, 2000 at 01:53:31PM -0500, David_N_Bertoni@lotus.com wrote:
> 
> Xerces-C++ definitely normalizes CR/LF or CR to LF.  It looks like Xerces-J
> is also doing this.  See src.org.apache.readers.CharReader, for instance.

I have found this in DOMPrint sample

static const XMLCh  gXMLDecl4[] =
{
        chDoubleQuote, chQuestion, chCloseAngle
    ,   chCR, chLF, chNull
};

At first glance, one could think chCR is really required for the XMLDecl.
-- 
Miroslaw Dobrzanski-Neumann

MOSAIC SOFTWARE AG
Base Development and Research
Tel +49-2225-882-291
Fax +49-2225-882-201
E-mail: mne@mosaic-ag.com