Posted to c-dev@xerces.apache.org by "Christian Will (JIRA)" <xe...@xml.apache.org> on 2005/03/07 09:37:54 UTC
[jira] Updated: (XERCESC-1369) Performance: improve end-of-line handling
[ http://issues.apache.org/jira/browse/XERCESC-1369?page=history ]
Christian Will updated XERCESC-1369:
------------------------------------
Attachment: XMLReader.cpp.patch
XMLReader.hpp.patch
> Performance: improve end-of-line handling
> -----------------------------------------
>
> Key: XERCESC-1369
> URL: http://issues.apache.org/jira/browse/XERCESC-1369
> Project: Xerces-C++
> Type: Improvement
> Components: Miscellaneous
> Versions: 2.6.0
> Reporter: Christian Will
> Priority: Minor
> Attachments: XMLReader.cpp.patch, XMLReader.hpp.patch
>
> We can improve the end-of-line handling in two steps.
> 1. We move the function XMLReader::handleEOL(...) from the header into the cpp file, because the function is too big for inlining.
> 2. We create bit masks to avoid most of the handleEOL calls.
> Here are two examples:
> a)
> We already know from context that the current character is whitespace. The bit mask selects all cases where we have to call handleEOL.
> if (isWhitespace(curCh))
> {
>     //
>     // 'curCh' is a whitespace (x20|x9|xD|xA), so we can only have
>     // end-of-line combinations with a leading chCR(xD) or chLF(xA)
>     //
>     // 100000 x20
>     // 001001 x9
>     // 001010 chLF
>     // 001101 chCR
>     // -----------
>     // 000110 == (chCR|chLF) & ~(0x9|0x20)
>     //
>     // if the result of the bitwise-& operation is
>     // non-zero : 'curCh' must be xA or xD
>     // zero     : 'curCh' must be x20 or x9
>     //
>     if ( ( curCh & (chCR|chLF) & ~(0x9|0x20) ) == 0 )
>     {
>         fCurCol++;
>     } else
>     {
>         handleEOL(curCh, false);
>     }
> }
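
The mask in example a) can be checked in isolation. The following is only a minimal sketch, not code from the attached patches: the XMLCh alias, the constant values, and the names kEolMaskWs and wsNeedsEol are stand-ins assumed here for illustration (Xerces-C++ defines its own XMLCh and character constants).

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-ins for the Xerces-C++ type and constants (illustration only).
using XMLCh = std::uint16_t;
constexpr XMLCh chHTab  = 0x09;
constexpr XMLCh chLF    = 0x0A;
constexpr XMLCh chCR    = 0x0D;
constexpr XMLCh chSpace = 0x20;

// Mask from example a): (chCR|chLF) & ~(0x9|0x20) == 000110 binary.
constexpr XMLCh kEolMaskWs = (chCR | chLF) & ~(chHTab | chSpace);
static_assert(kEolMaskWs == 0x06, "mask should be 000110 binary");

// Hypothetical helper: for a character already known to be whitespace,
// returns true if it has to go through the handleEOL path.
constexpr bool wsNeedsEol(XMLCh ws)
{
    return (ws & kEolMaskWs) != 0;
}

// Compile-time check of all four whitespace characters.
static_assert(!wsNeedsEol(chSpace), "x20 only advances the column");
static_assert(!wsNeedsEol(chHTab),  "x9 only advances the column");
static_assert(wsNeedsEol(chLF),     "xA needs handleEOL");
static_assert(wsNeedsEol(chCR),     "xD needs handleEOL");
```

Since the input is restricted to the four whitespace characters, this mask is exact: only xA and xD leave the fast path.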
> b)
> We have no content information, so we have to test for all four possible start characters.
> The bit mask selects only 128 cases (down from 63483 before) where we have to call handleEOL.
> //
> // we can have end-of-line combinations with a leading
> // chCR(xD), chLF(xA), chNEL(x85), or chLineSeparator(x2028)
> //
> // 0000000000001101 chCR
> // 0000000000001010 chLF
> // 0000000010000101 chNEL
> // 0010000000101000 chLineSeparator
> // -----------------------
> // 1101111101010000 == ~(chCR|chLF|chNEL|chLineSeparator)
> //
> // if the result of the bitwise-& operation is
> // non-zero : 'curCh' cannot be chCR, chLF, chNEL or chLineSeparator
> // zero     : 'curCh' can be chCR, chLF, chNEL or chLineSeparator
> //
> if ( chGotten & (XMLCh) ~(chCR|chLF|chNEL|chLineSeparator) )
> {
>     fCurCol++;
> } else
> {
>     handleEOL(chGotten, false);
> }
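
The figure of 128 remaining cases in example b) can be reproduced by brute force over all 16-bit values. This is a rough sketch under the assumption that XMLCh is a 16-bit unsigned type. The names kNotEolMask, mayBeEol and countSlowPathValues are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-ins for the Xerces-C++ type and constants (illustration only).
using XMLCh = std::uint16_t;
constexpr XMLCh chLF            = 0x000A;
constexpr XMLCh chCR            = 0x000D;
constexpr XMLCh chNEL           = 0x0085;
constexpr XMLCh chLineSeparator = 0x2028;

// Mask from example b): every bit that is NOT set in any of the four characters.
constexpr XMLCh kNotEolMask =
    static_cast<XMLCh>(~(chCR | chLF | chNEL | chLineSeparator));
static_assert(kNotEolMask == 0xDF50, "mask should be 1101111101010000 binary");

// Hypothetical helper: true if 'ch' has no bit outside the four EOL
// characters, i.e. it would still be passed to handleEOL.
constexpr bool mayBeEol(XMLCh ch)
{
    return (ch & kNotEolMask) == 0;
}

// Counts how many of the 65536 possible 16-bit values take the slow path.
inline int countSlowPathValues()
{
    int count = 0;
    for (std::uint32_t v = 0; v <= 0xFFFF; ++v)
    {
        if (mayBeEol(static_cast<XMLCh>(v)))
            ++count;
    }
    return count;
}
```

The combined pattern chCR|chLF|chNEL|chLineSeparator has seven bits set, so 2^7 = 128 values pass the filter. The filter is conservative rather than exact: x20 and x9, for example, also have no bits outside the pattern, so the slow path still has to tell real end-of-line characters from these false positives.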
> I created bit masks for all five cases where we call handleEOL.
> I attached patch files against the latest CVS version.
> Regards,
> Christian Will