Posted to c-dev@xerces.apache.org by "cargilld (JIRA)" <xe...@xml.apache.org> on 2005/03/22 22:15:20 UTC

[jira] Resolved: (XERCESC-1369) Performance: improve end-of-line handling

     [ http://issues.apache.org/jira/browse/XERCESC-1369?page=history ]
     
cargilld resolved XERCESC-1369:
-------------------------------

    Resolution: Fixed

Hi Christian,
I checked in your patch.  Can you please verify?  Thanks.

David

> Performance: improve end-of-line handling
> -----------------------------------------
>
>          Key: XERCESC-1369
>          URL: http://issues.apache.org/jira/browse/XERCESC-1369
>      Project: Xerces-C++
>         Type: Improvement
>   Components: Miscellaneous
>     Versions: 2.6.0
>     Reporter: Christian Will
>     Priority: Minor
>  Attachments: XMLReader.cpp.patch, XMLReader.hpp.patch
>
> We can improve the end-of-line handling in two steps.
> 1. We move the function XMLReader::handleEOL(...) from the header into the cpp file, because the function is too big for inlining.
> 2. We create bit masks to avoid most of the handleEOL calls.
> Here are two examples:
> a)
> We use the context information that the current character is whitespace. The bit mask selects all the cases where we have to call handleEOL.
> if (isWhitespace(curCh))
> {
>     //
>     //  'curCh' is whitespace (x20|x9|xD|xA), so we can only have
>     //  end-of-line combinations with a leading chCR(xD) or chLF(xA)
>     //
>     //  100000 x20
>     //  001001 x9
>     //  001010 chLF
>     //  001101 chCR
>     //  -----------
>     //  000110 == (chCR|chLF) & ~(0x9|0x20)
>     //
>     //  if the result of the bitwise-& operation is
>     //  nonzero : 'curCh' must be xA  or xD
>     //  zero    : 'curCh' must be x20 or x9
>     //
>     if ( ( curCh & (chCR|chLF) & ~(0x9|0x20) ) == 0 )
>     {
>         fCurCol++;
>     } else
>     {
>         handleEOL(curCh, false);
>     }
> }
> b)
> We have no context information, so we have to test for all four possible start characters.
> The bit mask selects only 128 cases (previously 63483) where we have to call handleEOL.
>     //
>     // we can have end-of-line combinations with a leading
>     // chCR(xD), chLF(xA), chNEL(x85), or chLineSeparator(x2028)
>     //
>     // 0000000000001101 chCR
>     // 0000000000001010 chLF
>     // 0000000010000101 chNEL
>     // 0010000000101000 chLineSeparator
>     // -----------------------
>     // 1101111101010000 == ~(chCR|chLF|chNEL|chLineSeparator)
>     //
>     // if the result of the bitwise-& operation is
>     // nonzero : 'chGotten' cannot be chCR, chLF, chNEL or chLineSeparator
>     // zero    : 'chGotten' can be chCR, chLF, chNEL or chLineSeparator
>     //
>     if ( chGotten & (XMLCh) ~(chCR|chLF|chNEL|chLineSeparator) )
>     {
>         fCurCol++;
>     } else
>     {
>         handleEOL(chGotten, false);
>     }
> I created bit masks for all five cases where we call handleEOL.
> I attached patch files against the latest CVS version.
> Regards,
> Christian Will
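
The two masks quoted above can be checked exhaustively with a small stand-alone sketch. This is not the code from the attached patches: the helper names needsEOLAfterWhitespace, mayNeedEOL and countCandidates are hypothetical, and XMLCh is assumed here to be a 16-bit unsigned type. The character constants use the values given in the quoted comments.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: XMLCh is a 16-bit unsigned code unit.
typedef std::uint16_t XMLCh;

// Values as listed in the quoted comments.
const XMLCh chLF            = 0x000A;
const XMLCh chCR            = 0x000D;
const XMLCh chNEL           = 0x0085;
const XMLCh chLineSeparator = 0x2028;

// Example (a): the caller already knows that ch is one of x20, x9, xD, xA.
// The mask evaluates to 0x06, which is nonzero only for xA and xD.
// (Hypothetical helper name.)
inline bool needsEOLAfterWhitespace(XMLCh ch)
{
    return (ch & (chCR | chLF) & ~(0x9 | 0x20)) != 0;
}

// Example (b): nothing is known about ch. Any bit set outside the union
// of the four line-break characters rules ch out; otherwise it is only a
// candidate and the slow path still has to decide.
// (Hypothetical helper name.)
inline bool mayNeedEOL(XMLCh ch)
{
    return (ch & (XMLCh) ~(chCR | chLF | chNEL | chLineSeparator)) == 0;
}

// Counts how many of the 65536 possible 16-bit values pass the
// example (b) filter and would therefore reach the slow path.
inline int countCandidates()
{
    int n = 0;
    for (unsigned c = 0; c <= 0xFFFF; ++c)
    {
        if (mayNeedEOL((XMLCh) c))
            ++n;
    }
    return n;
}
```

Under these assumptions countCandidates() returns 128, matching the figure in the description: the union of the four characters is 0x20AF, which has seven bits set, so 2^7 values survive the filter. The filter is deliberately coarse. Some ordinary characters such as x20 also pass it and still take the slow path, which is why the quoted comment says the character "can be" a line break rather than "must be" one.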
