Posted to c-dev@xerces.apache.org by "Alberto Massari (JIRA)" <xe...@xml.apache.org> on 2013/08/26 12:37:51 UTC

[jira] [Resolved] (XERCESC-2016) XML 1.0 5th edition support

     [ https://issues.apache.org/jira/browse/XERCESC-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alberto Massari resolved XERCESC-2016.
--------------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 3.1.2)
                   3.2.0
         Assignee: Alberto Massari

I have gone through the 5th edition spec and implemented a good part of it; the XML test suite still reports 16 failures, mainly due to URI rules.
                
> XML 1.0 5th edition support
> ---------------------------
>
>                 Key: XERCESC-2016
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2016
>             Project: Xerces-C++
>          Issue Type: Improvement
>          Components: Non-Validating Parser
>    Affects Versions: 3.1.1
>         Environment: All
>            Reporter: Rob Cameron
>            Assignee: Alberto Massari
>             Fix For: 3.2.0
>
>         Attachments: diff5e
>
>
> Xerces-C currently applies XML 1.0 4th edition rules to name characters
> in XML 1.0 documents. XML 1.0 5th edition permits a broader class
> of name characters, based on those permitted in XML 1.1.
> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
> 5th edition.
> Although our main work is with icXML, we've looked at making this change
> in the original Xerces-C code base so that icXML's support for XML 1.0 5e
> stays compatible with Xerces-C.
> I'm not entirely sure that I've handled everything, but the following change
> works in our tests. The change plan is below and an svn diff file is
> attached.
> ----------------------------------
> (1)  internal/CharTypeTables.hpp
> Rename gFirstNameChars1_1 to be gFirstNameChars
> Rename gNameChars1_1 to be gNameChars
> (2) util/XMLChar.cpp
> (2a)
>    Update initCharFlagTable1_1() to use the gFirstNameChars, gNameChars
>    Update initCharFlagTable() to use the set-ups from initCharFlagTable1_1()
>      to define gNameCharMask, gNCNameCharMask, and gFirstNameCharMask.
>     //
>     //  Name characters are special. A name is made up of a number of
>     //  different tables and some special case characters.
>     //
>     initOneTable(gNameChars, gNameCharMask);
>     //
>     //  NCName characters are the name characters minus the colon.
>     //
>     initOneTable(gNameChars, gNCNameCharMask);
>     gTmpCharTable[chColon] &= ~gNCNameCharMask;
>     //
>     //  Then do the first name char
>     //
>     initOneTable(gFirstNameChars, gFirstNameCharMask);
> (2b) #define NEED_TO_GEN_TABLE, then compile and do a sample run of a
> Xerces app to generate table.out.
> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition in XMLChar.cpp
> with the one from table.out.
> (3) XMLChar.hpp
>     Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
> XMLChar1_0::isNameChar, XMLChar1_0::isNCNameChar
>     to each check for and allow characters in the #x10000-#xEFFFF range
>     else {
>         if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>            if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>                return true;
>     }
> (4)  Modify XMLReader::getName and XMLReader::getNCName
>        to allow surrogate pairs in Names and NCNames
>        (i.e., use the version 1.1 logic for both 1.0 and 1.1).
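
The surrogate-pair test in step (3) can be illustrated in isolation. The following is a minimal sketch, not the Xerces-C code: the namespace and function names are hypothetical, and only the numeric ranges are taken from the change plan above. It shows why a high surrogate in 0xD800-0xDB7F followed by a low surrogate in 0xDC00-0xDFFF corresponds exactly to the #x10000-#xEFFFF range.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; these are not part of Xerces-C.
namespace sketch {

// True if (high, low) is a UTF-16 surrogate pair whose code point lies in
// #x10000-#xEFFFF, the supplementary range mentioned in step (3).
inline bool isSupplementaryNameChar(std::uint16_t high, std::uint16_t low)
{
    return high >= 0xD800 && high <= 0xDB7F
        && low  >= 0xDC00 && low  <= 0xDFFF;
}

// Decodes a surrogate pair into its code point using the standard UTF-16
// formula. The caller is assumed to have validated the pair first.
inline std::uint32_t decodeSurrogatePair(std::uint16_t high, std::uint16_t low)
{
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}

} // namespace sketch
```

The lowest accepted pair (0xD800, 0xDC00) decodes to 0x10000 and the highest (0xDB7F, 0xDFFF) decodes to 0xEFFFF, so checking the two 16-bit ranges is equivalent to a range check on the decoded code point.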
