You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@xerces.apache.org by ra...@locus.apache.org on 2000/02/23 02:45:56 UTC
cvs commit: xml-xerces/c/doc migration.xml
rahulj 00/02/22 17:45:55
Modified: c/doc migration.xml
Log:
Summarized the changes needed to migrate from XML4C 2.x to the
new source code base.
Revision Changes Path
1.2 +448 -265 xml-xerces/c/doc/migration.xml
Index: migration.xml
===================================================================
RCS file: /home/cvs/xml-xerces/c/doc/migration.xml,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- migration.xml 2000/01/22 00:37:20 1.1
+++ migration.xml 2000/02/23 01:45:55 1.2
@@ -1,71 +1,206 @@
<?xml version="1.0" standalone="no"?>
<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd">
-<s1 title="Migrating from V2 to V3">
- <p>This document is a discussion of the technical differences between
- the 2.x code base and the new code base. </p>
-
- <s2 title="General Improvements">
-
- <p>The new version is improved in many ways. Some general improvements
- are significantly better conformance to the XML spec, cleaner
- internal architecture, many bug fixes, and better speed.</p>
-
- <s3 title="Compliance">
- <p>Except for a couple of the very obscure (mostly related to
- the 'standalone' mode), this version should be quite compliant.
- We have more than a thousand tests, some collected from various
- public sources and some IBM generated, which are used to do
- regression testing. The C++ parser is now passing all but a
- handful of them.</p>
- </s3>
- <s3 title="Bug Fixes">
- <p>This version has many bug fixes with regard to Version 2.
- Some of these were reported by users and some were brought up by
- way of the conformance testing.</p>
- </s3>
- <s3 title="Speed">
- <p>Much work was done to speed up this version. Some of the
- new features, such as namespaces, and conformance checks ended
- up eating up some of these gains, but overall the new version
- is significantly faster than previous versions, even while doing
- more.</p>
- </s3>
- </s2>
-
- <s2 title="The Samples">
- <p>The sample programs no longer use any of the unsupported util/xxx
- classes. They only existed to
- allow us to write portable samples. But, since we feel that the wide
- character APIs are supported on a lot of platforms these days, it was
- decided to go ahead and just write the samples in terms of these.
- If your system does not support these APIs, you will not be able
- to build and run the samples. On some platforms, these APIs might
- perhaps be optional packages or require runtime updates or some such action.</p>
- <p>More samples have been added as well. These show off some of the
- new functionality introduced in the V3 code base. And the existing
- ones have been tighted up a bit as well.</p>
- <p>The new samples are:</p>
- <ol>
- <li>PParse - Demonstrates 'progressive parse', (see below)</li>
- <li>StdInParse - Demonstrates use of the standard in input source</li>
- <li>EnumVal - Shows how to enumerate the markup decls in a DTD Validator</li>
- </ol>
- </s2>
- <s2 title="Parser Classes">
- <p>In the 2.x code base, there were the following parser classes
- (in the src/parsers/ source directory): NonValidatingSAXParser,
- ValidatingSAXParser, NonValidatingDOMParser, ValidatingDOMParser.
- The non-validating ones were the base classes and the validating
- ones just derived from them and turned on the validation.
- This was deemed a little bit overblown, considering the tiny
- amount of code required to turn on validation and the fact
- that it makes people use a pointer to the parser in most cases
- (if they needed to support either validating or non-validating versions.)</p>
- <p>The new code base just has SAXParer and DOMParser classes. These
- are capable of handling both validating and non-validating modes,
- according to the state of a flag that you can set on them. For
- instance, here is a code snippet that shows this in action.</p>
+<s1 title="Migrating from XML4C 2.x">
+ <p>This document is a discussion of the technical differences between
+ XML4C 2.x code base and the new &XercesCName; &XercesCVersion; code base.</p>
+
+ <p>Topics discussed are:</p>
+ <ul>
+ <li><link anchor="GenImprovements">General Improvements</link></li>
+ <ul>
+ <li><link anchor="Compliance">Compliance</link></li>
+ <li><link anchor="BugFixes">Bug Fixes</link></li>
+ <li><link anchor="Speed">Speed</link></li>
+ </ul>
+ <li><link anchor="Summary">Summary of changes required to migrate from XML4C 2.x to &XercesCName; &XercesCVersion;</link></li>
+ <li><link anchor="Samples">The Samples</link></li>
+ <li><link anchor="ParserClasses">Parser Classes</link></li>
+ <li><link anchor="DOMLevel2">DOM Level 2 support</link></li>
+ <li><link anchor="Progressive">Progressive Parsing</link></li>
+ <li><link anchor="Namespace">Namespace support</link></li>
+ <li><link anchor="MovedToSrcFramework">Moved Classes to src/framework</link></li>
+ <li><link anchor="LoadableMessageText">Loadable Message Text</link></li>
+ <li><link anchor="PluggableValidators">Pluggable Validators</link></li>
+ <li><link anchor="PluggableTranscoders">Pluggable Transcoders</link></li>
+ <li><link anchor="UtilReorg">Util directory Reorganization</link></li>
+ <ul>
+ <li><link anchor="UtilPlatform">util - The platform independent utility stuff</link></li>
+ </ul>
+ </ul>
+
+
+
+ <anchor name="GenImprovements"/>
+ <s2 title="General Improvements">
+
+ <p>The new version is improved in many ways. Some general improvements
+ are: significantly better conformance to the XML spec, cleaner
+ internal architecture, many bug fixes, and faster speed.</p>
+
+ <anchor name="Compliance"/>
+ <s3 title="Compliance">
+ <p>Except for a couple of the very obscure (mostly related to
+ the 'standalone' mode), this version should be quite compliant.
+ We have more than a thousand tests, some collected from various
+ public sources and some IBM generated, which are used to do
+ regression testing. The C++ parser is now passing all but a
+ handful of them.</p>
+ </s3>
+
+ <anchor name="BugFixes"/>
+ <s3 title="Bug Fixes">
+ <p>This version has many bug fixes with regard to XML4C version 2.x.
+ Some of these were reported by users and some were brought up by
+ way of the conformance testing.</p>
+ </s3>
+
+ <anchor name="Speed"/>
+ <s3 title="Speed">
+ <p>Much work was done to speed up this version. Some of the
+ new features, such as namespaces, and conformance checks ended
+ up eating up some of these gains, but overall the new version
+ is significantly faster than previous versions, even while doing
+ more.</p>
+ </s3>
+ </s2>
+
+
+ <anchor name="Summary"/>
+ <s2 title="Summary of changes required to migrate from XML4C 2.x to &XercesCName; &XercesCVersion;">
+
+ <p>As mentioned, there are some major architectural changes
+ between the 2.3.x and &XercesCName; &XercesCVersion; releases
+ of the parser, and as a result the code has undergone
+ significant restructuring. The list below mentions the public
+ api's which existed in 2.3.x and no longer exist in
+ &XercesCName; &XercesCVersion;. It also mentions the
+ &XercesCName; &XercesCVersion; api which will give you the
+ same functionality. Note: This list is not exhaustive. The
+ API docs (and ultimately the header files) supplement this
+ information.</p>
+
+ <ul>
+
+ <li><code>parsers/[Non]Validating[DOM/SAX]parser.hpp</code><br/>
+ These files/classes have all been consolidated in the new
+ version to just two files/classes:
+ <code>[DOM/SAX]Parser.hpp</code>. Validation is now a
+ property which may be set before invoking the
+ <code>parse</code>. Now, the
+ <code>setDoValidation()</code> method controls the
+ validation processing.</li>
+
+ <li>The <code>framework/XMLDocumentTypeHandler.hpp</code>
+ been replaced with
+ <code>validators/DTD/DocTypeHandler.hpp</code>.</li>
+
+ <li>The following methods now have different set of parameters:</li>
+ <ul>
+ <li><code>[Non]Validating[DOM/SAX]Parser::docComment</code></li>
+ <li><code>[Non]Validating[DOM/SAX]Parser::doctypePI</code></li>
+ <li><code>[Non]ValidatingSAXParser::elementDecl</code></li>
+ <li><code>[Non]ValidatingSAXParser::endAttList</code></li>
+ <li><code>[Non]ValidatingSAXParser::entityDecl</code></li>
+ <li><code>[Non]ValidatingSAXParser::notationDecl</code></li>
+ <li><code>[Non]ValidatingSAXParser::startAttList</code></li>
+ <li><code>[Non]ValidatingSAXParser::TextDecl</code></li>
+ <li><code>[Non]ValidatingSAXParser::docComment</code></li>
+ <li><code>[Non]ValidatingSAXParser::docPI</code></li>
+ <li><code>[Non]Validating[DOM/SAX]Parser::endElement</code></li>
+ <li><code>[Non]Validating[DOM/SAX]Parser::startElement</code></li>
+ <li><code>[Non]Validating[DOM/SAX]Parser::XMLDecl</code></li>
+ <li><code>[Non]Validating[DOM/SAX]Parser::error</code></li>
+ </ul>
+
+ <li>The following methods/data members changed visibility
+ from <code>protected</code> in 2.3.x to
+ <code>private</code> (with public setters and getters, as
+ appropriate).</li>
+
+ <ul>
+ <li><code>[Non]ValidatingDOMParser::fDocument</code></li>
+ <li><code>[Non]ValidatingDOMParser::fCurrentParent</code></li>
+ <li><code>[Non]ValidatingDOMParser::fCurrentNode</code></li>
+ <li><code>[Non]ValidatingDOMParser::fNodeStack</code></li>
+ </ul>
+
+
+ <li>The following files have moved, possibly requiring
+ changes in the <code>#include</code> statements.</li>
+
+ <ul>
+ <li><code>MemBufInputSource.hpp</code></li>
+ <li><code>StdInInputSource.hpp</code></li>
+ <li><code>URLInputSource.hpp</code></li>
+ </ul>
+
+
+ <li>All the DTD validator code was moved from
+ <code>internal</code> to separate
+ <code>validators/DTD</code> directory.</li>
+
+ <li>The error code definitions which were earlier in
+ <code>internal/ErrorCodes.hpp</code> are now splitup into
+ the following files:</li>
+
+ <ul>
+ <li><code>framework/XMLErrorCodes.hpp </code> - Core XML errors</li>
+ <li><code>framework/XMLValidityCodes.hpp</code> - DTD validity errors</li>
+ <li><code>util/XMLExceptMsgs.hpp </code> - C++ specific exception codes.</li>
+ </ul>
+ </ul>
+
+ </s2>
+
+
+
+ <anchor name="Samples"/>
+ <s2 title="The Samples">
+
+ <p>The sample programs no longer use any of the unsupported
+ util/xxx classes. They only existed to allow us to write
+ portable samples. But, since we feel that the wide character
+ APIs are supported on a lot of platforms these days, it was
+ decided to go ahead and just write the samples in terms of
+ these. If your system does not support these APIs, you will
+ not be able to build and run the samples. On some platforms,
+ these APIs might perhaps be optional packages or require
+ runtime updates or some such action.</p>
+
+ <p>More samples have been added as well. These highlight some
+ of the new functionality introduced in the new code base. And
+ the existing ones have been cleaned up as well.</p>
+
+ <p>The new samples are:</p>
+ <ol>
+ <li>PParse - Demonstrates 'progressive parse', (see below)</li>
+ <li>StdInParse - Demonstrates use of the standard in input source</li>
+ <li>EnumVal - Shows how to enumerate the markup decls in a DTD Validator</li>
+ </ol>
+ </s2>
+
+
+ <anchor name="ParserClasses"/>
+ <s2 title="Parser Classes">
+
+ <p>In the XML4C 2.x code base, there were the following parser
+ classes (in the src/parsers/ source directory):
+ NonValidatingSAXParser, ValidatingSAXParser,
+ NonValidatingDOMParser, ValidatingDOMParser. The
+ non-validating ones were the base classes and the validating
+ ones just derived from them and turned on the validation.
+ This was deemed a little bit overblown, considering the tiny
+ amount of code required to turn on validation and the fact
+ that it makes people use a pointer to the parser in most cases
+ (if they needed to support either validating or non-validating
+ versions.)</p>
+
+ <p>The new code base just has SAXParer and DOMParser
+ classes. These are capable of handling both validating and
+ non-validating modes, according to the state of a flag that
+ you can set on them. For instance, here is a code snippet that
+ shows this in action.</p>
<source>void ParseThis(const XMLCh* const fileToParse,
const bool validate)
@@ -88,39 +223,48 @@
...
};</source>
- <p>We feel that this is a simpler architecture, and that it makes things
- easier for you. In the above example, for instance, the parser will be
- cleaned up for you automatically upon exit since you don't have to
- allocate it anymore.</p>
-
- </s2>
-
- <s2 title="DOM Level 2 support">
- <p>Experimental early support for some parts of the DOM level
- 2 specification have been added. These address some of the
- shortcomings in our DOM implementation,
- such as a simple, standard mechanism for tree traversal.</p>
-
- </s2>
- <s2 title="Progressive Parsing">
- <p>The new parser classes support, in addition to the <ref>parse()</ref>
- method, two new parsing methods, <ref>parseFirst()</ref> and <ref>parseNext()</ref>.
- These are designed to support 'progressive parsing', so that
- you don't have to depend upon throwing an exception to
- terminate the parsing operation. Calling parseFirst() will
- cause the DTD (or in the future, Schema) to be parsed (both
- internal and external subsets) and any pre-content, i.e. everything
- up to but not including the root element. Subsequent calls to
- parseNext() will cause one more pieces of markup to be parsed,
- and spit out from the core scanning code to the parser (and
- hence either on to you if using SAX or into the DOM tree if
- using DOM.) You can quit the parse any time by just not
- calling parseNext() anymore and breaking out of the loop. When
- you call parseNext() and the end of the root element is the
- next piece of markup, the parser will continue on to the
- end of the file and return false, to let you know that the
- parse is done. So a typical progressive parse loop will look
- like this:</p>
+ <p>We feel that this is a simpler architecture, and that it makes things
+ easier for you. In the above example, for instance, the parser will be
+ cleaned up for you automatically upon exit since you don't have to
+ allocate it anymore.</p>
+
+ </s2>
+
+
+ <anchor name="DOMLevel2"/>
+ <s2 title="DOM Level 2 support">
+
+ <p>Experimental early support for some parts of the DOM level
+ 2 specification have been added. These address some of the
+ shortcomings in our DOM implementation,
+ such as a simple, standard mechanism for tree traversal.</p>
+
+ </s2>
+
+
+ <anchor name="Progressive"/>
+ <s2 title="Progressive Parsing">
+
+ <p>The new parser classes support, in addition to the
+ <ref>parse()</ref> method, two new parsing methods,
+ <ref>parseFirst()</ref> and <ref>parseNext()</ref>. These are
+ designed to support 'progressive parsing', so that you don't
+ have to depend upon throwing an exception to terminate the
+ parsing operation. Calling parseFirst() will cause the DTD (or
+ in the future, Schema) to be parsed (both internal and
+ external subsets) and any pre-content, i.e. everything up to
+ but not including the root element. Subsequent calls to
+ parseNext() will cause one more pieces of markup to be parsed,
+ and spit out from the core scanning code to the parser (and
+ hence either on to you if using SAX or into the DOM tree if
+ using DOM.) You can quit the parse any time by just not
+ calling parseNext() anymore and breaking out of the loop. When
+ you call parseNext() and the end of the root element is the
+ next piece of markup, the parser will continue on to the end
+ of the file and return false, to let you know that the parse
+ is done. So a typical progressive parse loop will look like
+ this:</p>
+
<source>// Create a progressive scan token
XMLPScanToken token;
@@ -138,179 +282,218 @@
while (gotMore && !handler.getDone())
gotMore = parser.parseNext(token);</source>
- <p>In this case, our event handler object (named 'handler' surprisingly
- enough) is watching form some criteria and will return a
- status from its getDone() method. Since the handler sees
- the SAX events coming out of the SAXParser, it can tell
- when it finds what it wants. So we loop until we get no
- more data or our handler indicates that it saw what it wanted to see.</p>
- <note>In the case of the normal parse() call, the end of the parse
- is unambiguous and the parser will flush any open entities on exit
- (closing files, sockets, memory buffers, etc...) that were open
- when the exit completed. In the progressive parse scenario, it cannot
- know when you are done unless you parse to the end. So, to insure
- that all entites are closed, you should typically destroy the parser.
- If you are going to reuse the parser again and again, each reuse will
- implicitly flush any previous content, but any opened entities will
- remain opened until you do one of these things.</note>
-
- <p>Also note that you must create a scan token and pass it back
- in on each call. This insures that things don't get done out of
- sequence. When you call parseFirst() or parse(), any previous scan
- tokens are invalidated and will cause an error if used again. This
- prevents incorrect mixed use of the two different parsing
- schemes or incorrect calls to parseNext().</p>
- </s2>
- <s2 title="Namespace support">
- <p>The C++ parser now supports namespaces. With current
- XML interfaces (SAX/DOM) this doesn't mean very much because
- these APIs are incapable of passing on the namespace information.
- However, if you are using our internal APIs to write your own
- parsers, you can make use of this new information. Since the
- internal event APIs must be able to now support both namespace
- and non-namespace information, they have more parameters. These
- allow namespace information to be passed along.</p>
- <p>Most of the samples now have a new command line parameter to
- turn on namespace support. You turn on namespaces like this:</p>
+ <p>In this case, our event handler object (named 'handler'
+ surprisingly enough) is watching form some criteria and will
+ return a status from its getDone() method. Since the handler
+ sees the SAX events coming out of the SAXParser, it can tell
+ when it finds what it wants. So we loop until we get no more
+ data or our handler indicates that it saw what it wanted to
+ see.</p>
+
+ <note>In the case of the normal parse() call, the end of the
+ parse is unambiguous and the parser will flush any open
+ entities on exit (closing files, sockets, memory buffers,
+ etc...) that were open when the exit completed. In the
+ progressive parse scenario, it cannot know when you are done
+ unless you parse to the end. So, to insure that all entites
+ are closed, you should typically destroy the parser. If you
+ are going to reuse the parser again and again, each reuse will
+ implicitly flush any previous content, but any opened entities
+ will remain opened until you do one of these things.</note>
+
+ <p>Also note that you must create a scan token and pass it
+ back in on each call. This insures that things don't get done
+ out of sequence. When you call parseFirst() or parse(), any
+ previous scan tokens are invalidated and will cause an error
+ if used again. This prevents incorrect mixed use of the two
+ different parsing schemes or incorrect calls to
+ parseNext().</p>
+
+ </s2>
+
+
+ <anchor name="Namespace"/>
+ <s2 title="Namespace support">
+
+ <p>The C++ parser now supports namespaces. With current XML
+ interfaces (SAX/DOM) this doesn't mean very much because these
+ APIs are incapable of passing on the namespace information.
+ However, if you are using our internal APIs to write your own
+ parsers, you can make use of this new information. Since the
+ internal event APIs must be able to now support both namespace
+ and non-namespace information, they have more
+ parameters. These allow namespace information to be passed
+ along.</p>
+
+ <p>Most of the samples now have a new command line parameter
+ to turn on namespace support. You turn on namespaces like
+ this:</p>
<source>SAXParser myParser;
// Tell it whether to do namespacse
myParser.setDoNamespaces(true);</source>
- </s2>
+ </s2>
+
+
+
+ <anchor name="MovedToSrcFramework"/>
+ <s2 title="Moved Classes to src/framework">
- <s2 title="Moved Classes to src/framework">
- <p>Some of the classes previously in the src/internal/
- directory have been moved to their more correct location
- in the src/framework/ directory. These are classes used by
- the outside world and should have been framework classes
- to begin with. Also, to avoid name classes in the absense
- of C++ namespace support, some of these classes have been
- renamed to make them more XML specific and less likely to
- clash. More classes might end up being moved to framework
- as well.</p>
- <p>So you might have to change a few include statements to
- find these classes in their new locations. And you might
- have to rename some of the names of the classes, if you
- used any of the ones whose names were changed.</p>
- </s2>
-
- <s2 title="Loadable Message Text">
- <p>The system now supoprts loadable message text, instead of
- having it hard coded into the program. The current drop
- still just supports English, but it can now support other
- languages. Anyone interested in contributing any translations
- should contact us. This would be an extremely useful service.</p>
-
- <p>In order to support the local message loading services, we
- have created a pretty flexible framework for supporting
- loadable text. Firstly, there is now an XML file, in the
- src/NLS/ directory, which contains all of the error messages.
- There is a simple program, in the Tools/NLSXlat/ directory,
- which can spit out that text in various formats. It currently
- supports a simple 'in memory' format (i.e. an array of strings),
- the Win32 resource format, and the message catalog format.
- The 'in memory' format is intended for very simple installations
- or for use when porting to a new platform (since you can use
- it until you can get your own local message loading support done.)</p>
- <p>In the src/util/ directory, there is now an XMLMsgLoader class.
- This is an abstraction from which any number of message loading
- services can be derived. Your platform driver file can create
- whichever type of message loader it wants to use on that platform.
- We currently have versions for the in memory format, the Win32
- resource format, and the message catalog format. An ICU one is
- present but not implemented yet. Some of the platforms can
- support multiple message loaders, in which case a #define token
- is used to control which one is used. You can set this in your
- build projects to control the message loader type used.</p>
- <p>
- This is also a good place to mention that the Java and C++
- messages are now the <em>same</em>, since both are being taken
- from the same XML message file.</p>
- </s2>
-
- <s2 title="Pluggable Validators">
- <p>In a preliminary move to support Schemas, and to make them
- first class citizens just like DTDs, the system has been
- reworked internally to make validators completely pluggable.
- So now the DTD validator code is under the src/validators/DTD/
- directory, with a future Schema validator probably going into
- the src/validators. The core scanner architecture now works
- completely in terms of the framework/XMLValidator abstract
- interface and knows almost nothing about DTDs or Schemas. For
- now, if you don't pass in a validator to the parsers, they
- will just create a DTDValidator. This means that, theoretically,
- you could write your own validator. But we would not encourage
- this for a while, until the semantics of the XMLValidator
- interface are completely worked out and proven to handle
- DTD and Schema cleanly.</p>
- </s2>
-
-
- <s2 title="Pluggable Transcoders">
- <p>Another abstract framework added in the src/util/ directory
- is to support pluggable transcoding services. The XMLTransService
- class is an abtract API that can be derived from, to support
- any desired transcoding service. XMLTranscoder is the
- abstract API for a particular instance of a transcoder
- for a particular encoding. The platform driver file decides
- what specific type of transcoder to use, which allows each
- platform to use its native transcoding services, or the ICU
- service if desired.</p>
- <p>Implementations are provided for Win32 native services, ICU
- services, and the <ref>iconv</ref> services available on many Unix
- platforms. The Win32 version only provides native code page
- services, so it can only handle XML code in the intrinsic
- encodings (UTF-8, ASCII, UTF-16, UCS-4.) The ICU version provides
- all of the encodings that ICU supports. The <ref>iconv</ref> version will
- support the encodings supported by the local system. You can use
- transcoders we provide or create your own if you feel ours are
- insufficient in some way, or if your platform requires an implementation
- that we do not provide.</p>
- </s2>
-
- <s2 title="Util directory Reorganization">
- <p>The src/util directory was becoming somewhat of a dumping
- ground of platform and compiler stuff. So we reworked that
- directory to better spread things out. The new scheme is:
- </p>
-
- <s3 title="util - The platform independent utility stuff">
- <ul>
- <li>MsgLoaders - Holds the msg loader implementations</li>
- <ol>
- <li>ICU</li>
- <li>InMemory</li>
- <li>MsgCatalog</li>
- <li>Win32</li>
- </ol>
- <li>Compilers - All the compiler specific files</li>
- <li>Transcoders - Holds the transcoder implementations</li>
- <ol>
- <li>Iconv</li>
- <li>ICU</li>
- <li>Win32</li>
- </ol>
- <li>Platforms</li>
- <ol>
- <li>AIX</li>
- <li>HP-UX</li>
- <li>Linux</li>
- <li>Solaris</li>
- <li>....</li>
- <li>Win32</li>
- </ol>
- </ul>
- </s3>
- <p>This organization makes things much easier to understand.
- And it makes it easier to find which files you need and
- which are optional. Note that only per-platform files have
- any hard coded references to specific message loaders or
- transcoders. So if you don't include the ICU implementations
- of these services, you don't need to link in ICU or use
- any ICU headers. The rest of the system works only in
- terms of the abstraction APIs.</p>
+ <p>Some of the classes previously in the src/internal/
+ directory have been moved to their more correct location in
+ the src/framework/ directory. These are classes used by the
+ outside world and should have been framework classes to begin
+ with. Also, to avoid name classes in the absense of C++
+ namespace support, some of these classes have been renamed to
+ make them more XML specific and less likely to clash. More
+ classes might end up being moved to framework as well.</p>
+
+ <p>So you might have to change a few include statements to
+ find these classes in their new locations. And you might have
+ to rename some of the names of the classes, if you used any of
+ the ones whose names were changed.</p>
+
+ </s2>
+
+
+ <anchor name="LoadableMessageText"/>
+ <s2 title="Loadable Message Text">
+
+ <p>The system now supoprts loadable message text, instead of
+ having it hard coded into the program. The current drop still
+ just supports English, but it can now support other
+ languages. Anyone interested in contributing any translations
+ should contact us. This would be an extremely useful
+ service.</p>
+
+ <p>In order to support the local message loading services, we
+ have created a pretty flexible framework for supporting
+ loadable text. Firstly, there is now an XML file, in the
+ src/NLS/ directory, which contains all of the error messages.
+ There is a simple program, in the Tools/NLSXlat/ directory,
+ which can spit out that text in various formats. It currently
+ supports a simple 'in memory' format (i.e. an array of
+ strings), the Win32 resource format, and the message catalog
+ format. The 'in memory' format is intended for very simple
+ installations or for use when porting to a new platform (since
+ you can use it until you can get your own local message
+ loading support done.)</p>
+
+ <p>In the src/util/ directory, there is now an XMLMsgLoader
+ class. This is an abstraction from which any number of
+ message loading services can be derived. Your platform driver
+ file can create whichever type of message loader it wants to
+ use on that platform. We currently have versions for the in
+ memory format, the Win32 resource format, and the message
+ catalog format. An ICU one is present but not implemented
+ yet. Some of the platforms can support multiple message
+ loaders, in which case a #define token is used to control
+ which one is used. You can set this in your build projects to
+ control the message loader type used.</p>
+
+ <p>This is also a good place to mention that the Java and C++
+ messages are now the <em>same</em>, since both are being taken
+ from the same XML message file.</p>
+
+ </s2>
+
+
+ <anchor name="PluggableValidators"/>
+ <s2 title="Pluggable Validators">
+
+ <p>In a preliminary move to support Schemas, and to make them
+ first class citizens just like DTDs, the system has been
+ reworked internally to make validators completely pluggable.
+ So now the DTD validator code is under the src/validators/DTD/
+ directory, with a future Schema validator probably going into
+ the src/validators. The core scanner architecture now works
+ completely in terms of the framework/XMLValidator abstract
+ interface and knows almost nothing about DTDs or Schemas. For
+ now, if you don't pass in a validator to the parsers, they
+ will just create a DTDValidator. This means that,
+ theoretically, you could write your own validator. But we
+ would not encourage this for a while, until the semantics of
+ the XMLValidator interface are completely worked out and
+ proven to handle DTD and Schema cleanly.</p>
+
+ </s2>
+
+
+ <anchor name="PluggableTranscoders"/>
+ <s2 title="Pluggable Transcoders">
+
+ <p>Another abstract framework added in the src/util/ directory
+ is to support pluggable transcoding services. The
+ XMLTransService class is an abtract API that can be derived
+ from, to support any desired transcoding
+ service. XMLTranscoder is the abstract API for a particular
+ instance of a transcoder for a particular encoding. The
+ platform driver file decides what specific type of transcoder
+ to use, which allows each platform to use its native
+ transcoding services, or the ICU service if desired.</p>
+
+ <p>Implementations are provided for Win32 native services, ICU
+ services, and the <ref>iconv</ref> services available on many
+ Unix platforms. The Win32 version only provides native code
+ page services, so it can only handle XML code in the intrinsic
+ encodings (UTF-8, ASCII, UTF-16, UCS-4.) The ICU version
+ provides all of the encodings that ICU supports. The
+ <ref>iconv</ref> version will support the encodings supported
+ by the local system. You can use transcoders we provide or
+ create your own if you feel ours are insufficient in some way,
+ or if your platform requires an implementation that we do not
+ provide.</p>
+
+ </s2>
+
+
+ <anchor name="UtilReorg"/>
+ <s2 title="Util directory Reorganization">
+
+ <p>The src/util directory was becoming somewhat of a dumping
+ ground of platform and compiler stuff. So we reworked that
+ directory to better spread things out. The new scheme is:
+ </p>
+
+ <anchor name="UtilPlatform"/>
+ <s3 title="util - The platform independent utility stuff">
+ <ul>
+ <li>MsgLoaders - Holds the msg loader implementations</li>
+ <ol>
+ <li>ICU</li>
+ <li>InMemory</li>
+ <li>MsgCatalog</li>
+ <li>Win32</li>
+ </ol>
+ <li>Compilers - All the compiler specific files</li>
+ <li>Transcoders - Holds the transcoder implementations</li>
+ <ol>
+ <li>Iconv</li>
+ <li>ICU</li>
+ <li>Win32</li>
+ </ol>
+ <li>Platforms</li>
+ <ol>
+ <li>AIX</li>
+ <li>HP-UX</li>
+ <li>Linux</li>
+ <li>Solaris</li>
+ <li>....</li>
+ <li>Win32</li>
+ </ol>
+ </ul>
+ </s3>
+
+ <p>This organization makes things much easier to understand.
+ And it makes it easier to find which files you need and which
+ are optional. Note that only per-platform files have any hard
+ coded references to specific message loaders or
+ transcoders. So if you don't include the ICU implementations
+ of these services, you don't need to link in ICU or use any
+ ICU headers. The rest of the system works only in terms of the
+ abstraction APIs.</p>
- </s2>
+ </s2>
</s1>