Posted to c-dev@xerces.apache.org by Alberto Massari <am...@datadirect.com> on 2005/11/02 15:49:08 UTC

Re: Reducing the size of Xerces, static build

Hi Victor,
if you are using Visual C++ .NET 2003, the latest Xerces 2.7
distribution has a 'Static build' configuration; if you are using
MinGW, you should be able to get a static library simply by running
ar over the *.o files.

Alberto

At 11.30 31/10/2005 -0600, Victor Broto wrote:
>We are releasing a new application that uses Xerces, but our
>executable is still too big.
>
>We are trying to ship a final version that is as small as possible
>(we'd like a compressed size under 5 MB) and, among other
>dependencies, we have one on xerces-c_2_5_0.dll, which is about 4.5 MB.
>
>Can somebody suggest a simple way to reduce the DLL size?
>
>We are also considering a static build, which would remove our
>dependency on the .dll, although it would increase the exe size.
>However, after browsing the documentation and the lists, I haven't
>been able to find out how to build Xerces statically, so we cannot
>test whether removing the .dll is worthwhile.
>
>Is there any way to get a static build of Xerces?
>
>Thanks,
>
>Victor
>
>Elisha Berns wrote:
>
>>Thanks for clearing this up for me, and thanks also for the other item
>>you responded to a few weeks ago.
>>
>>Elisha
>>
>>
>>
>>
>>>-----Original Message-----
>>>From: Alberto Massari [mailto:amassari@datadirect.com]
>>>Sent: Tuesday, November 01, 2005 12:21 AM
>>>To: c-dev@xerces.apache.org
>>>Subject: RE: Xerces issues handling recursive schema includes
>>>
>>>Hi Elisha,
>>>
>>>At 19.39 31/10/2005 -0800, Elisha Berns wrote:
>>>
>>>
>>>>Neil,
>>>>
>>>>I made a naïve implementation of an EntityResolver that only uses the
>>>>absolute paths of the SystemIds it receives, but this doesn't work for
>>>>the following reasons:
>>>>
>>>>The main schema file includes ~20 other schema files which are located
>>>>in other directories using relative paths and each one of those 20 files
>>>>includes ~20 files (which can be included multiple times) also using
>>>>relative paths.  So if I use XMLPlatformUtils::weavePaths() using the
>>>>base path from the main schema file being parsed with all of those
>>>>relative paths in the other included schema files, the results are
>>>>invalid paths.
>>>>
>>>>The issue is that the EntityResolver needs to know what base path to use
>>>>when it gets a SystemId in order to correctly resolve it to an absolute
>>>>path. And the base path keeps changing as the SAX2XMLReader parses
>>>>through the paths it finds in schemaLocation attributes.
>>>>
>>>>Is there any way to get this information (the correct base path to use
>>>>per relative path) without having to pre-parse all the schema files for
>>>>their schemaLocation attributes?  Surely there must be some simpler way
>>>>to prevent the parser from mistaking two or more relative SystemIds as
>>>>different SystemIds?
>>>>
>>>To overcome this limitation there is an XMLEntityResolver interface
>>>that you should register using SAX2XMLReaderImpl::setXMLEntityResolver
>>>(you may have to cast your SAX2XMLReader to the implementation class).
>>>In your XMLEntityResolver-derived class you should implement
>>>resolveEntity(XMLResourceIdentifier*), resolving the entity using the
>>>getSystemId() and getBaseURI() accessors.
>>>
>>>Hope this helps,
>>>Alberto
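
A minimal sketch of the resolution step described above, assuming plain '/'-separated absolute paths rather than full URIs (no scheme, no drive letters). The helper name resolveAgainstBase is hypothetical and not part of Xerces; inside a real resolveEntity(XMLResourceIdentifier*) override, the two arguments would presumably come from getBaseURI() and getSystemId() after transcoding from XMLCh.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not a Xerces API: resolves a system ID against the
// path of the document that referenced it and collapses "." and ".."
// segments. Assumes '/'-separated absolute paths only.
std::string resolveAgainstBase(const std::string& baseUri,
                               const std::string& systemId) {
    std::string combined;
    if (!systemId.empty() && systemId[0] == '/') {
        // Already absolute: ignore the base, but still normalize below.
        combined = systemId;
    } else {
        // Drop the file name of the base document and keep its directory.
        std::string::size_type slash = baseUri.rfind('/');
        std::string dir =
            (slash == std::string::npos) ? "" : baseUri.substr(0, slash + 1);
        combined = dir + systemId;
    }

    // Split on '/' and resolve "." and ".." segments.
    std::vector<std::string> kept;
    std::istringstream in(combined);
    std::string segment;
    while (std::getline(in, segment, '/')) {
        if (segment.empty() || segment == ".") continue;
        if (segment == "..") {
            if (!kept.empty()) kept.pop_back();
            continue;
        }
        kept.push_back(segment);
    }

    // Rebuild a canonical absolute path.
    std::string result;
    for (const std::string& s : kept) {
        result += "/" + s;
    }
    return result.empty() ? "/" : result;
}
```

With a base of /schemas/types/common.xsd, the system ID ../shared/base.xsd resolves to /schemas/shared/base.xsd. Resolving the same system ID against the main schema at /schemas/main.xsd instead gives /shared/base.xsd, which is the kind of invalid path reported earlier in the thread. The directory names here are made up for illustration.
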
>>>
>>>
>>>
>>>
>>>>Thanks,
>>>>
>>>>Elisha
>>>>
>>>>
>>>>
>>>>>Hi Elisha,
>>>>>
>>>>>Recursive, or circular, includes are supposed to be handled properly by a
>>>>>schema parser.  While I'm not really active anymore on the code base, this
>>>>>question does come up periodically, usually in the context of a set of
>>>>>schemas that get loaded purely via schemaLocation hints, or via a user's
>>>>>EntityResolver which doesn't set system identifiers on the InputSources it
>>>>>returns to the parser.  The usual way to get around this is to register a
>>>>>custom EntityResolver instance, and take good care that system identifier
>>>>>fields are always set to the same value when an InputSource is returned.
>>>>>It's best if this is absolute, but I think a relative URI should work too.
>>>>>The reason this is important is that the parser uses system identifiers
>>>>>internally to figure out whether it's processed a schema document before.
>>>>>
>>>>>Cheers,
>>>>>Neil
>>>>>Neil Graham
>>>>>Manager, C++ Compiler Front-End and Runtime Development
>>>>>IBM Toronto Lab
>>>>>Phone:  905-413-3519, T/L 969-3519
>>>>>E-mail:  neilg@ca.ibm.com
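
To illustrate why consistent system identifiers matter, here is a toy model of "have I processed this schema document before?" bookkeeping keyed on the system ID string. It is only a sketch of the idea described above, not the actual Xerces internals; the class name ProcessedSchemas and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Toy model, not Xerces code: remembers which system IDs have been seen.
class ProcessedSchemas {
public:
    // Returns true the first time a system ID is seen, false afterwards.
    bool markIfNew(const std::string& systemId) {
        return seen_.insert(systemId).second;
    }

    // Number of distinct system IDs recorded so far.
    std::size_t count() const { return seen_.size(); }

private:
    std::set<std::string> seen_;
};
```

Because the comparison is on the string itself, the same file reported once as /schemas/types/common.xsd and once as ../types/common.xsd counts as two different documents in this model, which is why returning one stable identifier per document from the resolver matters.
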
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>From: "Elisha Berns" <e....@computer.org>
>>>>>Date: 10/30/2005 10:30 PM
>>>>>Please respond to: c-dev
>>>>>To: "Xerces C++ Development" <c-...@xerces.apache.org>
>>>>>cc:
>>>>>Subject: Xerces issues handling recursive schema includes
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>Hi,
>>>>>
>>>>>I'm trying to determine both what Xerces does when it encounters
>>>>>recursive schema includes and what to do about it because it causes some
>>>>>problems.
>>>>>
>>>>>It appears that the XercesC schema parser creates multiple XSxxx type
>>>>>objects for the same type if the schema files are included recursively.
>>>>>In addition it would appear that the load time for a schema is much,
>>>>>much slower in the presence of recursive includes.
>>>>>
>>>>>I get one 'proper' globally defined type object but multiple duplicates
>>>>>when the type appears as a contained type (in a complexType definition).
>>>>>The only way I know this now is because I get different pointer values
>>>>>for the XSxxx object when this situation arises, even though they end up
>>>>>pointing to the same type.
>>>>>
>>>>>Does anybody know firsthand whether there is any internal mechanism to
>>>>>prevent this from happening (apparently not), and what can be done, at
>>>>>present, to prevent this duplication from occurring?
>>>>>
>>>>>It has occurred to me that it might be a good idea to create a new type
>>>>>of parser warning specifically regarding the issue of 'recursive
>>>>>includes'.  This of course only makes sense if there is a strong
>>>>>consensus that this is a classic anti-pattern of XML Schema development
>>>>>and should be avoided at all costs.  I can see more or less how to
>>>>>implement it outside of Xerces by constructing a dependency graph of the
>>>>>schema files and testing for back-edges.  So my question about this side
>>>>>of things is whether there is any desire to make this test a built-in
>>>>>part of the parser to make the parser smarter about these things?
>>>>>
>>>>>Thanks for some feedback here.
>>>>>
>>>>>Elisha Berns
>>>>>e.berns@computer.org
>>>>>tel. (310) 556 - 8332
>>>>>fax (310) 556 - 2839
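
The dependency-graph check mentioned above could look roughly like the following sketch, which runs outside of Xerces on an include graph that has already been collected. The IncludeGraph type and both function names are hypothetical; how the graph is built (for example, by scanning schemaLocation attributes) is left out. It uses a standard depth-first search in which reaching a node that is still in progress indicates a back-edge.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical include graph: schema file -> files it includes.
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Unvisited must stay first so that a default-inserted map entry is Unvisited.
enum class Mark { Unvisited, InProgress, Done };

// Depth-first search from `node`; reaching an InProgress node is a back-edge.
bool hasBackEdge(const IncludeGraph& graph, const std::string& node,
                 std::map<std::string, Mark>& marks) {
    marks[node] = Mark::InProgress;
    auto it = graph.find(node);
    if (it != graph.end()) {
        for (const std::string& next : it->second) {
            Mark m = marks[next];  // new entries default to Unvisited
            if (m == Mark::InProgress) return true;  // back-edge found
            if (m == Mark::Unvisited && hasBackEdge(graph, next, marks)) {
                return true;
            }
        }
    }
    marks[node] = Mark::Done;
    return false;
}

// Returns true if some chain of includes leads back to a file on the chain.
bool hasRecursiveInclude(const IncludeGraph& graph) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : graph) {
        if (marks[entry.first] == Mark::Unvisited &&
            hasBackEdge(graph, entry.first, marks)) {
            return true;
        }
    }
    return false;
}
```

A file that is merely included from several places (a "diamond") is not reported, because it is already marked Done when it is reached the second time; only a chain that loops back onto itself counts as recursive.
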
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>---------------------------------------------------------------------
>>>>>To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
>>>>>For additional commands, e-mail: c-dev-help@xerces.apache.org