Posted to users@cxf.apache.org by "mr.andersen" <xm...@bec.dk> on 2008/10/01 08:55:14 UTC
Re: Schema DOM memory problem
Hi Daniel
Any chance that you have found time to look into this problem and make some
changes?
If so, I'd like to try them out, since I'm having the same problem Charles had.
Morten
dkulp wrote:
>
>
> Charles,
>
> One of the primary reasons (right now) for keeping the DOM tree around is
> to work around some severe bugs in XmlSchema. The XmlSchema serializer
> in 1.3.2 loses a bunch of things so the resulting schemas that you get
> would not be correct. I think all the bugs have been fixed in
> XmlSchema and I've been asking for a new release. See:
> http://mail-archives.apache.org/mod_mbox/ws-commons-dev/200802.mbox/<200802071000.14543.dkulp%40apache.org>
> but so far, no luck. I'd appreciate it if you could also start bugging
> them. :-) If we can get a version that can actually round-trip
> schema properly, I'm OK with dropping the DOM.
>
> That all said, I've also thought about creating a "SchemaManager" to go
> along with the current WSDLManager to cache a lot of this. Just
> haven't gotten around to doing it. I'd definitely welcome any patches
> that would help us head that direction. :-)
>
> Dan
>
>
>
>
>
> On Tuesday 12 February 2008, Charles O'Farrell wrote:
>> G'day all,
>>
>> I have been given the task of generating WSDL from my company's large
>> collection of application models, as well as handling the invoking of
>> corresponding services which are already deployed. The number of
>> possible services numbers in the hundreds, with a handful of large
>> (2MB) shared schemas.
>>
>> When trying to run a small Jetty server with more than one of these
>> generated WSDLs I quickly ran out of memory (the default setting - 64M
>> I think). While it wouldn't be hard to bump up the memory allocation,
>> I feared the final scenario of hundreds of WSDLs would be problematic
>> even for large amounts of memory.
>>
>> To cut a long story short this is what I found:
>>
>> 1. For each WSDL, every imported schema is loaded into memory,
>> regardless of whether it is shared among other WSDLs.
>> 2. Every Schema DOM tree is stored in memory after parsing.
>>
>> Given that the Schema is parsed to the more useful XmlSchema object
>> tree, I'm not sure what benefits are gained from keeping it in DOM. I
>> fixed the memory bloat by some minor changes in SchemaUtil, which I
>> will explain briefly here. Note that reflection was unfortunately
>> required in dealing with the XmlSchema library.
>>
>> 1. Used a static map to update the XmlSchemaCollection parameter with
>> any cached Schemas before calling schemaCol.read(schemaElem,
>> systemId); in extractSchema
>>
>> 2. Nulled out cached DOM elements in the following:
>>
>> - extractSchema() -> xmlSchema.setElement() (well actually I
>> stopped it being set)
>> - addSchema() -> schema.setElement() after targetNamespace is
>> retrieved
>> - At the end of getSchemas(), iterate over any new schemas, get each
>> one's NodeNamespaceContext, and call getDeclaredPrefixes() before
>> setting its node field to null.
>>
>> 3. Ignored schemaList from the constructor and instead just relied on
>> an internal set to avoid recursion. (I think this map is only needed
>> on the WSDL2Java?)
>> 4. Fixed WSDLQueryHandler to output full WSDL due to missing schema
>> node (I loaded it from the file system instead of serialising the
>> Definition object)
>>
>> I guess my biggest qualm in all this is that it was extremely
>> difficult to subclass and spring SchemaUtil to make the required
>> changes. In particular I had to reproduce the following invocation
>> class chain to fix the problem.
>>
>> JaxWsServiceFactoryBean -> buildServiceFromWSDL() ->
>> WSDLServiceFactory -> create() -> WSDLServiceBuilder -> getSchemas()
>> -> SchemaUtil
>>
>> Because SchemaUtil isn't a sprung object, nor any of the other
>> classes, and because most of the methods/fields are private I ended up
>> literally copy+pasting each class.
>>
>> Forgive me if this all sounds like criticism, because I am very
>> impressed and happy with CXF. This is just as much a documenting of my
>> findings as anything else.
>>
>> Anyway. I'm not too worried about what happens now but I am curious
>> what you guys think of all this.
>>
>> Cheers,
>>
>> Charles O'Farrell
>
>
>
> --
> J. Daniel Kulp
> Principal Engineer, IONA
> dkulp@apache.org
> http://www.dankulp.com/blog
>
>
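The two changes Charles describes (sharing parsed schemas through a static map, and not holding on to the DOM tree once parsing is done) can be sketched roughly as below. This is only an illustration under stated assumptions: it uses plain JDK classes, the names SchemaCacheSketch, SchemaSummary and load() are hypothetical, and SchemaSummary merely stands in for the XmlSchema object tree. It is not CXF's SchemaUtil and not the patch discussed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical sketch; none of these names exist in CXF or XmlSchema. */
final class SchemaCacheSketch {

    private static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    /** Lightweight stand-in for the parsed schema model (assumption). */
    static final class SchemaSummary {
        final String systemId;
        final String targetNamespace;
        final int elementDeclarations;

        SchemaSummary(String systemId, String targetNamespace, int elementDeclarations) {
            this.systemId = systemId;
            this.targetNamespace = targetNamespace;
            this.elementDeclarations = elementDeclarations;
        }
    }

    // One entry per schema, shared by every WSDL that imports it.
    private static final Map<String, SchemaSummary> CACHE = new ConcurrentHashMap<>();

    // Counts real parses so the effect of the cache can be observed.
    static final AtomicInteger PARSE_COUNT = new AtomicInteger();

    /** Returns the cached summary for systemId, parsing the XML only on first use. */
    static SchemaSummary load(String systemId, String xml) {
        return CACHE.computeIfAbsent(systemId, id -> parse(id, xml));
    }

    private static SchemaSummary parse(String systemId, String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            PARSE_COUNT.incrementAndGet();

            // Copy out what is needed as plain values.
            String tns = dom.getDocumentElement().getAttribute("targetNamespace");
            int count = dom.getElementsByTagNameNS(XSD_NS, "element").getLength();

            // Only the summary is returned; no reference to the DOM is kept,
            // so the tree becomes eligible for garbage collection.
            return new SchemaSummary(systemId, tns, count);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema " + systemId, e);
        }
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs=\"" + XSD_NS + "\" targetNamespace=\"urn:example\">"
                + "<xs:element name=\"order\" type=\"xs:string\"/></xs:schema>";
        SchemaSummary a = load("shared.xsd", xsd); // parsed
        SchemaSummary b = load("shared.xsd", xsd); // served from the cache
        System.out.println(a.targetNamespace + " same instance: " + (a == b));
    }
}
```

Keying the map by system ID is itself an assumption; a real implementation would have to decide how to key shared schemas (for example by namespace plus location) and when entries may be evicted.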
Re: Schema DOM memory problem
Posted by Daniel Kulp <dk...@apache.org>.
On Wednesday 01 October 2008, mr.andersen wrote:
> Hi Daniel
>
> Any chance that you have found time to look into this problem and make
> some changes?
> If so, I'd like to try them out, since I'm having the same problem
> Charles had.
I wish I had better news for you. :-(
Every time I turn around, I log some more issues with XmlSchema.
https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&pid=12310250&sorter/order=DESC&sorter/field=priority&resolution=-1&component=12310702
Particularly:
https://issues.apache.org/jira/browse/WSCOMMONS-363
The start of a patch is attached to the related issue, but the web services
team is pretty much ignoring XmlSchema again. Bugging them would be a
good thing.
Dan
--
J. Daniel Kulp
Principal Engineer, IONA
dkulp@apache.org
http://www.dankulp.com/blog