Posted to general@xerces.apache.org by Dean Roddey <dr...@charmedquark.com> on 2000/05/14 20:59:12 UTC

Re: limitations of Xerces-J 1.0.3

"First:
There is no way to get the values of 'version', 'encoding', 'standalone' in
the prolog. What I've done was to extend SAXParser, and write my own
'startDocument' function which saves the values I need."

This is a limitation of SAX, which we don't define. SAX2 will address most
of these issues, but no standard for SAX2 on C++ has been established yet.
Anything we did here would be non-standard, and we have avoided that
wherever possible. Otherwise, people would use these features, we would
never be able to get rid of them, and their code wouldn't be portable to
other conforming SAX parsers. As you say, you can always derive from
SAXParser and intercept the things you need in the meantime. Eventually, at
least some of the SAX2 extensions will be implemented, and the ones that
expose this information would surely be among them.

"Second:
Consider the attributes of the element "D"... when the parser parses them he
tells me that there is an entity before he has even told me that a new
element has begun... and then when he calls 'startElement(...)' and tells me
the attribute values, they are already resolved and there is no way to know
what they were before they got resolved"

Yes, this is a known issue. In fact, in the latest code, you won't get
those start/end entity events, in order to avoid this confusion. There will
have to be significant changes to the internal event APIs before this will
work. And, if you use SAX, there still won't be anything we can do, because
SAX sends all of the element information at once, after it has all been
parsed and all entities have been seen. So you will not be able to use SAX
(at least SAX1) if you want to deal with entities inside attributes.

"Third:
In the 3 elements of type A, there is always one attribute .. f1='fix1' ..
but I cannot know if this is the default value.. or is it the value the user
set.. which is the same as the default"

You can figure this out, but again the SAX API, which we don't define, gives
you no way to do it. SAX2 might help in this respect as well, but I'm just
guessing since I don't remember if it addressed any of this. In order to
figure this out, you will either have to move to a DOM-based application, or
you will have to use parser-specific (non-portable) mechanisms to do it.
See the EnumVal sample program.
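
For what it's worth, here is a rough sketch of how the defaulted-versus-
specified question can be answered from a SAX2 handler. It is only an
illustration under several assumptions: it is written in Java against the
JDK's bundled JAXP parser (not Xerces 1.0.3 and not the C++ parser), and it
relies on the org.xml.sax.ext.Attributes2 extension interface, which a parser
is not required to implement, hence the instanceof check. The class name
DefaultedAttributes, the report() helper and the sample DTD are made up for
the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: lists every attribute the parser reports and
// says whether it was written in the document or supplied by the DTD.
public class DefaultedAttributes {

    static List<String> report(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Attributes2 is an optional SAX2 extension; without it the
                // origin of a value cannot be determined through SAX.
                if (!(atts instanceof Attributes2)) {
                    out.add(qName + " (origin unknown: no Attributes2)");
                    return;
                }
                Attributes2 a2 = (Attributes2) atts;
                for (int i = 0; i < atts.getLength(); i++) {
                    String origin = a2.isSpecified(i) ? "specified" : "defaulted";
                    out.add(qName + " " + atts.getQName(i) + "="
                            + atts.getValue(i) + " " + origin);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Made-up document: the first A takes f1 from the DTD default, the
        // second one spells out the same value explicitly.
        String xml =
            "<!DOCTYPE root [<!ELEMENT root (A*)><!ELEMENT A EMPTY>"
          + "<!ATTLIST A f1 CDATA 'fix1'>]>"
          + "<root><A/><A f1='fix1'/></root>";
        for (String line : report(xml)) {
            System.out.println(line);
        }
    }
}
```

Both A elements carry the same value for f1, so the value alone cannot tell
them apart; the isSpecified() flag is the only thing that does.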

"Fourth:
When there is an entity such as the &SP; in the example the parser just
parses it and replaces it with a space, but what if I don't want it replaced?"

It will always expand them, but it will send you start/end entity events if
you want to know what's inside them. This is of course separate from the
issues discussed above for attributes. The only entities that the parser
could get away with not expanding (and still be able to reasonably parse
the document) would be top-level general entities, but there is currently no
switch to make it leave them unparsed.
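
To make the start/end entity events concrete, here is a small sketch of how
they bracket the expanded text. Again this is an assumption-laden
illustration, not our internal event API: it uses Java, the JDK's bundled
parser, and the SAX2 LexicalHandler extension (registered through the
standard lexical-handler property). The class name EntityBoundaries and the
bracket notation in the trace are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical example class: rebuilds element text, marking where each
// general entity's replacement text starts and ends.
public class EntityBoundaries {

    // Skip the DTD pseudo-entity ("[dtd]") and parameter entities ("%name"),
    // so only general entities show up in the trace.
    static boolean isGeneral(String name) {
        return !name.startsWith("[") && !name.startsWith("%");
    }

    static String trace(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        // DefaultHandler2 is both a ContentHandler and a LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
            @Override
            public void startEntity(String name) {
                if (isGeneral(name)) out.append('[').append(name).append(':');
            }
            @Override
            public void endEntity(String name) {
                if (isGeneral(name)) out.append(']');
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Lexical events are only delivered if the handler is registered here.
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up document with an SP entity that expands to one space.
        String xml = "<!DOCTYPE doc [<!ENTITY SP ' '>]><doc>a&SP;b</doc>";
        System.out.println(trace(xml));
    }
}
```

The entity is still expanded, but since the handler knows the entity's name
and where its text begins and ends, an application that wants the reference
back can write out &SP; itself instead of the characters it received in
between.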

--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"Give me immortality, or give me death"