Posted to c-dev@xerces.apache.org by David Slack-Smith <da...@cisra.canon.com.au> on 2002/07/12 09:43:40 UTC

Embedded Polymorphic Data

Hi,

I am using Xerces' SAX parser to parse XML data conforming to a
certain schema.  The content of one of the subelements is declared
with an "xsd:any" wildcard, so it is possible to embed data conforming
to other schemas within these files.  I would like to be able to run a
'subparser' to parse this other data before returning control (and
some data representing the parsed data) to the 'superparser'.  For
example:
    <!-- Document starts here -->
    <Xroot xmlns=".../namespace/X">
        <Xchild>

            <!-- Superparser stops parsing temporarily -->
            <!-- Subparser starts parsing -->

            <Yroot xmlns=".../namespace/Y">
                <Ychild/>
            </Yroot>

            <!-- Where I want the subparser to stop parsing -->
            <!-- Where I want the superparser to resume parsing -->

        </Xchild>
    </Xroot>
    <!-- Document ends here -->

Is there a well-known idiom for doing this with the SAX parser?  I can
see how to give future events to the subparser by calling the
SAX2XMLReader class's setContentHandler method.  I can see how the
subparser could know when to return control by counting the start and
end element events.  I think I can even see how to return control to
the superparser.  But I'm not sure I'm looking at the problem in the
right way so as to arrive at the best solution.

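I was thinking that the superparser should look something like this:
    class SuperparserX: public ContentHandler
    {
    public:
        void startElement(const XMLCh* const uri,
                          const XMLCh* const localname,
                          const XMLCh* const qname,
                          const Attributes&  attrs);
        void endElement(const XMLCh* const uri,
                        const XMLCh* const localname,
                        const XMLCh* const qname);

        X *
        parse(const InputSource& inputSource);
    private:
        SAX2XMLReader  *readerSax;
        ContentHandler *subparser;
        X              *x;
        // plus elementStack, a stack of pointers to the objects being built
    };
and implement the ContentHandler::startElement method like this: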
    if (XMLString::compareString(uri, schema_string_X) == 0)
    {
        Determine what is being parsed from localname, initialize an
            object of that type with the attribute values in attrs.
        if (elementStack is non-empty)
        {
            Give a pointer to the new object to its parent (i.e.
                the element on the top of the stack) so it
                can be added to its data structures;
            Push a pointer to the new object onto elementStack;
        }
        else
        {
            Push a pointer to the new object onto elementStack;
            Set x = a pointer to the new object;
        }
    }
    else
    {
        if (XMLString::compareString(uri, schema_string_Y) == 0)
        {
            //
            // Forward the startElement method to a suitable subparser
            // and tell the SAX Reader to send future events there
            // too.
            //
            // Assign to the member; a local declaration would shadow it.
            subparser = new YParser(this, readerSax);
            readerSax->setContentHandler(subparser);
            subparser->startElement(uri, ...);
        }
        else
        {
            ...
        }
    }
and its endElement method would do this:
    Pop the elementStack;
and the parse method would just do this:
    readerSax->setContentHandler(this);
    readerSax->parse(inputSource);
    return x;

The subparser's startElement method would be almost identical to the
superparser's.  Its endElement method would have to check the element
stack for emptiness after popping it, and tell the superparser when it
had finished by calling a method, at which time the superparser would
make itself the active ContentHandler again, give the subtree
generated by the subparser to its parent element (our old friend the
stack top, again), and free the subparser.
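
A minimal sketch of that handoff, under some loud assumptions: Handler
and Reader below are simplified stand-ins for ContentHandler and
SAX2XMLReader (only the calls used here, with std::string instead of
XMLCh), SubparseListener is a hypothetical callback interface, and "Y"
is a placeholder for the real namespace URI.  The subparser here only
records element names and counts depth instead of keeping a stack:

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-in for ContentHandler: only the two callbacks used here.
struct Handler {
    virtual ~Handler() {}
    virtual void startElement(const std::string& uri, const std::string& localname) = 0;
    virtual void endElement(const std::string& uri, const std::string& localname) = 0;
};

// Stand-in for SAX2XMLReader: it just remembers the active handler.
struct Reader {
    Handler* active = nullptr;
    void setContentHandler(Handler* h) { active = h; }
};

// Hypothetical interface through which a subparser reports completion.
struct SubparseListener {
    virtual ~SubparseListener() {}
    virtual void subparseFinished(const std::vector<std::string>& names) = 0;
};

class SubparserY : public Handler {
public:
    explicit SubparserY(SubparseListener& owner) : owner_(owner), depth_(0) {}
    void startElement(const std::string&, const std::string& localname) override {
        ++depth_;
        names_.push_back(localname);
    }
    void endElement(const std::string&, const std::string&) override {
        // Depth back at zero means the embedded root element just closed.
        if (--depth_ == 0) owner_.subparseFinished(names_);
    }
private:
    SubparseListener& owner_;
    int depth_;
    std::vector<std::string> names_;
};

class SuperparserX : public Handler, public SubparseListener {
public:
    explicit SuperparserX(Reader& r) : reader_(r) { reader_.setContentHandler(this); }
    void startElement(const std::string& uri, const std::string& localname) override {
        if (uri == "Y") {
            sub_.reset(new SubparserY(*this));
            reader_.setContentHandler(sub_.get());  // future events go to the subparser
            sub_->startElement(uri, localname);     // forward the event that triggered the switch
        } else {
            seen.push_back(localname);
        }
    }
    void endElement(const std::string&, const std::string&) override {}
    void subparseFinished(const std::vector<std::string>& names) override {
        embedded = names;                 // take a copy of the subparser's result
        reader_.setContentHandler(this);  // resume receiving events
        // The subparser is still on the call stack here, so it is released
        // later (next handoff or destruction) rather than deleted now.
    }
    std::vector<std::string> seen, embedded;
private:
    Reader& reader_;
    std::unique_ptr<SubparserY> sub_;
};
```

The subparser depends only on the small listener interface, not on the
superparser class itself, which softens the coupling somewhat.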

I'd prefer to avoid having the subparser be aware of its superparser
if at all possible, but do not really see how this could be done short
of having the superparser feed the SAX events to the subparser itself.
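
For comparison, the forwarding alternative might look roughly like
this.  Again Handler is a simplified stand-in for ContentHandler, "Y"
is a placeholder for the real namespace URI, and the class bodies are
illustrative only.  The superparser stays the reader's only handler,
counts depth inside the embedded subtree, and passes events along:

```cpp
#include <string>
#include <vector>

// Stand-in for ContentHandler: only the two callbacks used here.
struct Handler {
    virtual ~Handler() {}
    virtual void startElement(const std::string& uri, const std::string& localname) = 0;
    virtual void endElement(const std::string& uri, const std::string& localname) = 0;
};

// The subparser is a plain handler with no reference to whoever drives it.
class SubparserY : public Handler {
public:
    void startElement(const std::string&, const std::string& localname) override {
        names.push_back(localname);
    }
    void endElement(const std::string&, const std::string&) override {}
    std::vector<std::string> names;
};

class SuperparserX : public Handler {
public:
    SuperparserX() : depthInSub_(0) {}

    void startElement(const std::string& uri, const std::string& localname) override {
        // Once inside the embedded subtree, forward everything,
        // whatever namespace it is in.
        if (depthInSub_ > 0 || uri == "Y") {
            ++depthInSub_;
            sub_.startElement(uri, localname);
            return;
        }
        seen.push_back(localname);
    }

    void endElement(const std::string& uri, const std::string& localname) override {
        if (depthInSub_ > 0) {
            sub_.endElement(uri, localname);
            // Depth zero: the embedded root just closed, collect the result.
            if (--depthInSub_ == 0) embedded = sub_.names;
            return;
        }
        // Normal X handling (popping the element stack) would go here.
    }

    std::vector<std::string> seen, embedded;

private:
    SubparserY sub_;
    int depthInSub_;
};
```

The cost is one extra virtual call per event and a depth counter in the
superparser; the gain is that the subparser needs no back-pointer and
the reader's content handler never changes.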

Perhaps I'm missing a better, simpler solution to the problem.  Feel
free to send suggestions, comments, or criticism (constructive or
otherwise :^)~).

Thanks,
Dave

--
_|     _   |\     David SLACK-SMITH                              |_
_|    / \__| \    Software Engineer                              |_
_|   /        \   Canon Information Systems Research Australia   |_
_|   \   __  ./   1 Thomas Holt Drive, North Ryde, NSW 2113      |_
_|    \_/  \_/    Phone: +61 2 9805 2813                         |_
_|          v     Email: david.slack-smith@cisra.canon.com.au    |_
