Posted to c-users@xerces.apache.org by Florent Philippe <ph...@yahoo.fr> on 2006/09/08 22:04:40 UTC
How to parse an XML string in memory? and get a DOM tree out of it
Hi,
Is it possible to take XML that was created from a DOM and now sits in a memory buffer,
and parse it so that I can navigate it with functions like getchild, getroot or something similar?
I found out about using a handler to retrieve data while parsing, but I thought it would be easier to deal with the solution above.
Thanks
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by David Bertoni <db...@apache.org>.
Andrew Patterson wrote:
>> Sure you can, you just can't use DOMBuilderImpl::parseWithContext().
>> Just use the regular parse() member function:
>>
>> DOMDocument* DOMBuilderImpl::parse(const DOMInputSource& source)
>
> Ah, okay, got it working. I tried parse() first and couldn't get it
> working right. That's when I moved on to parseWithContext() (it seemed
> more appropriate for what I was doing). I've figured out what I was
> doing wrong with parse() -- all is good! Many, many thanks!
>
> One last question, purely out of curiosity. I've been dumping unused
> nodes into a document fragment and then writing it out to a string for
> storage. What I've been trying to do is pull it back out later on (the
> XML parsers may be terminated in the interim which is why I have to save
> it as a string temporarily instead of just leaving the DOM fragments
> intact).
>
> If the document fragment had one node in it, all was fine. Got the first
> child of the parsed string and there it was. But if there was more than
> one node, they seem to evaporate when parsed (maybe I'm just traversing
> the resultant DOM wrong, though). I.e., when I ask the first child for
> its sibling, I get zero.
I doubt they evaporated. Are you sure you installed a DOMErrorHandler
instance in the DOMBuilder instance? If you did, then you would have seen
an error, since what you describe is not a well-formed XML document,
although it is a well-formed external general parsed entity:
http://www.w3.org/TR/REC-xml/#wf-entities
>
> I solved this by adding a 'container' element to the fragment first when
> saving, and dumping extra nodes into that -- and, of course, moving down
> an additional level when parsing it back out. Still, it seems like a
> strangely unnecessary step. Is this an unavoidable result of not having
> a root node in my fragment? Or am I just not traversing the parsed out
> document fragment correctly?
You are using the canonical method to turn an external general parsed
entity into a well-formed document -- wrapping it in a root element:
http://www.w3.org/TR/xslt#section-XML-Output-Method
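The wrapping step described above can be sketched in a few lines of standard C++. This is only an illustration: the helper name wrapInRoot and the element name "container" are made up for the example, and it assumes the saved string holds nothing but serialized sibling nodes (no XML declaration or DOCTYPE of its own).

```cpp
#include <string>

// Hypothetical helper (not part of Xerces-C++): wraps serialized sibling
// nodes in a single root element so that the result has exactly one
// top-level element and can be parsed as a well-formed document.
// The default element name "container" is an arbitrary choice.
std::string wrapInRoot(const std::string& fragment,
                       const std::string& rootName = "container")
{
    return "<" + rootName + ">" + fragment + "</" + rootName + ">";
}
```

For instance, wrapInRoot("<a/><b/>") gives "<container><a/><b/></container>". After parsing that string, the original nodes are the children of the document element, which matches the "additional level" mentioned in the question.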
Dave
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by Andrew Patterson <an...@avenza.com>.
> Sure you can, you just can't use DOMBuilderImpl::parseWithContext().
> Just use the regular parse() member function:
>
> DOMDocument* DOMBuilderImpl::parse(const DOMInputSource& source)
Ah, okay, got it working. I tried parse() first and couldn't get it
working right. That's when I moved on to parseWithContext() (it seemed
more appropriate for what I was doing). I've figured out what I was
doing wrong with parse() -- all is good! Many, many thanks!
One last question, purely out of curiosity. I've been dumping unused
nodes into a document fragment and then writing it out to a string for
storage. What I've been trying to do is pull it back out later on (the
XML parsers may be terminated in the interim which is why I have to save
it as a string temporarily instead of just leaving the DOM fragments
intact).
If the document fragment had one node in it, all was fine. Got the first
child of the parsed string and there it was. But if there was more than
one node, they seem to evaporate when parsed (maybe I'm just traversing
the resultant DOM wrong, though). I.e., when I ask the first child for
its sibling, I get zero.
I solved this by adding a 'container' element to the fragment first when
saving, and dumping extra nodes into that -- and, of course, moving down
an additional level when parsing it back out. Still, it seems like a
strangely unnecessary step. Is this an unavoidable result of not having
a root node in my fragment? Or am I just not traversing the parsed out
document fragment correctly?
..............................
Andrew Patterson
Software Engineer
Avenza Systems Inc.
email: andrew@avenza.com
phone: 416.487.5116
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by David Bertoni <db...@apache.org>.
Andrew Patterson wrote:
>> You have the source code available, so please consider using it. ;-)
>
> Doh! Sorry, I just downloaded the source too to get the sample -- should
> have looked in there :P So how then can I turn a string into a DOM tree?
> Is it simply not possible to parse a string into a DOM tree (yet?) or is
> there some other way I should be doing this?
>
Sure you can, you just can't use DOMBuilderImpl::parseWithContext(). Just
use the regular parse() member function:
DOMDocument* DOMBuilderImpl::parse(const DOMInputSource& source)
Dave
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by Andrew Patterson <an...@avenza.com>.
> You have the source code available, so please consider using it. ;-)
Doh! Sorry, I just downloaded the source too to get the sample -- should
have looked in there :P So how then can I turn a string into a DOM tree?
Is it simply not possible to parse a string into a DOM tree (yet?) or is
there some other way I should be doing this?
..............................
Andrew Patterson
Software Engineer
Avenza Systems Inc.
email: andrew@avenza.com
phone: 416.487.5116
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by David Bertoni <db...@apache.org>.
Andrew Patterson wrote:
>> I don't know what the problem is with the DOMBuilder, but it's clear
>> why the exception message is not printed properly. If you check the
>> return type from DOMException::getMessage(), you'll see it's "const
>> XMLCh*" which means it's a UTF-16 string. Unfortunately, you've told
>> printf that the parameter is "const char*" so you won't get anything
>> interesting as a result.
>
> Ah, okay -- thanks! Took about 15 seconds to add and the result is now
> 'The implementation does not support the requested type of object or
> operation'. Any idea which object or operation it's talking about?
> Perhaps it's not happy about the DOMBuilder::ACTION_APPEND_AS_CHILDREN
> action I've requested?
You have the source code available, so please consider using it. ;-)
Here's the relevant code from DOMBuilderImpl.cpp:
void DOMBuilderImpl::parseWithContext(const DOMInputSource&,
                                      DOMNode* const,
                                      const short)
{
    throw DOMException(DOMException::NOT_SUPPORTED_ERR, 0,
                       getMemoryManager());
}
Dave
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by Andrew Patterson <an...@avenza.com>.
> I don't know what the problem is with the DOMBuilder, but it's clear why
> the exception message is not printed properly. If you check the return
> type from DOMException::getMessage(), you'll see it's "const XMLCh*"
> which means it's a UTF-16 string. Unfortunately, you've told printf
> that the parameter is "const char*" so you won't get anything
> interesting as a result.
Ah, okay -- thanks! Took about 15 seconds to add and the result is now
'The implementation does not support the requested type of object or
operation'. Any idea which object or operation it's talking about?
Perhaps it's not happy about the DOMBuilder::ACTION_APPEND_AS_CHILDREN
action I've requested?
..............................
Andrew Patterson
Software Engineer
Avenza Systems Inc.
email: andrew@avenza.com
phone: 416.487.5116
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by David Bertoni <db...@apache.org>.
Andrew Patterson wrote:
>> I am not sure I understood what you asked, but I think you need to use
>> MemBufInputSource; see the MemParse sample for an example of its usage.
>
> I'm working on something similar -- I posted here a month or so back
> about it -- and I can't get it to work either. I looked at MemParse.cpp
> but it's using a SAX parser and I need DOM -- so the sample has some
> usefulness, but is obviously different.
>
> Here's what I'm trying:
>
> --------------------
> const XMLByte foo[256] = "<foo>Testing</foo>";
> int size = strlen((char*)foo);
>
> MemBufInputSource* stringSource = new MemBufInputSource(foo, size,
>                                                         "ignored", false);
> assert(stringSource);
>
> try {
>     Wrapper4InputSource source(stringSource);
>     m_parser->parseWithContext(source, &node,
>                                DOMBuilder::ACTION_APPEND_AS_CHILDREN);
> } catch (DOMException& e) {
>     printf("EXCEPTION: '%s'\n", e.getMessage());
> }
> --------------------
>
> node is a DOMNode& that's been passed in as the place to attach the
> resultant DOM fragment, and m_parser is a DOMBuilder*. I'm obviously doing
> *something* wrong, but the exception I get is extremely unhelpful. The
> output I get is "EXCEPTION: 'T'" -- not the most informative message ^_^
>
I don't know what the problem is with the DOMBuilder, but it's clear why
the exception message is not printed properly. If you check the return type
from DOMException::getMessage(), you'll see it's "const XMLCh*", which means
it's a UTF-16 string. Unfortunately, the %s in your printf format says the
parameter is "const char*", so you won't get anything interesting as a result.
On a little-endian machine the second byte of the UTF-16 'T' is zero, which
printf treats as the end of the string; hence the lone 'T'.
Look in the documentation for XMLString::transcode() to see how you can
transcode the UTF-16 string to the local code page.
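To make the 'T' symptom concrete, here is a small standard-C++ sketch that does not use Xerces at all. The typedef Utf16Unit stands in for XMLCh (assumed here to be a 16-bit code unit), and both helper names are hypothetical. narrowAscii only handles 7-bit ASCII and is there to show the idea of converting before printing; in real code XMLString::transcode() is the call to look at.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for Xerces' XMLCh in this sketch: one 16-bit UTF-16 code unit.
typedef char16_t Utf16Unit;

// Hypothetical helper: how many characters a "%s" conversion would print if
// it were handed the UTF-16 buffer as though it were a char string.
// It counts bytes up to the first zero byte, exactly as printf would.
std::size_t visibleAsCharString(const Utf16Unit* text)
{
    return std::strlen(reinterpret_cast<const char*>(text));
}

// Hypothetical helper: narrows a null-terminated UTF-16 string to a
// std::string, replacing every code unit outside 7-bit ASCII with '?'.
// Illustration only; it is not a substitute for a real transcoder.
std::string narrowAscii(const Utf16Unit* text)
{
    std::string out;
    for (; *text != 0; ++text)
        out += (*text < 0x80) ? static_cast<char>(*text) : '?';
    return out;
}
```

Given a buffer such as u"The implementation does not support ...", visibleAsCharString() reports a single character on a little-endian machine (none on a big-endian one), while narrowAscii() returns the whole message in a form that is safe to pass to "%s".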
Dave
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by Andrew Patterson <an...@avenza.com>.
> I am not sure I understood what you asked, but I think you need to use
> MemBufInputSource; see the MemParse sample for an example of its usage.
I'm working on something similar -- I posted here a month or so back
about it -- and I can't get it to work either. I looked at MemParse.cpp
but it's using a SAX parser and I need DOM -- so the sample has some
usefulness, but is obviously different.
Here's what I'm trying:
--------------------
const XMLByte foo[256] = "<foo>Testing</foo>";
int size = strlen((char*)foo);

MemBufInputSource* stringSource = new MemBufInputSource(foo, size,
                                                        "ignored", false);
assert(stringSource);

try {
    Wrapper4InputSource source(stringSource);
    m_parser->parseWithContext(source, &node,
                               DOMBuilder::ACTION_APPEND_AS_CHILDREN);
} catch (DOMException& e) {
    printf("EXCEPTION: '%s'\n", e.getMessage());
}
--------------------
node is a DOMNode& that's been passed in as the place to attach the
resultant DOM fragment, and m_parser is a DOMBuilder*. I'm obviously doing
*something* wrong, but the exception I get is extremely unhelpful. The
output I get is "EXCEPTION: 'T'" -- not the most informative message ^_^
Any suggestions on what I can do to alleviate this? I think I'm on the
right track, but clearly something isn't right.
Any help appreciated!
..............................
Andrew Patterson
Software Engineer
Avenza Systems Inc.
email: andrew@avenza.com
phone: 416.487.5116
Re: How to parse an XML string in memory? and get a DOM tree out of it
Posted by Alberto Massari <am...@datadirect.com>.
Hi,
I am not sure I understood what you asked, but I think you need to
use MemBufInputSource; see the MemParse sample for an example of its usage.
Alberto
At 20.04 08/09/2006 +0000, Florent Philippe wrote:
>Hi,
>
>Is it possible to parse the output of a DOM created xml that is in a
>memory buffer
>to be able to parse it with functions like getchild, getroot or something?
>
>I found out about using handler to retrieve data on parsing but i
>thought it would be easier to deal with the solution above
>
>thx