Posted to c-dev@xerces.apache.org by Song Li <li...@iastate.edu> on 2004/06/03 17:02:40 UTC

Newbie's question -- extracting information from XML file

Hi,

I have an XML file whose format looks like this:
===================================
  <relation entry1="43" entry2="48" type="ECrel">
         <subtype name="compound" value="88"/>
     </relation>
===================================

I want to extract the "value" attribute of the "subtype" node using DOM.
I've tried two methods, but neither returns a correct result:

method 1:  get a DOMNode (let's call it nd) that points to "relation", then
call "nd2 = nd->getFirstChild();". The problem is that nd2 has
the type "TEXT NODE", so I can't use "getAttributes", and
getTextContent() doesn't work either.

method2:
subList = doc->getElementsByTagName(XMLString::transcode("subtype"));

This time subList->item(0) has the type "ELEMENT NODE", but
surprisingly "subList->item(0)->getAttributes()->getLength()" returns 1,
and that only attribute is "name", so I still cannot extract
"value".

So what's wrong with these methods, and what is a good way to get
"value" from this file?

Thanks a lot! Please reply, if this message is not clear enough I can 
post my code in more detail......

Song


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: Newbie's question -- extracting information from XML file

Posted by Nick Bastin <nb...@opnet.com>.
On Jun 3, 2004, at 11:02 AM, Song Li wrote:

> Hi,
>
> I have an XML file with the format looks like:
> ===================================
>  <relation entry1="43" entry2="48" type="ECrel">
>         <subtype name="compound" value="88"/>
>     </relation>
> ===================================
>
> I want to extract the "value" under the node "subtype" by using DOM,  
> I've tried two methods but neither return a correct result:
>
> method 1:  get a DOMnode(let's call it nd)  points to "relation", then 
> call "nd2 = nd->getFirstChild();". The problem is  that this nd2 has 
> the type "TEXT NODE", so I can't use "getAttributes", and 
> getTextContent() doesn't work as well.

You'll want to move on to the next node.  The text node that you're
looking at is the whitespace (the line break and indentation) between
the <relation> start tag and the <subtype> tag.  Don't blindly assume
that a node's children are what you think they are: check the node type
and the tag name to make sure that each child is indeed what you're
looking for.  Or, you can validate the document and have the parser
remove the extra whitespace text nodes.

--
Nick




RE: Newbie's question -- extracting information from XML file

Posted by Erik Rydgren <er...@mandarinen.se>.
Your XML file actually looks like this in a DOM tree.

Relation - ELEMENTNODE
  Whitespace - TEXTNODE
  Subtype - ELEMENTNODE
  Whitespace - TEXTNODE

The line breaks and indentation in your document are stored inside the
DOM tree as well, as text nodes. You can filter them out using a
DOMTreeWalker.
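
To make the tree above and the earlier advice (step past the whitespace
node, check the tag name) concrete, here is a minimal sketch in plain C++.
It deliberately does not use Xerces-C: Node, NodeType, makeText,
makeElement, buildSampleRelation and findChildElement are hypothetical
stand-ins invented for illustration, and the whitespace strings only
approximate the indentation of the posted file.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DOM node. This is NOT the Xerces-C DOMNode class;
// every name in this sketch is hypothetical.
enum class NodeType { Element, Text };

struct Node {
    NodeType type = NodeType::Text;
    std::string name;                          // tag name (elements only)
    std::string text;                          // character data (text nodes only)
    std::map<std::string, std::string> attrs;  // attributes (elements only)
    std::vector<Node> children;                // child nodes in document order
};

Node makeText(std::string s) {
    Node n;
    n.type = NodeType::Text;
    n.text = std::move(s);
    return n;
}

Node makeElement(std::string name,
                 std::map<std::string, std::string> attrs = {}) {
    Node n;
    n.type = NodeType::Element;
    n.name = std::move(name);
    n.attrs = std::move(attrs);
    return n;
}

// Builds the tree described in the thread:
// relation -> [whitespace text, subtype element, whitespace text].
Node buildSampleRelation() {
    Node relation = makeElement(
        "relation", {{"entry1", "43"}, {"entry2", "48"}, {"type", "ECrel"}});
    relation.children.push_back(makeText("\n         "));
    relation.children.push_back(
        makeElement("subtype", {{"name", "compound"}, {"value", "88"}}));
    relation.children.push_back(makeText("\n     "));
    return relation;
}

// Walks the children in order and returns the first one that is an element
// with the requested tag name, skipping text nodes and other elements.
// Returns nullptr when no such child exists.
const Node* findChildElement(const Node& parent, const std::string& tag) {
    for (const Node& child : parent.children) {
        if (child.type == NodeType::Element && child.name == tag) {
            return &child;
        }
    }
    return nullptr;
}
```

In this model the first child of "relation" is a text node, which matches
what method 1 ran into. With the real library the same walk would
presumably use DOMNode::getFirstChild() and getNextSibling(), compare
getNodeType() against DOMNode::ELEMENT_NODE, and then read the attribute
through DOMElement::getAttribute(). That mapping is not exercised by this
sketch.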

Regards
/ Erik
