Posted to xml-commons-dev@xerces.apache.org by Jacob Kjome <ho...@visi.com> on 2009/06/01 17:43:27 UTC

Re: Proposal - A Very Simple API for Reading Simple XML Data

Have you considered commons-configuration [1]?  It seems to me it does what 
you want, while being very flexible and robust.


[1] http://commons.apache.org/configuration/

Jake

On Sun, 31 May 2009 09:24:48 -0700 (PDT)
  chris0 <te...@gmail.com> wrote:
> 
> Hello,
> 
> I'm making this proposal because I couldn't find a good solution to a
> problem that I recently had.
> 
> What I wanted to do was to configure an application with a simple XML file,
> something along the lines of:
> 
> <config>
> 	<title>test</title>
> 	<version
> 		major="1"
> 		minor="2"/>
> 	<roles>
> 		<role name="admin"/>
> 		<role name="user"/>
> 	</roles>
> 	<users>
> 		<user name="joe" password="pass" role="admin"/>
> 		<user name="harry" password="secret" role="user"/>
> 	</users>
> </config>
> 
> So the point is that it's really a very simple bit of XML data and what I
> wanted was a nice easy way to read it in. The current options seem to be
> SAX, DOM or JAXB.
> 
> To use JAXB I need an XSD which is just complicating things too much for a
> simple config file. SAX is a bit low-level and complicated too. The best
> option is probably DOM, but even DOM is quite verbose. In fact DOM is verbose
> enough that it made me think for a while if I could just use some other
> mechanism such as a properties file so that I didn't have to bother with
> DOM.
> 
> For the purpose of this proposal I've written what I think is pretty much
> the most concise way to parse this data with DOM. Also note that I want to
> provide some basic error reporting if the XML data is not in the expected
> format. The code follows below:
> 
> 	FileInputStream fileInputStream = null;
> 	try
> 	{
> 		String rootName = "config";
> 		fileInputStream = new FileInputStream("config.xml");		
> 		DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
> 		DocumentBuilder builder = builderFactory.newDocumentBuilder();
> 		Document document = builder.parse(fileInputStream);
> 		Element rootElement = document.getDocumentElement();
> 		if(!rootElement.getNodeName().equals(rootName)) 
> 			throw new Exception("Could not find root node: "+rootName);
> 
> 		NodeList titles = rootElement.getElementsByTagName("title");
> 		if(titles.getLength()!=1) throw new Exception("Could not find individual node: title");
> 		Node title = titles.item(0);
> 		out.println("title: "+title.getTextContent());
> 		
> 		NodeList versions = rootElement.getElementsByTagName("version");
> 		if(versions.getLength()!=1) throw new Exception("Could not find individual node: version");
> 		Node version = versions.item(0);
> 		NamedNodeMap versionAttributes = version.getAttributes();
> 		Node major = versionAttributes.getNamedItem("major");
> 		if(major==null) throw new Exception("Could not find attribute: major");
> 		Node minor = versionAttributes.getNamedItem("minor");
> 		if(minor==null) throw new Exception("Could not find attribute: minor");
> 		out.println(
> 			"version: "+Integer.parseInt(major.getNodeValue())+"."+Integer.parseInt(minor.getNodeValue()));
> 		
> 		NodeList roles = rootElement.getElementsByTagName("roles");
> 		if(roles.getLength()!=1) throw new Exception("Could not find individual node: roles");
> 		
> 		NodeList roleList = ((Element)roles.item(0)).getElementsByTagName("role");
> 		int n = roleList.getLength();
> 		for(int i=0;i<n;i++)
> 		{
> 			Node role = roleList.item(i);
> 			Node roleName = role.getAttributes().getNamedItem("name");
> 			if(roleName==null) throw new Exception("Could not find attribute: name");
> 			out.println("role: name: "+roleName.getNodeValue());
> 		}
> 		
> 		NodeList users = rootElement.getElementsByTagName("users");
> 		if(users.getLength()!=1) throw new Exception("Could not find individual node: users");
> 		
> 		NodeList userList = ((Element)users.item(0)).getElementsByTagName("user");
> 		n = userList.getLength();
> 		for(int i=0;i<n;i++)
> 		{
> 			Node user = userList.item(i);
> 			NamedNodeMap userAttributes = user.getAttributes();
> 			Node userName = userAttributes.getNamedItem("name");
> 			if(userName==null) throw new Exception("Could not find attribute: name");
> 			Node userPassword = userAttributes.getNamedItem("password");
> 			if(userPassword==null) throw new Exception("Could not find attribute: password");
> 			Node userRole = userAttributes.getNamedItem("role");
> 			if(userRole==null) throw new Exception("Could not find attribute: role");
> 			out.println(
> 				"user: name: "+userName.getNodeValue()+
> 				", password: "+userPassword.getNodeValue()+
> 				", role: "+userRole.getNodeValue());
> 		}
> 	}
> 	finally
> 	{
> 		if(fileInputStream!=null) fileInputStream.close();
> 	}
> 	
> Output:
> 
> 	title: test
> 	version: 1.2
> 	role: name: admin
> 	role: name: user
> 	user: name: joe, password: pass, role: admin
> 	user: name: harry, password: secret, role: user
> 
> And it's pretty verbose, 65 lines in all.
> 
> What I then decided to do was to write a very simple utility class called
> XmlData that could be used to get this information as easily as possible,
> and that would have basic built-in error reporting. When using the XmlData
> class it is possible to write equivalent code to the above in a fraction of
> the number of lines:
> 
> 	XmlData config = new XmlData("config.xml","config");
> 
> 	out.println("title: "+config.child("title").content());
> 	
> 	XmlData version = config.child("version");
> 	out.println("version: "+version.integer("major")+"."+version.integer("minor"));
> 	
> 	for(XmlData role:config.child("roles").children("role")) out.println("role: name: "+role.string("name"));
> 	
> 	for(XmlData user:config.child("users").children("user"))
> 	{
> 		out.println(
> 			"user: name: "+user.string("name")+
> 			", password: "+user.string("password")+
> 			", role: "+user.string("role"));
> 	}
> 
> Output:
> 
> 	title: test
> 	version: 1.2
> 	role: name: admin
> 	role: name: user
> 	user: name: joe, password: pass, role: admin
> 	user: name: harry, password: secret, role: user
> 
> As you can see, this is a really simple way to read basic XML files. All
> nodes, attributes and content can very easily be accessed, and only one
> class is required.
> 
> The XmlData class uses DOM to read an XML file then builds its own
> representation of the tree in a very simple form, with only the basic data
> present and all data exposed using standard Collections classes where
> appropriate. The XmlData class source is included at the end of this post.
> 
> So in summary, the main point of this single utility class, XmlData, is to
> provide a very simple and user-friendly way of reading simple XML files that
> may contain simple data such as startup configurations. This class is not
> supposed to be a substitute for DOM and is not intended for high-performance
> scenarios, rather to make XML as easily accessible as possible.
> 
> [One improvement to this may be to base XmlData on SAX rather than DOM in
> order to increase performance]
> 
> My proposal is perhaps to see something like XmlData as a Commons utility
> class.
> 
> All comments very welcome.
> 
> Thanks,
> 
> Chris.
> 
> --
> 
> package main;
> 
> import java.io.FileInputStream;
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
> 
> import javax.xml.parsers.DocumentBuilder;
> import javax.xml.parsers.DocumentBuilderFactory;
> 
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.NamedNodeMap;
> import org.w3c.dom.Node;
> import org.w3c.dom.NodeList;
> 
> public class XmlData
> {
> 	private static Element rootElement(String filename, String rootName) throws Exception
> 	{
> 		FileInputStream fileInputStream = null;
> 		try
> 		{
> 			fileInputStream = new FileInputStream(filename);
> 			DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
> 			DocumentBuilder builder = builderFactory.newDocumentBuilder();
> 			Document document = builder.parse(fileInputStream);
> 			Element rootElement = document.getDocumentElement();
> 			if(!rootElement.getNodeName().equals(rootName))
> 				throw new RuntimeException("Could not find root node: "+rootName);
> 			return rootElement;
> 		}
> 		finally
> 		{
> 			if(fileInputStream!=null) fileInputStream.close();
> 		}
> 	}
> 	
> 	public XmlData(String filename, String rootName) throws Exception
> 	{
> 		this(rootElement(filename,rootName));
> 	}
> 	
> 	private XmlData(Element element)
> 	{
> 		this.name = element.getNodeName();
> 		this.content = element.getTextContent();
> 		NamedNodeMap namedNodeMap = element.getAttributes();
> 		int n = namedNodeMap.getLength();
> 		for(int i=0;i<n;i++)
> 		{
> 			Node node = namedNodeMap.item(i);
> 			String name = node.getNodeName();
> 			addAttribute(name,node.getNodeValue());
> 		}
> 		NodeList nodes = element.getChildNodes();
> 		n = nodes.getLength();
> 		for(int i=0;i<n;i++)
> 		{
> 			Node node = nodes.item(i);
> 			int type = node.getNodeType();
> 			if(type==Node.ELEMENT_NODE) addChild(node.getNodeName(),new XmlData((Element)node));
> 		}
> 	}
> 	
> 	private void addAttribute(String name, String value)
> 	{
> 		nameAttributes.put(name,value);
> 	}
> 	
> 	private void addChild(String name, XmlData child)
> 	{
> 		List<XmlData> children = nameChildren.get(name);
> 		if(children==null)
> 		{
> 			children = new ArrayList<XmlData>();
> 			nameChildren.put(name,children);
> 		}
> 		children.add(child);
> 	}
> 	
> 	public String name()
> 	{
> 		return name;
> 	}
> 	
> 	public String content()
> 	{
> 		return content;
> 	}
> 	
> 	public XmlData child(String name) throws Exception
> 	{
> 		List<XmlData> children = children(name);
> 		if(children.size()!=1) throw new Exception("Could not find individual child node: "+name);
> 		return children.get(0);
> 	}
> 	
> 	public List<XmlData> children(String name)
> 	{
> 		List<XmlData> children = nameChildren.get(name);
> 		return children==null ? Collections.<XmlData>emptyList() : children;
> 	}
> 	
> 	public String string(String name) throws Exception
> 	{
> 		String value = nameAttributes.get(name);
> 		if(value==null) throw new Exception("Could not find attribute: "+name+", in node: "+this.name);
> 		return value;
> 	}
> 	
> 	public int integer(String name) throws Exception
> 	{
> 		return Integer.parseInt(string(name)); 
> 	}
> 	
> 	private String name;
> 	private String content;
> 	private Map<String,String> nameAttributes = new HashMap<String,String>();
> 	private Map<String,List<XmlData>> nameChildren = new HashMap<String,List<XmlData>>();
> }
> -- 
> View this message in context: 
> http://www.nabble.com/Proposal---A-Very-Simple-API-for-Reading-Simple-XML-Data-tp23804602p23804602.html
> Sent from the Apache XML - Commons - Dev mailing list archive at Nabble.com.
> 
> 
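
The bracketed suggestion in the quoted proposal, basing XmlData on SAX
rather than DOM, might look roughly like the sketch below. It is not the
poster's code: the class name SaxXmlData and the parse/parseString methods
are hypothetical, it assumes Java 8 or later, and it keeps only the
accessors used in the proposal. One behavioural difference to be aware of:
content() here holds only an element's own text, whereas DOM's
getTextContent() also includes the text of descendants.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX-backed variant of XmlData; all names here are illustrative.
class SaxXmlData
{
	private final String name;
	private final StringBuilder content = new StringBuilder();
	private final Map<String,String> attributes = new HashMap<>();
	private final Map<String,List<SaxXmlData>> children = new HashMap<>();

	private SaxXmlData(String name) { this.name = name; }

	// Builds the tree in one SAX pass, then checks the root element's name.
	public static SaxXmlData parse(InputSource source, String rootName) throws Exception
	{
		final Deque<SaxXmlData> open = new ArrayDeque<>();	// elements not yet closed
		final SaxXmlData[] root = new SaxXmlData[1];
		DefaultHandler handler = new DefaultHandler()
		{
			@Override
			public void startElement(String uri, String localName, String qName, Attributes atts)
			{
				SaxXmlData node = new SaxXmlData(qName);
				for(int i=0;i<atts.getLength();i++) node.attributes.put(atts.getQName(i),atts.getValue(i));
				if(open.isEmpty()) root[0] = node;
				else open.peek().addChild(node);
				open.push(node);
			}
			@Override
			public void characters(char[] ch, int start, int length)
			{
				// Text is appended to the innermost open element only.
				if(!open.isEmpty()) open.peek().content.append(ch,start,length);
			}
			@Override
			public void endElement(String uri, String localName, String qName) { open.pop(); }
		};
		SAXParserFactory.newInstance().newSAXParser().parse(source,handler);
		if(!root[0].name.equals(rootName)) throw new Exception("Could not find root node: "+rootName);
		return root[0];
	}

	// Convenience for XML held in a String (handy for tests).
	public static SaxXmlData parseString(String xml, String rootName) throws Exception
	{
		return parse(new InputSource(new StringReader(xml)),rootName);
	}

	private void addChild(SaxXmlData child)
	{
		children.computeIfAbsent(child.name,k -> new ArrayList<>()).add(child);
	}

	public String content() { return content.toString(); }

	public List<SaxXmlData> children(String name)
	{
		return children.getOrDefault(name,Collections.<SaxXmlData>emptyList());
	}

	public SaxXmlData child(String name) throws Exception
	{
		List<SaxXmlData> list = children(name);
		if(list.size()!=1) throw new Exception("Could not find individual child node: "+name);
		return list.get(0);
	}

	public String string(String name) throws Exception
	{
		String value = attributes.get(name);
		if(value==null) throw new Exception("Could not find attribute: "+name+", in node: "+this.name);
		return value;
	}

	public int integer(String name) throws Exception { return Integer.parseInt(string(name)); }

	public static void main(String[] args) throws Exception
	{
		SaxXmlData config = parseString(
			"<config><title>test</title><version major=\"1\" minor=\"2\"/></config>","config");
		System.out.println("title: "+config.child("title").content());
		System.out.println("version: "+config.child("version").integer("major")+"."+config.child("version").integer("minor"));
	}
}
```

To read a file instead of a String, an InputSource wrapping a
FileInputStream can be passed to parse(). The calling code stays the same
shape as in the proposal (child, children, string, integer), so whether the
SAX pass is measurably faster for small config files would need to be
checked rather than assumed.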


Re: Proposal - A Very Simple API for Reading Simple XML Data

Posted by Gary Rowe <g....@froot.co.uk>.
After stumbling upon this answer I thought I'd do the conversion and compare
the two approaches. 

I happened to have used the original poster's method but didn't like it
because it seemed to move the persistence meta-data out of the model and
into an external adapter. Some people may like that, but I prefer to have
the persistence meta-data in the form of annotations right next to the model
(or DTO if you prefer). 

So I made the conversion and using XStream I was able to remove vast tracts
of marshalling code and condense it into a single 3 line templated method
in a base class. All the mapping information is now held in simple to
understand annotations, and specialised date handling is managed through an
XStream converter. 

Now for the all important performance figures. In my particular
configuration that requires a *lot* of XML parsing and mucking about with
various InputStreams, XStream came out on top - which is a relief given the
benefits that it brings. The times to completion were: 

Traditional XML through DocumentFactory and Node analysis=32s
XStream=26s

Not a vast difference (~20%) but enough to justify XStream as the better way
to go.

Hope this helps someone.

Gary  


Bruno Borges wrote:
> 
> Try XStream. :-)
> 
> Cheers,
> Bruno
> 

-- 
View this message in context: http://www.nabble.com/Proposal---A-Very-Simple-API-for-Reading-Simple-XML-Data-tp23804602p25344422.html


Re: Proposal - A Very Simple API for Reading Simple XML Data

Posted by Bruno Borges <br...@gmail.com>.
Try XStream. :-)

Cheers,
Bruno


chris0 wrote:
> 
> Hi Jake,
> 
> Thanks for pointing that out - I didn't know about that.
> 
> First impressions are that yes it's [very] flexible, my only criticisms
> being that it doesn't use generics, and that it's a pretty big API, whereas
> the single-class solution is nice and light weight and probably caters for
> the majority of use cases.
> 
> Anyway, whether that warrants another interface is questionable... probably
> not.
> 
> Cheers,
> Chris.


Jacob Kjome wrote:
> 
> Have you considered commons-configuration [1]?  It seems to me it does
> what 
> you want, while being very flexible and robust.
> 
> 
> [1] http://commons.apache.org/configuration/
> 
> Jake
> 
> 
> 


-----
Bruno Borges
blog.brunoborges.com.br
+55 21 76727099

"The glory of great men should always be
measured by the means they have used to
acquire it."
- Francois de La Rochefoucauld
-- 
View this message in context: http://www.nabble.com/Proposal---A-Very-Simple-API-for-Reading-Simple-XML-Data-tp23804602p24606592.html


Re: Proposal - A Very Simple API for Reading Simple XML Data

Posted by chris0 <te...@gmail.com>.
Hi Jake,

Thanks for pointing that out - I didn't know about that.

First impressions are that yes it's [very] flexible, my only criticisms
being that it doesn't use generics, and that it's a pretty big API, whereas
the single-class solution is nice and light weight and probably caters for
the majority of use cases.

Anyway, whether that warrants another interface is questionable... probably
not.

Cheers,
Chris.


Jacob Kjome wrote:
> 
> Have you considered commons-configuration [1]?  It seems to me it does what 
> you want, while being very flexible and robust.
> 
> 
> [1] http://commons.apache.org/configuration/
> 
> Jake


-- 
View this message in context: http://www.nabble.com/Proposal---A-Very-Simple-API-for-Reading-Simple-XML-Data-tp23804602p23823138.html