Posted to java-dev@axis.apache.org by Dennis Sosnoski <dm...@sosnoski.com> on 2006/03/30 01:36:26 UTC

[Axis2] Axiom restructuring proposal

Right now the Axiom code is explicitly tied to StAX as the source of 
data for constructing the document tree. This creates problems in 
working with data binding frameworks which do not support marshalling 
via a StAX XMLStreamReader, and limits the usability of Axiom across a 
wider range of applications.

On the output side, Axiom generally uses XMLStreamWriter but also 
defines OMNode methods taking an OutputStream, a Writer, and an 
OMOutputImpl, as well as variations with OMOutputFormat included, and 
variations for serialize vs. serializeAndConsume. This proliferation of 
methods adds a lot of complexity to the interface while requiring all 
components of the tree to handle each of these forms of output. For 
components representing marshalling output from data binding frameworks 
these variations also force inefficient handling of output (for 
instance, by writing to an XMLStreamWriter when the framework can more 
efficiently write directly to an OutputStream).

I'd like to see Axiom instead define more generic interfaces for 
handling various data sources and output mechanisms. Really all that's 
required from potential data sources is that (1) the element holding 
some data can be converted to an Axiom tree structure on demand 
(possibly piecemeal, as in the case of elements backed by an 
XMLStreamReader), and (2) the data source can write itself to an 
OutputStream, Writer and/or XMLStreamWriter. We can abstract out these 
operations to the data source for unexpanded elements. This will allow 
cleaner handling of data binding framework extensions to Axis2, while 
also allowing flexibility for developers who have their own ways of 
processing XML (see http://issues.apache.org/jira/browse/AXIS2-483 for 
an example).

Here's a first cut at an interface for the data source of an unexpanded 
element:

public interface BackingData {
    // expand the element information
    void expandElement(OMElement element);

    // expand next content item of element
    void expandNextContent(OMElement element);

    // check if data source can be used repeatedly
    // (may avoid the need for expansion if so)
    boolean isReusable();

    // serialize using any supported approach
    void serialize(SerializationTarget target);
}

When the expandElement() method of the BackingData is called, it would 
populate at least the element's attribute and namespace information. 
When the expandNextContent() method is called, it would be the 
responsibility of that BackingData instance to construct at least the 
next content node of the element. If that next content node is an 
element, the BackingData would be able to leave that element unexpanded 
and attach itself to the element. The idea here is to be flexible enough 
to handle both elements backed by an XMLStreamReader and those backed by 
data binding or an alternative form of XML handling. The 
expandElement()/expandNextContent() methods would need to be called in 
proper document order, so that if the data is coming from an 
XMLStreamReader it will be read sequentially (no expandNextContent() 
higher in the tree until all the content before that point in document 
order has been expanded).
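
A minimal sketch of how a StAX-backed data source might honor this
ordering is shown here. It is illustrative only: Element is a
hypothetical stand-in for OMElement, StaxBackingData covers just the two
expansion methods, namespaces are skipped, and checked StAX exceptions
are wrapped so the method signatures match the proposal. As a
simplification, child elements are expanded eagerly rather than left
unexpanded, which trivially keeps the reads in document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class LazyExpansionSketch {

    /** Hypothetical stand-in for OMElement (not the Axiom class). */
    static final class Element {
        String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Object> content = new ArrayList<>(); // String text or child Element
        boolean complete; // true once the end tag has been read
    }

    /** Simplified BackingData that pulls events from an XMLStreamReader. */
    static final class StaxBackingData {
        private final XMLStreamReader reader;

        private StaxBackingData(XMLStreamReader reader) {
            this.reader = reader;
        }

        /** Opens the XML and advances the reader to the root start tag. */
        static StaxBackingData open(String xml) {
            try {
                XMLInputFactory factory = XMLInputFactory.newInstance();
                factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
                XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
                while (r.getEventType() != XMLStreamConstants.START_ELEMENT) {
                    r.next();
                }
                return new StaxBackingData(r);
            } catch (XMLStreamException e) {
                throw new IllegalStateException(e);
            }
        }

        /** Fills in name and attributes; the reader must be on this element's start tag. */
        void expandElement(Element element) {
            element.name = reader.getLocalName();
            for (int i = 0; i < reader.getAttributeCount(); i++) {
                element.attributes.put(reader.getAttributeLocalName(i),
                                       reader.getAttributeValue(i));
            }
        }

        /** Builds the next content item, or marks the element complete at its end tag. */
        void expandNextContent(Element element) {
            try {
                while (!element.complete && reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT: {
                            Element child = new Element();
                            expandElement(child);
                            while (!child.complete) {
                                expandNextContent(child); // eager, for simplicity
                            }
                            element.content.add(child);
                            return;
                        }
                        case XMLStreamConstants.CHARACTERS:
                            element.content.add(reader.getText());
                            return;
                        case XMLStreamConstants.END_ELEMENT:
                            element.complete = true;
                            return;
                        default:
                            break; // ignore comments, processing instructions, etc.
                    }
                }
                element.complete = true; // nothing left to read
            } catch (XMLStreamException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        StaxBackingData data = StaxBackingData.open("<a id=\"1\"><b>x</b>tail</a>");
        Element root = new Element();
        data.expandElement(root);
        System.out.println(root.name + " " + root.attributes + " items=" + root.content.size());
        while (!root.complete) {
            data.expandNextContent(root);
            System.out.println("items=" + root.content.size() + " complete=" + root.complete);
        }
    }
}
```

After expandElement() only the name and attributes are present; each
expandNextContent() call then adds at most one content item, so a
caller that never asks for content never pays for building it.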

Here's a first cut at an interface for the serialization handling:

public interface SerializationTarget {
    // return output stream if available for direct output, otherwise null
    OutputStream getOutputStream();

    // return writer if available for direct output, otherwise null
    Writer getWriter();

    // return XMLStreamWriter (always available)
    XMLStreamWriter getXMLWriter();

    // check if "optimizable" data should be sent as attachment
    boolean isAttachable(String contentType, long estimatedSize);

    // add attachment in the form of a stream (returns content id)
    String addAttachment(String contentType, InputStream is);

    // add attachment in the form of a byte array (returns content id)
    String addAttachment(String contentType, byte[] bytes);
}

The first part of this interface is the basic output handling. The rule 
here is that every SerializationTarget will supply an XMLStreamWriter on 
demand, but will also supply either an OutputStream or a Writer (so 
either of the first two methods may return null, but not both). The 
principle here is that many forms of XML handling can write directly to 
an output stream or writer but not to an XMLStreamWriter, while the 
latter provides a flush() method which should make it safe for the 
output stream or writer to be used independently for XML fragments - so 
use the XMLStreamWriter for the envelope, if that's what you want, but 
still use the stream or writer to output the body of the document.
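
To make the mixed-output rule concrete, here is a rough sketch, not a
proposed implementation. SimpleTarget, writeMessage() and render() are
hypothetical names, and only the output half of the interface is
included. One assumption deserves a warning: flush() alone is not
documented to close a start tag that the XMLStreamWriter still has
pending, so the sketch issues an empty writeCharacters() call first.
That works with the StAX implementation bundled in the JDK, but the
StAX API does not promise it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class DirectOutputSketch {

    /** Output half of the proposed SerializationTarget interface. */
    interface SerializationTarget {
        OutputStream getOutputStream(); // null if not available
        Writer getWriter();             // null if not available
        XMLStreamWriter getXMLWriter(); // always available
    }

    /** Hypothetical target: exactly one of stream/writer is set; the XML writer wraps it. */
    static final class SimpleTarget implements SerializationTarget {
        private final OutputStream stream;
        private final Writer writer;
        private final XMLStreamWriter xml;

        private SimpleTarget(OutputStream stream, Writer writer, XMLStreamWriter xml) {
            this.stream = stream;
            this.writer = writer;
            this.xml = xml;
        }

        static SimpleTarget forStream(OutputStream out) throws XMLStreamException {
            XMLOutputFactory factory = XMLOutputFactory.newInstance();
            return new SimpleTarget(out, null, factory.createXMLStreamWriter(out, "UTF-8"));
        }

        static SimpleTarget forWriter(Writer w) throws XMLStreamException {
            XMLOutputFactory factory = XMLOutputFactory.newInstance();
            return new SimpleTarget(null, w, factory.createXMLStreamWriter(w));
        }

        @Override public OutputStream getOutputStream() { return stream; }
        @Override public Writer getWriter() { return writer; }
        @Override public XMLStreamWriter getXMLWriter() { return xml; }
    }

    /** Writes the wrapper elements via XMLStreamWriter and the body text directly. */
    static void writeMessage(SerializationTarget target, String bodyXml)
            throws XMLStreamException, IOException {
        XMLStreamWriter xml = target.getXMLWriter();
        xml.writeStartElement("Envelope");
        xml.writeStartElement("Body");
        xml.writeCharacters(""); // assumption: makes the writer emit the pending '>'
        xml.flush();             // push buffered wrapper markup to the shared sink

        OutputStream out = target.getOutputStream();
        if (out != null) {
            out.write(bodyXml.getBytes(StandardCharsets.UTF_8));
            out.flush();
        } else {
            Writer w = target.getWriter(); // non-null by the "not both null" rule
            w.write(bodyXml);
            w.flush();
        }

        xml.writeEndElement(); // Body
        xml.writeEndElement(); // Envelope
        xml.flush();
    }

    /** Demo helper: renders a message to a String through a stream or a writer. */
    static String render(boolean useStream, String bodyXml) {
        try {
            if (useStream) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                writeMessage(SimpleTarget.forStream(out), bodyXml);
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
            StringWriter w = new StringWriter();
            writeMessage(SimpleTarget.forWriter(w), bodyXml);
            return w.toString();
        } catch (XMLStreamException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(true, "<payload>42</payload>"));
        System.out.println(render(false, "<payload>42</payload>"));
    }
}
```

The body string stands in for whatever a data binding framework would
write on its own; it never passes through the XMLStreamWriter, which is
the efficiency gain the interface is meant to allow.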

The second part of this interface deals with attachments. It gives the 
SerializationTarget (which would be transport-dependent, of course) 
control over what actually gets sent as an attachment, and provides the 
data to be output as an attachment in the form of either a stream or an 
array of bytes. This would allow us to fix the current broken output 
behavior which forces generation of a fully-expanded OM tree for every 
message being sent, just so the transport code can check for anything it 
wants to send as an attachment.
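
The attachment decision can be sketched as follows, under some loud
assumptions: AttachmentTarget is only the byte-array part of the
interface above, ThresholdTarget is a made-up transport policy (attach
anything at or above a size limit), and the "data" element, its "href"
attribute and the "part-N" content ids are invented for illustration.
The point is that the data source asks the target while it serializes,
so nothing has to be expanded into a tree first.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class AttachmentSketch {

    /** Attachment half of the proposed SerializationTarget (byte[] variant only). */
    interface AttachmentTarget {
        boolean isAttachable(String contentType, long estimatedSize);
        String addAttachment(String contentType, byte[] bytes);
    }

    /** Hypothetical transport policy: attach anything at or above a size threshold. */
    static final class ThresholdTarget implements AttachmentTarget {
        final List<byte[]> attachments = new ArrayList<>();
        private final long threshold;

        ThresholdTarget(long threshold) {
            this.threshold = threshold;
        }

        @Override
        public boolean isAttachable(String contentType, long estimatedSize) {
            return estimatedSize >= threshold;
        }

        @Override
        public String addAttachment(String contentType, byte[] bytes) {
            attachments.add(bytes);
            return "part-" + attachments.size(); // made-up content id scheme
        }
    }

    /** Data-source side: hand the bytes to the target if it wants them, else inline them. */
    static String serializeBinary(AttachmentTarget target, String contentType, byte[] data) {
        if (target.isAttachable(contentType, data.length)) {
            String contentId = target.addAttachment(contentType, data);
            return "<data href=\"cid:" + contentId + "\"/>";
        }
        return "<data>" + Base64.getEncoder().encodeToString(data) + "</data>";
    }

    public static void main(String[] args) {
        ThresholdTarget target = new ThresholdTarget(1024);
        System.out.println(serializeBinary(target, "application/octet-stream", new byte[] {1, 2, 3}));
        System.out.println(serializeBinary(target, "application/octet-stream", new byte[2048]));
        System.out.println("attachments queued: " + target.attachments.size());
    }
}
```

A transport that cannot send attachments at all would simply return
false from isAttachable(), and every data source would fall back to
inline content without any special casing.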

I'm planning to join the chat later today if anyone wants to discuss 
these ideas there (or by email, of course).

  - Dennis

-- 
Dennis M. Sosnoski
SOA, Web Services, and XML
Training and Consulting
http://www.sosnoski.com - http://www.sosnoski.co.nz
Seattle, WA +1-425-296-6194 - Wellington, NZ +64-4-298-6117