Posted to java-dev@axis.apache.org by Glen Daniels <gl...@thoughtcraft.com> on 2006/03/30 06:40:41 UTC

[Axis2] CHAT LOG : 2006-03-29

...attached.  Discussed Dennis' proposal for AXIOM refactoring to enable 
lower-level MTOM support for DB frameworks.  We got to consensus on the 
receiving side, but didn't have a chance to finish discussing the 
sending side, which we hope to do via email ASAP.

--Glen

Re: [Axis2] Data binding attachments support (was: CHAT LOG : 2006-03-29)

Posted by Thilina Gunarathne <cs...@gmail.com>.
Hi,
> May be i am missing something...the difference in my mind is a person
> implementing a databinding layer should be able to access the
> attachements without having to build the om tree. straight from stax
> to java objects with no om and use whatever they need to store the
> attachments byte arrays or data handlers or some databinding specific
> construct.
IMO this is possible even in the current implementation: they can
use the functionality provided by the MIMEHelper. I accept
that we need to come up with a much better API with much more
functionality, especially for the outflow.

<quoting my earlier post>
But the OM will **not** get created irrespective of whether the
Caching is ON or OFF.. Remaining InputStream for the Envelope is
buffered in a File or in memory as a FileDataSource or
MemoryDataSource depending on the size..
</quote>

As I mentioned in my earlier post in this thread, the MIME parser
operates one level below StAX+OM, and we buffer the
envelope at that level whenever we need to access the attachments.

Worried whether I'm making any sense....

~Thilina


>
> -- dims
>
> On 3/31/06, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
> > On Fri, 2006-03-31 at 15:35 +0600, Thilina Gunarathne wrote:
> > > <quote Glen>
> > > What we want is a "thingy" which can be stored away and LATER used to
> > > get the real attachment data after all XML pulling is done
> > > </quote>
> > >
> > > IMHO we are doing the exactly the same thing using OMTexts.. Currently
> > > we are doing this with more flexibility and of course with a catch..
> > > Flexibility is given by allowing the users to get the real attachment
> > > data if they wish while pulling the XML.. Catch is that we are
> > > buffering the SOAP Envelope in the MIME Parser which is underneath the
> > > StaxReader...
> >
> > Thilina, I don't see this as a catch - isn't it impossible to get to any
> > attachments without buffering the SOAP envelope? Or are you thinking
> > about reading the SOAP envelope and buffering it IFF someone actually
> > refers to an attachment?
> >
> > While that's interesting in theory, whoever sent the attachment more
> > often than not expected the other end to read the darn thing. I don't
> > see the point of that potential optimization. Maybe I'm missing
> > something.
> >
> > Sanjiva.
> >
> >
>
>
> --
> Davanum Srinivas : http://wso2.com/blogs/
>


--
"May the SourcE be with u"
http://webservices.apache.org/~thilina/
http://thilinag.blogspot.com/                
http://www.bloglines.com/blog/Thilina

Re: [Axis2] Data binding attachments support

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
Yes, I think that's the crux of the matter - we need to be able to build 
a tree *if required* by ws-sec or such, but the common case is no ws-sec 
and hence no tree needed. And if you *do* use ws-sec, the OM 
tree-building is the least of your performance concerns.

For the JiBX interface to Axis2 I've got this part dummied out for now, 
but the intent is to just marshal output to a memory buffer and then 
build an OM tree from that output. The OMElement implementation that 
represents a JiBX data item will then just delegate to the OMElement 
constructed from the output. I'll split this apart and move the 
delegating OMElement implementation over to Axiom, both because this 
makes it easier to handle the continuing changes in the Axiom API and 
because that way the same technique can be used to support JAXB 2.0 and 
other data binding frameworks.

  - Dennis

Davanum Srinivas wrote:

>if ws-sec is turned on, then we force a build of the om tree...just
>like we do now.
>
>-- dims
>
>On 3/31/06, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
>  
>
>>On Fri, 2006-03-31 at 10:55 -0500, Davanum Srinivas wrote:
>>    
>>
>>>Let me try again...The DB framework will build the java objects
>>>directly from the MIME root part (this is the first step always!) and
>>>*then* accesses the other mime parts and sticks them where it is
>>>needed (or adds a reference) on the java objects that it already
>>>created. Except that OM tree is *never* built.
>>>      
>>>
>>Ah but that's inconsistent with XOP .. if you do XOP, then when you look
>>at the XML at the Infoset level (which is what you do when you look at
>>the root part thru Axiom) then you have to un-XOPify it and just see the
>>XML Infoset. There's no halfway point.
>>
>>What you're looking at is SwA .. MTOM is not that IMO.
>>
>>I guess we could put a flag saying "don't unXOPify" but that seems like
>>a hack.
>>
>>    
>>
>>> And on the sending
>>>side, it generates stax events directly from the the java objects into
>>>the MIME root part and adds the attachments into a bag while it is
>>>doing so...again no OM tree in the picture at all.
>>>      
>>>
>>Again, you're thinking like SwA and not like a single unified Infoset
>>that has the binary parts logically in it. Think of WS-Security- how
>>will your model work with WS-Sec turned on to sign the whole shebang?
>>
>>Sanjiva.
>>
>>
>>    
>>
>
>
>--
>Davanum Srinivas : http://wso2.com/blogs/
>
>  
>

Re: [Axis2] Data binding attachments support

Posted by Davanum Srinivas <da...@gmail.com>.
if ws-sec is turned on, then we force a build of the om tree...just
like we do now.

-- dims

On 3/31/06, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
> On Fri, 2006-03-31 at 10:55 -0500, Davanum Srinivas wrote:
> > Let me try again...The DB framework will build the java objects
> > directly from the MIME root part (this is the first step always!) and
> > *then* accesses the other mime parts and sticks them where it is
> > needed (or adds a reference) on the java objects that it already
> > created. Except that OM tree is *never* built.
>
> Ah but that's inconsistent with XOP .. if you do XOP, then when you look
> at the XML at the Infoset level (which is what you do when you look at
> the root part thru Axiom) then you have to un-XOPify it and just see the
> XML Infoset. There's no halfway point.
>
> What you're looking at is SwA .. MTOM is not that IMO.
>
> I guess we could put a flag saying "don't unXOPify" but that seems like
> a hack.
>
> >  And on the sending
> > side, it generates stax events directly from the the java objects into
> > the MIME root part and adds the attachments into a bag while it is
> > doing so...again no OM tree in the picture at all.
>
> Again, you're thinking like SwA and not like a single unified Infoset
> that has the binary parts logically in it. Think of WS-Security- how
> will your model work with WS-Sec turned on to sign the whole shebang?
>
> Sanjiva.
>
>


--
Davanum Srinivas : http://wso2.com/blogs/

Re: [Axis2] Data binding attachments support

Posted by Thilina Gunarathne <cs...@gmail.com>.
A quick comment before I run for my lecture :)
> > What you're looking at is SwA .. MTOM is not that IMO.
>
> Same deal for either really, just in one case you have hrefs and in
> another you have xop:Includes.
>
There's a huge difference. SwA results in two data models: one for
the XML and another for the attachments.
XOP avoids this. It's just this one xop:Include element, but it makes
a huge difference.
AFAIK when we see an href in some XML there's no way we can tell for
sure that there's a referenced attachment which should logically map
there; it's just an ordinary attribute. But whenever we see an
xop:Include we can definitely tell that there's an attachment
which should logically map to that point. This automatic mapping
results in just one data model: the XML Infoset with XOP.
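
To make the distinction concrete, here is a minimal Java sketch (not
Axis2 or Axiom code) that pulls StAX events and picks out xop:Include
elements by namespace and local name. The class and method names are
made up for illustration; the namespace URI is the one defined by the
XOP recommendation. A plain href attribute on any other element is
ignored, because nothing in the XML marks it as an attachment reference.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: finds attachment references in a root part. */
final class XopIncludeScanner {
    /** Namespace of the xop:Include element (XOP 1.0). */
    static final String XOP_NS = "http://www.w3.org/2004/08/xop/include";

    private XopIncludeScanner() {}

    /** Returns the content IDs named by xop:Include elements, in document order. */
    static List<String> findContentIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && XOP_NS.equals(reader.getNamespaceURI())
                        && "Include".equals(reader.getLocalName())) {
                    // Only an href on xop:Include is treated as an attachment reference.
                    String href = reader.getAttributeValue(null, "href");
                    if (href != null && href.startsWith("cid:")) {
                        ids.add(href.substring("cid:".length()));
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("root part is not well-formed XML", e);
        }
        return ids;
    }
}
```

The sketch does not percent-decode the content ID, which a real
implementation would need to do before matching it against the
Content-ID headers of the MIME parts.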

This point comes into play mostly when these two kinds of messages
are encountered by programs that look only at the payload, without
looking at the schema or anything else. One example is WS-Security.


~Thilina

Re: [Axis2] Data binding attachments support

Posted by Glen Daniels <gl...@thoughtcraft.com>.
Hey Sanjiva:

> On Fri, 2006-03-31 at 10:55 -0500, Davanum Srinivas wrote:
>> Let me try again...The DB framework will build the java objects
>> directly from the MIME root part (this is the first step always!) and
>> *then* accesses the other mime parts and sticks them where it is
>> needed (or adds a reference) on the java objects that it already
>> created. Except that OM tree is *never* built.
> 
> Ah but that's inconsistent with XOP .. if you do XOP, then when you look
> at the XML at the Infoset level (which is what you do when you look at
> the root part thru Axiom) then you have to un-XOPify it and just see the
> XML Infoset. There's no halfway point. 

Rrright, exactly.  The point is that while what you just said is 
perfectly true, you see the INFOSET (i.e. what you get out of 
XmlStreamReader) but not necessarily the OM tree.  So if a DB framework 
like JIBX wants to handle the XOP support itself, it can do that by 
doing something like this in the deserializer:

   ...deserializing from StAX events...
   if (currentElement.getQName().equals(XOP_INCLUDE)) {
     String contentID = getIDFromXopInclude();
     XOPThing binaryThing =
         new XOPThing(attachmentContext, contentID);
     // insert binaryThing into the object we're deserializing
   }
   ...continue...

XOPThing is this "future DataHandler" that we were talking about, maybe 
just a DataHandler.  But the point is it knows how to ask the 
AttachmentContext object (with an API like what Dennis proposes) for the 
actual InputStream and process it at the right time (obviously after all 
the XML has been pulled/deserialized from the root part - when the 
object tree is complete and the application asks for the image/data/etc).
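
A minimal Java sketch of such a deferred handle, assuming the
attachment layer exposes a lookup by content ID. AttachmentContext here
is reduced to a single hypothetical method, openAttachment, and
DeferredAttachment plays the role of XOPThing; neither is the API Dennis
proposed. InMemoryAttachmentContext only stands in for a MIME parser so
the handle can be exercised.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup API over the MIME parts of one message. */
interface AttachmentContext {
    InputStream openAttachment(String contentId) throws IOException;
}

/** Handle stored in the bound object; it reads nothing until asked. */
final class DeferredAttachment {
    private final AttachmentContext context;
    private final String contentId;

    DeferredAttachment(AttachmentContext context, String contentId) {
        this.context = context;
        this.contentId = contentId;
    }

    String getContentId() {
        return contentId;
    }

    /** Resolves the content ID only now, after XML deserialization is done. */
    InputStream getInputStream() throws IOException {
        return context.openAttachment(contentId);
    }
}

/** In-memory stand-in for a MIME parser, for demonstration only. */
final class InMemoryAttachmentContext implements AttachmentContext {
    private final Map<String, byte[]> parts = new HashMap<>();
    int opens = 0; // counts how often a part was actually opened

    void addPart(String contentId, byte[] data) {
        parts.put(contentId, data.clone());
    }

    @Override
    public InputStream openAttachment(String contentId) throws IOException {
        byte[] data = parts.get(contentId);
        if (data == null) {
            throw new IOException("no MIME part with content ID " + contentId);
        }
        opens++;
        return new ByteArrayInputStream(data);
    }
}
```

Because the handle holds only the context and the content ID, creating
it during deserialization costs nothing; the part is located when the
application first calls getInputStream().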

As I'm thinking about it I wonder if we want to add a configuration 
option which would automatically "pre-cache" all the attachments into 
files immediately upon parsing the end of the root part, or if that 
should indeed be the default behaviour.  In other words, do we want to 
support pausing the read on the actual HTTP InputStream until the 
application asks for an attachment, much as we do with the SOAP envelope?

> What you're looking at is SwA .. MTOM is not that IMO.

Same deal for either really, just in one case you have hrefs and in 
another you have xop:Includes.

> I guess we could put a flag saying "don't unXOPify" but that seems like
> a hack.

You don't need a flag other than "don't cache" - which is really "don't 
build the Object Model".  If OM isn't building the Object Model, it 
can't very well do unXOPification (where would it put anything?).

>>  And on the sending
>> side, it generates stax events directly from the the java objects into
>> the MIME root part and adds the attachments into a bag while it is
>> doing so...again no OM tree in the picture at all.

+1 dims.

> Again, you're thinking like SwA and not like a single unified Infoset
> that has the binary parts logically in it. Think of WS-Security- how
> will your model work with WS-Sec turned on to sign the whole shebang?

Great point.  WS-Security (or anything else which requires the full 
infoset to be preserved in the OM as well as the databound objects) will 
need to switch on a flag which indicates "build the object model 
always".  So really, on the receiving side that flag should be an option 
on the OM builder which overrides a call to 
getXMLStreamReaderWithoutCaching() and turns it into a caching call. 
This is transparent to both the security code (which wants an OM) and 
the DB code (which wants StAX events) - both sides can get at the 
attachments using the correct APIs.  The optimization only works when no 
one sets the "always build the OM" flag, but other than that it should 
work transparently.

The same is true on the outbound side - if the flag is set, the StAX 
events won't go directly out to the OutputStream representing the root 
part of the MIME envelope (as they could in the optimized case), but 
instead build up an OM so that a security (or checksum, or whatever) 
handler can deal with it later.
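
A toy Java model of the receiving-side flag, under heavy
simplification: the "wire" is a list of event names and the "object
model" is just a list of the events seen so far. EventSource,
setAlwaysBuild and cachedEvents are hypothetical names; only the idea
of a non-caching call being upgraded to a caching one comes from the
text above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for an OM builder sitting on top of a parser. */
final class EventSource {
    private final Iterator<String> wire;
    private final List<String> cache = new ArrayList<>();
    private boolean alwaysBuild;

    EventSource(List<String> wireEvents) {
        this.wire = wireEvents.iterator();
    }

    /** Set by a handler (security, checksum, ...) that needs the full infoset kept. */
    void setAlwaysBuild(boolean alwaysBuild) {
        this.alwaysBuild = alwaysBuild;
    }

    /** The caller asks for raw events; the flag silently turns this into a caching read. */
    Iterator<String> getReaderWithoutCaching() {
        return alwaysBuild ? getReader() : wire;
    }

    /** Caching read: every event handed out is also recorded in the model. */
    Iterator<String> getReader() {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return wire.hasNext();
            }

            @Override
            public String next() {
                String event = wire.next();
                cache.add(event);
                return event;
            }
        };
    }

    /** What a later handler would find in the model. */
    List<String> cachedEvents() {
        return Collections.unmodifiableList(cache);
    }
}
```

The data binding code calls getReaderWithoutCaching() in both cases and
sees the same events; only the presence of the flag decides whether
anything is left behind for a later handler.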

Make sense?

--Glen

Re: [Axis2] Data binding attachments support

Posted by Sanjiva Weerawarana <sa...@opensource.lk>.
On Fri, 2006-03-31 at 10:55 -0500, Davanum Srinivas wrote:
> Let me try again...The DB framework will build the java objects
> directly from the MIME root part (this is the first step always!) and
> *then* accesses the other mime parts and sticks them where it is
> needed (or adds a reference) on the java objects that it already
> created. Except that OM tree is *never* built.

Ah but that's inconsistent with XOP .. if you do XOP, then when you look
at the XML at the Infoset level (which is what you do when you look at
the root part thru Axiom) then you have to un-XOPify it and just see the
XML Infoset. There's no halfway point. 

What you're looking at is SwA .. MTOM is not that IMO.

I guess we could put a flag saying "don't unXOPify" but that seems like
a hack.

>  And on the sending
> side, it generates stax events directly from the the java objects into
> the MIME root part and adds the attachments into a bag while it is
> doing so...again no OM tree in the picture at all.

Again, you're thinking like SwA and not like a single unified Infoset
that has the binary parts logically in it. Think of WS-Security- how
will your model work with WS-Sec turned on to sign the whole shebang?

Sanjiva.


Re: [Axis2] Data binding attachments support

Posted by Davanum Srinivas <da...@gmail.com>.
Sanjiva,

/me scratches his head and wonders How come glen understands what he
says - "exactly what dims describes here" and sanjiva doesn't :)

Let me try again...The DB framework will build the java objects
directly from the MIME root part (this is the first step always!) and
*then* accesses the other mime parts and sticks them where it is
needed (or adds a reference) on the java objects that it already
created. Except that OM tree is *never* built. And on the sending
side, it generates stax events directly from the java objects into
the MIME root part and adds the attachments into a bag while it is
doing so...again no OM tree in the picture at all.

-- dims

On 3/31/06, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
> On Fri, 2006-03-31 at 09:52 -0500, Glen Daniels wrote:
> > Hi dims:
> >
> > > May be i am missing something...the difference in my mind is a person
> > > implementing a databinding layer should be able to access the
> > > attachements without having to build the om tree. straight from stax
> > > to java objects with no om and use whatever they need to store the
> > > attachments byte arrays or data handlers or some databinding specific
> > > construct.
>
> What's buffered are the bytes of the MIME root part that contains the
> SOAP envelope. I agree we shouldn't build the tree for that part (and
> I'm pretty certain we don't) until its needed but the bytes have to be
> read and buffered. Unless you guys know of some magic technology I don't
> see how that can be done any other way ;-) .. a stream is ordered you
> know and those bytes come after these bytes. Simple as that.
>
> > +1.  OM was built to allow you to optimize out the
> > tree-building/buffering for the normal XML case - you call
> > getXMLStreamReaderWithoutCaching() and go.  MTOM/attachments are sort of
> > the fly in the ointment there, in that you need another layer below StAX
> > in order to get at the attachments.  We've got that layer now, but it's
> > hidden and tightly coupled to the OM tree framework.  The suggestion is
> > simply to open it up so you can do exactly what dims describes here.
>
> Ah ok that part I agree with .. that getting to the MIME data via the
> MIMEHelper is a good thing. Is that what you're looking for?
>
> Sanjiva.
>
>


--
Davanum Srinivas : http://wso2.com/blogs/

Re: [Axis2] Data binding attachments support

Posted by Sanjiva Weerawarana <sa...@opensource.lk>.
On Fri, 2006-03-31 at 09:52 -0500, Glen Daniels wrote:
> Hi dims:
> 
> > May be i am missing something...the difference in my mind is a person
> > implementing a databinding layer should be able to access the
> > attachements without having to build the om tree. straight from stax
> > to java objects with no om and use whatever they need to store the
> > attachments byte arrays or data handlers or some databinding specific
> > construct.

What's buffered are the bytes of the MIME root part that contains the
SOAP envelope. I agree we shouldn't build the tree for that part (and
I'm pretty certain we don't) until it's needed but the bytes have to be
read and buffered. Unless you guys know of some magic technology I don't
see how that can be done any other way ;-) .. a stream is ordered you
know and those bytes come after these bytes. Simple as that.

> +1.  OM was built to allow you to optimize out the 
> tree-building/buffering for the normal XML case - you call 
> getXMLStreamReaderWithoutCaching() and go.  MTOM/attachments are sort of 
> the fly in the ointment there, in that you need another layer below StAX 
> in order to get at the attachments.  We've got that layer now, but it's 
> hidden and tightly coupled to the OM tree framework.  The suggestion is 
> simply to open it up so you can do exactly what dims describes here.

Ah ok that part I agree with .. that getting to the MIME data via the
MIMEHelper is a good thing. Is that what you're looking for?

Sanjiva.


Re: [Axis2] Data binding attachments support

Posted by Glen Daniels <gl...@thoughtcraft.com>.
Hi dims:

> May be i am missing something...the difference in my mind is a person
> implementing a databinding layer should be able to access the
> attachements without having to build the om tree. straight from stax
> to java objects with no om and use whatever they need to store the
> attachments byte arrays or data handlers or some databinding specific
> construct.

+1.  OM was built to allow you to optimize out the 
tree-building/buffering for the normal XML case - you call 
getXMLStreamReaderWithoutCaching() and go.  MTOM/attachments are sort of 
the fly in the ointment there, in that you need another layer below StAX 
in order to get at the attachments.  We've got that layer now, but it's 
hidden and tightly coupled to the OM tree framework.  The suggestion is 
simply to open it up so you can do exactly what dims describes here.

--Glen

Re: [Axis2] Data binding attachments support (was: CHAT LOG : 2006-03-29)

Posted by Davanum Srinivas <da...@gmail.com>.
Sanjiva,

Maybe I am missing something... the difference in my mind is that a
person implementing a databinding layer should be able to access the
attachments without having to build the OM tree: straight from StAX
to Java objects with no OM, using whatever they need to store the
attachments (byte arrays, data handlers, or some databinding-specific
construct).

-- dims

On 3/31/06, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
> On Fri, 2006-03-31 at 15:35 +0600, Thilina Gunarathne wrote:
> > <quote Glen>
> > What we want is a "thingy" which can be stored away and LATER used to
> > get the real attachment data after all XML pulling is done
> > </quote>
> >
> > IMHO we are doing the exactly the same thing using OMTexts.. Currently
> > we are doing this with more flexibility and of course with a catch..
> > Flexibility is given by allowing the users to get the real attachment
> > data if they wish while pulling the XML.. Catch is that we are
> > buffering the SOAP Envelope in the MIME Parser which is underneath the
> > StaxReader...
>
> Thilina, I don't see this as a catch - isn't it impossible to get to any
> attachments without buffering the SOAP envelope? Or are you thinking
> about reading the SOAP envelope and buffering it IFF someone actually
> refers to an attachment?
>
> While that's interesting in theory, whoever sent the attachment more
> often than not expected the other end to read the darn thing. I don't
> see the point of that potential optimization. Maybe I'm missing
> something.
>
> Sanjiva.
>
>


--
Davanum Srinivas : http://wso2.com/blogs/

Re: [Axis2] Data binding attachments support

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
Hi Thilina,

I agree that buffering the actual SOAP message body text in the case of 
attachments is fine, at least until we see some reason to change. If 
somebody's using attachments hopefully the SOAP message body is not 
going to be huge and buffering it will not add a lot of overhead. The 
actual attachments should not be buffered unless necessary, though (as 
when the user is accessing them out of order). If the user accesses the 
attachments in order there should be no need for buffering these.

There should also be no need for buffering attachments on output, in 
general. Some forms of output need to know the size of the data, so for 
these the attachment API should provide a way of getting that 
information without reading and buffering all the attachment data. If 
the source of the attachment data doesn't know the size, and this 
information is needed, buffering would still be required.
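
One way to express this in Java, as a sketch only: an attachment
source reports its size when it can do so without reading the data, and
the output code buffers only when the size is both needed and unknown.
AttachmentSource, ByteArrayAttachment, StreamAttachment and
OutputPlanner are hypothetical names, not part of any existing
attachment API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.OptionalLong;
import java.util.function.Supplier;

/** Hypothetical attachment abstraction: content plus an optional, cheap size. */
interface AttachmentSource {
    InputStream open();

    /** Size in bytes if known without reading the data, otherwise empty. */
    OptionalLong size();
}

/** Source whose size is known up front. */
final class ByteArrayAttachment implements AttachmentSource {
    private final byte[] data;

    ByteArrayAttachment(byte[] data) {
        this.data = data.clone();
    }

    @Override
    public InputStream open() {
        return new ByteArrayInputStream(data);
    }

    @Override
    public OptionalLong size() {
        return OptionalLong.of(data.length);
    }
}

/** Source backed by an arbitrary stream; the size is not known in advance. */
final class StreamAttachment implements AttachmentSource {
    private final Supplier<InputStream> opener;

    StreamAttachment(Supplier<InputStream> opener) {
        this.opener = opener;
    }

    @Override
    public InputStream open() {
        return opener.get();
    }

    @Override
    public OptionalLong size() {
        return OptionalLong.empty();
    }
}

final class OutputPlanner {
    private OutputPlanner() {}

    /** Buffer only when the output form needs a length and the source cannot supply one. */
    static boolean mustBuffer(AttachmentSource source, boolean outputNeedsLength) {
        return outputNeedsLength && !source.size().isPresent();
    }
}
```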

  - Dennis

Thilina Gunarathne wrote:

>Hi,
>
>I'm thinking of Buffering the stream containing the SOAP envelope only
>when someone actually refering to the attachment..
>
>  
>
>>While that's interesting in theory, whoever sent the attachment more
>>often than not expected the other end to read the darn thing. I don't
>>see the point of that potential optimization. Maybe I'm missing
>>something.
>>    
>>
>I'm also feel like this is over engineering and not sure whether it'll
>become a overhead rather than a optimisation. IMO we need to have some
>serious discussions in the maling list before doing something like
>that..That's the very reason this is avoided in the initial impl.
>Most suitable usecase in here will be the intermediaries, which
>normally do not read the attachments.. Interesting figure in here will
>be the number of usecases where people actually use attchments with
>intermediaries..
>
>~Thilina
>

Re: [Axis2] Data binding attachments support (was: CHAT LOG : 2006-03-29)

Posted by Thilina Gunarathne <cs...@gmail.com>.
Hi,

I'm thinking of buffering the stream containing the SOAP envelope only
when someone actually refers to an attachment.

> While that's interesting in theory, whoever sent the attachment more
> often than not expected the other end to read the darn thing. I don't
> see the point of that potential optimization. Maybe I'm missing
> something.
I also feel this is over-engineering, and I'm not sure whether it
would become an overhead rather than an optimisation. IMO we need some
serious discussion on the mailing list before doing something like
that. That's the very reason it was avoided in the initial impl.
The most suitable use case here is intermediaries, which
normally do not read the attachments. An interesting figure would
be the number of use cases where people actually use attachments with
intermediaries.

~Thilina

>
>


--
"May the SourcE be with u"
http://webservices.apache.org/~thilina/
http://thilinag.blogspot.com/                
http://www.bloglines.com/blog/Thilina

Re: [Axis2] Data binding attachments support (was: CHAT LOG : 2006-03-29)

Posted by Sanjiva Weerawarana <sa...@opensource.lk>.
On Fri, 2006-03-31 at 15:35 +0600, Thilina Gunarathne wrote:
> <quote Glen>
> What we want is a "thingy" which can be stored away and LATER used to
> get the real attachment data after all XML pulling is done
> </quote>
> 
> IMHO we are doing the exactly the same thing using OMTexts.. Currently
> we are doing this with more flexibility and of course with a catch..
> Flexibility is given by allowing the users to get the real attachment
> data if they wish while pulling the XML.. Catch is that we are
> buffering the SOAP Envelope in the MIME Parser which is underneath the
> StaxReader...

Thilina, I don't see this as a catch - isn't it impossible to get to any
attachments without buffering the SOAP envelope? Or are you thinking
about reading the SOAP envelope and buffering it IFF someone actually
refers to an attachment? 

While that's interesting in theory, whoever sent the attachment more
often than not expected the other end to read the darn thing. I don't
see the point of that potential optimization. Maybe I'm missing
something.

Sanjiva.


Re: [Axis2] Data binding attachments support

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
Thilina Gunarathne wrote:

>Hi Dennis,
>Following is what I understood about your proposal for MTOM after
>reading the chat log...Please correct me if I'm wrong..
>JIBX expects a raw input stream of the SOAP envelope with
>xop:include's.JIBX handles the XOP:Include internally, but it needs
>Axis2 to do the mime parsing.. The interface you expects Axis2 to
>provide an interface with a method to access the InputStream of the
>SOAP envelope together with the methods to access
>DataHandlers/inputStreams of subsequent mime parts given the
>content-id.
>  
>
Yes, though I'd say that it's up to the transport layer to sort out the 
attachments (since attachment support is a transport issue). I haven't 
looked into how this is actually implemented in Axis2.

>Also it seems that you have the misunderstanding that Axis2 will build
>the whole OM if somebody needs to access the data referred in a
>XOP:Include..
>
I'm not sure how you mean this. If you're talking about incoming 
messages and the user application accessing the data, then no, I'm 
trying to avoid Axis2 building the whole OM. If you're talking about 
messages in either direction with parts of the framework code (such as 
WS-Security) accessing the data, then yes, I think Axis2 needs to build 
the OM.

>...
>2. In the case of a MIME message (MTOM or SWA) Axis2 reads in the root
>mime part containing the envelope and stores it in a DataHandler..Then
>takes and InputStream from the DataHandler and gives it to the
>StaxReader for the normal XML processing to be done..The data in this
>mime part will be directly streamed to a file, If the the file caching
>is switched on and the size of the part is bigger than the threshold
>value.. Direct streaming of the contents of the Root part to the
>StaXReader till the user requests the subsequent mime parts would have
>been the ideal scenario. But we decided it to be a post 1.0 goal at
>that time..
>  
>
Okay, I hadn't realized the root part was always being buffered but this 
should not be a major performance hit in and of itself.

>4.References to the MIMEParser and the attachments are stored in the
>MIMEHelper class and the user can access it directly if needed..  This
>takes care of deferred parsing of Mime parts and storing the unused
>parts in Memory or temp files for future use.. Deferred parsing of
>MIME parts means that we will parse the stream only till we get the
>required mime parts..
>5. Axis2 starts processing of the headers and reads in the
>XOP:Include, which in turn would create an OMText instance with just a
>pointer to the relevant MIME part. Axis2 is not doing any actual
>readings of the relevant MIME part even at this point..
>  
>
What I'm discussing is the case where a data binding framework is 
handling the body content, so as long as the data binding framework has 
access to the MIMEHelper class everything should be fine.

>6. Somebody needs to access the data in the above created OMText and
>calls it's getDataHandler method. At this point Axis2 MIME parser goes
>ahead and read the relevant MIME part. But the OM will **not** get
>created irrespective of whether the Caching is ON or OFF.. Remaining
>InputStream for the  Envelope is buffered in a File or in memory as a
>FileDataSource or MemoryDataSource depending on the size..
>  
>
I only said the OM always gets created on output in the current code, 
due to the transport code scanning the OM to find optimizable data items.

>My suggestion to your proposal is to use the MIMEHelper to access the
>attachments. We can also add a method to access the inputStream of the
>root part without much trouble.. We are also in the process of adding
>attachment streaming capability to the MIMEHelper, which is a blocker
>for 1.0..
>
>Functionality of the MIMEHelper might get delegated to SOAPMessage and
>it's currently under discussion. Personally i prefer to have
>SOAPMessage with well defined consistent API's for attachment
>handling...
>
>
><Quote Dennis>
>The second part of this interface deals with attachments. It gives the
>SerializationTarget (which would be transport-dependent, of course) the
>control over what actually gets sent as an attachment, and provides the
>data to be output as an attachment in the form of either a stream or an
>array of bytes. This would allow us to fix the current broken output
>behavior which forces generation of a fully-expanded OM tree for every
>message being sent, just so the transport code can check for anything it
>wants to send as an attachment.
></quote>
>
>Current behaviour of traversing the tree to find whether there are any
>optimised parts is a hack till the MTOM policy comes in to play.. When
>MTOM policy is there it'll define whether to send the message inside a
>MIME envelope or not.. Till that we can use a message level flag like
>Glen suggested in the chat to avoid traversing the tree.. Also with
>with the ongoing improvements like attachments streaming ability and
>Attachment API we can avoid the above behavior even in a SOAP
>intermediary..
>  
>
I'm not sure I completely understand your response on this. The 
important issue from my standpoint is that the attachments can be added 
directly by user code (or in the main case of interest, by data binding 
code), without going through OMText or such.

>IMO the ability to decide what data to be optimised should remained on
>the hands of the user..   Even in the current impl data to be output
>as attachment is given as streams or array of bytes using
>DataHandlers.. Please provide a more detailed explanation of the inner
>workings of the above proposed way of handling attachments..
>  
>
There's a separation of concerns issue here. The data binding framework 
may know that certain types of items can potentially be handled as 
attachments, but doesn't directly know (1) whether attachments are 
supported by the transport in use, or (2) whether a particular data 
size is worth sending as an attachment. For instance, with MIME over 
HTTP sending a 1KB attachment is likely to be slower and bulkier than 
just sending the 1KB as base64-encoded text. With other transports and 
encodings this might not be the case (such as TCP/IP with block sizes). 
That's why I wanted to structure things so the transport layer could 
make the decisions.
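
A rough Java sketch of that split, with the decision held by an object
the transport supplies. AttachmentPolicy and its methods are
hypothetical names, and any threshold passed to it is an arbitrary
placeholder rather than a measured break-even point.

```java
import java.util.Base64;

/** Hypothetical per-transport policy: attachment or inline base64? */
final class AttachmentPolicy {
    private final boolean attachmentsSupported;
    private final int minAttachmentBytes;

    AttachmentPolicy(boolean attachmentsSupported, int minAttachmentBytes) {
        this.attachmentsSupported = attachmentsSupported;
        this.minAttachmentBytes = minAttachmentBytes;
    }

    /** The data binding layer asks; the transport's policy answers. */
    boolean sendAsAttachment(int dataLength) {
        return attachmentsSupported && dataLength >= minAttachmentBytes;
    }

    /** Inline form: the bytes as base64 text inside the XML. */
    static String encodeInline(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Length of padded base64 text: 4 characters per 3 bytes, rounded up. */
    static int base64Length(int rawLength) {
        return 4 * ((rawLength + 2) / 3);
    }
}
```

For scale, 1024 bytes become 1368 base64 characters inline, while a
MIME part carries the raw bytes plus a boundary line and part headers.
Where the break-even lies depends on the transport, which is the reason
for leaving the choice to it.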

  - Dennis

Re: [Axis2] Data binding attachments support (was: CHAT LOG : 2006-03-29)

Posted by Thilina Gunarathne <cs...@gmail.com>.
Hi Dennis,
The following is what I understood about your proposal for MTOM after
reading the chat log. Please correct me if I'm wrong.
JiBX expects a raw input stream of the SOAP envelope with the
xop:Include elements in place. JiBX handles the xop:Include internally,
but it needs Axis2 to do the MIME parsing. You expect Axis2 to
provide an interface with a method to access the InputStream of the
SOAP envelope, together with methods to access the
DataHandlers/InputStreams of subsequent MIME parts given the
content-id.

Also it seems that you have the misunderstanding that Axis2 will build
the whole OM if somebody needs to access the data referred in a
XOP:Include.. Let's consider a use case of Axis2 receiving a message
which refers to an MTOM attachment from a header..
1. Axis2 receives the message and looking at the content-type of the
message Axis2 decides whether it's a MTOM/SWA/pure XML.
2. In the case of a MIME message (MTOM or SWA) Axis2 reads in the root
mime part containing the envelope and stores it in a DataHandler.. It then
takes an InputStream from the DataHandler and gives it to the
StaxReader for the normal XML processing to be done.. The data in this
mime part will be streamed directly to a file if file caching
is switched on and the size of the part is bigger than the threshold
value.. Streaming the contents of the root part directly to the
StaxReader until the user requests the subsequent mime parts would have
been the ideal scenario, but we decided to make that a post-1.0 goal at
that time..
4. References to the MIMEParser and the attachments are stored in the
MIMEHelper class and the user can access it directly if needed..  This
takes care of deferred parsing of MIME parts and storing the unused
parts in memory or temp files for future use.. Deferred parsing of
MIME parts means that we will parse the stream only until we get the
required MIME parts..
5. Axis2 starts processing of the headers and reads in the
XOP:Include, which in turn would create an OMText instance with just a
pointer to the relevant MIME part. Axis2 is not doing any actual
readings of the relevant MIME part even at this point..
6. Somebody needs to access the data in the above created OMText and
calls its getDataHandler method. At this point the Axis2 MIME parser goes
ahead and reads the relevant MIME part. But the OM will **not** get
created irrespective of whether the caching is ON or OFF.. The remaining
InputStream for the envelope is buffered in a file or in memory as a
FileDataSource or MemoryDataSource depending on the size..
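
The deferred parsing described in the steps above can be sketched roughly as follows. This is not the MIMEHelper code; the class and method names are hypothetical, and an Iterator of already-split parts stands in for the real MIME stream so the example stays self-contained. The point it illustrates is that parts are consumed only up to the requested content-id, with earlier parts kept for later use.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of deferred MIME part parsing; not the Axis2 MIMEHelper. */
final class DeferredPartsSketch {
    /** One MIME part: its content-id and body bytes. */
    static final class Part {
        final String contentId;
        final byte[] body;

        Part(String contentId, byte[] body) {
            this.contentId = contentId;
            this.body = body;
        }
    }

    private final Iterator<Part> remaining;   // stands in for the unparsed stream
    private final Map<String, byte[]> cache = new LinkedHashMap<>();
    private int partsParsed = 0;

    DeferredPartsSketch(Iterator<Part> remaining) {
        this.remaining = remaining;
    }

    /** Convenience factory for building a sketch from in-memory parts. */
    static DeferredPartsSketch of(Part... parts) {
        return new DeferredPartsSketch(Arrays.asList(parts).iterator());
    }

    /** Parses forward only as far as needed; returns null if no such part exists. */
    byte[] getPart(String contentId) {
        byte[] cached = cache.get(contentId);
        if (cached != null) {
            return cached;                     // already parsed on an earlier call
        }
        while (remaining.hasNext()) {
            Part p = remaining.next();
            partsParsed++;
            cache.put(p.contentId, p.body);    // keep skipped parts for future use
            if (p.contentId.equals(contentId)) {
                return p.body;
            }
        }
        return null;
    }

    /** How many parts have been consumed from the stream so far. */
    int partsParsed() {
        return partsParsed;
    }
}
```

A real implementation would cache large parts in temp files rather than in memory, as the steps above describe; the in-memory map here is a simplification.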

My suggestion to your proposal is to use the MIMEHelper to access the
attachments. We can also add a method to access the inputStream of the
root part without much trouble.. We are also in the process of adding
attachment streaming capability to the MIMEHelper, which is a blocker
for 1.0..

Functionality of the MIMEHelper might get delegated to SOAPMessage;
that is currently under discussion. Personally I prefer to have
SOAPMessage with well defined, consistent APIs for attachment
handling...


<Quote Dennis>
The second part of this interface deals with attachments. It gives the
SerializationTarget (which would be transport-dependent, of course) the
control over what actually gets sent as an attachment, and provides the
data to be output as an attachment in the form of either a stream or an
array of bytes. This would allow us to fix the current broken output
behavior which forces generation of a fully-expanded OM tree for every
message being sent, just so the transport code can check for anything it
wants to send as an attachment.
</quote>

Current behaviour of traversing the tree to find whether there are any
optimised parts is a hack till the MTOM policy comes in to play.. When
MTOM policy is there it'll define whether to send the message inside a
MIME envelope or not.. Until then we can use a message level flag like
Glen suggested in the chat to avoid traversing the tree.. Also, with
the ongoing improvements like attachment streaming ability and the
Attachment API we can avoid the above behavior even in a SOAP
intermediary..

IMO the ability to decide what data is to be optimised should remain in
the hands of the user..   Even in the current impl, data to be output
as an attachment is given as streams or arrays of bytes using
DataHandlers.. Please provide a more detailed explanation of the inner
workings of the above proposed way of handling attachments..

<quote Glen>
I'm saying that I don't know the DH API well enough to know if we can
subclass it or whatever in order to get a "future-looking" DH.  If we
can't, we'll need a wrapper class.
</quote>
What we can do is to subclass the Data Sources.. Data Sources are the
actual data containers inside the Data Handlers.. We are already doing
this in the case of FileDataSource, which actually contains just a
file pointer inside...

<quote Glen>
What we want is a "thingy" which can be stored away and LATER used to
get the real attachment data after all XML pulling is done
</quote>

IMHO we are doing exactly the same thing using OMTexts.. Currently
we are doing this with more flexibility and of course with a catch..
The flexibility comes from allowing users to get the real attachment
data while pulling the XML, if they wish.. The catch is that we are
buffering the SOAP Envelope in the MIME Parser which is underneath the
StaxReader...

Unfortunately these days I'm busy with my final year project, which is
due in another month's time, and I hardly find any time to work on Axis2.
That is why the suggested improvements to the attachment code, like
streaming and a defined API, are getting delayed so much...

~Thilina


On 3/30/06, Dennis Sosnoski <dm...@sosnoski.com> wrote:
> For the JAXB 2.0 handling of attachments see
> http://java.sun.com/javase/6/docs/api/javax/xml/bind/attachment/AttachmentMarshaller.html
> and
> http://java.sun.com/javase/6/docs/api/javax/xml/bind/attachment/AttachmentUnmarshaller.html
> These are set on the marshaller and unmarshaller, respectively.
>
> Though now that I think about it, I wonder how the
> getAttachmentAsByteArray() is handled, given that I'd think this would
> be called during unmarshalling - is there buffering of the input stream
> going on? Kohsuke, can you comment on this?
>
>  - Dennis
>
> Glen Daniels wrote:
>
> > ...attached.  Discussed Dennis' proposal for AXIOM refactoring to
> > enable lower-level MTOM support for DB frameworks.  We got to
> > consensus on the receiving side, but didn't have a chance to finish
> > discussing the sending side, which we hope to do via email ASAP.
> >
> > --Glen
>
>


--
"May the SourcE be with u"
http://webservices.apache.org/~thilina/
http://thilinag.blogspot.com/                
http://www.bloglines.com/blog/Thilina

Re: [Axis2] Data binding attachments support

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
Ah, that makes sense. It does seem to create potential problems, though,
since it relies on the data magically appearing in the byte[] (or in the
case of an unknown size, the byte[] object itself magically appearing)
sometime after the initial unmarshalling. With a DataHandler you can at
least control access to the attachment data with internal hooks.

Is the idea that this will always be populated by the time the 
unmarshalling is completed? In other words, the code unmarshals the 
document and then processes all attachments, caching them on the file 
system if necessary.

  - Dennis

Kohsuke Kawaguchi wrote:

> Dennis Sosnoski wrote:
>
>> Though now that I think about it, I wonder how the 
>> getAttachmentAsByteArray() is handled, given that I'd think this 
>> would be called during unmarshalling - is there buffering of the 
>> input stream going on? Kohsuke, can you comment on this?
>
>
> This method is intended to be used when the data binding layer knows 
> that the data is eventually transformed into byte[].
>
> Often the code that implements AttachmentUnmarshaller knows the exact 
> size of the attachment (such is often the case with MIME attachments), 
> and if so, byte[] can be created rather efficiently (compared to having 
> the data binding layer try to build byte[] from DataHandler.)
>
> If not, then most likely you'll have to read InputStream and create 
> byte[] with reallocation of byte[] to get to the exact size, but 
> that's no worse than what the data binding layer would have done if 
> this method didn't exist.
>
> So that's how it's used.
>
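
The two cases Kohsuke describes can be sketched as below: a single exact-size allocation when the attachment size is known, and a growing buffer with a final copy when it is not. This is only an illustration using java.io; the class and method names are hypothetical and it is not the JAXB reference implementation of getAttachmentAsByteArray().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper illustrating known-size vs unknown-size reads. */
final class AttachmentBytesSketch {
    private AttachmentBytesSketch() {
    }

    /** Known size: allocate the byte[] once and fill it directly. */
    static byte[] readKnownSize(InputStream in, int size) throws IOException {
        byte[] out = new byte[size];
        int off = 0;
        while (off < size) {
            int n = in.read(out, off, size - off);
            if (n < 0) {
                throw new IOException("stream ended after " + off + " of " + size + " bytes");
            }
            off += n;
        }
        return out;
    }

    /** Unknown size: grow a buffer while reading, then copy to the exact size. */
    static byte[] readUnknownSize(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);            // internal array is reallocated as needed
        }
        return buf.toByteArray();              // final copy trimmed to the exact length
    }
}
```

Either way the whole attachment ends up in memory, so the stream has to be consumed (or buffered elsewhere) before the byte[] is complete.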

[Axis2] Data binding attachments support (was: CHAT LOG : 2006-03-29)

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
For the JAXB 2.0 handling of attachments see 
http://java.sun.com/javase/6/docs/api/javax/xml/bind/attachment/AttachmentMarshaller.html 
and 
http://java.sun.com/javase/6/docs/api/javax/xml/bind/attachment/AttachmentUnmarshaller.html 
These are set on the marshaller and unmarshaller, respectively.

Though now that I think about it, I wonder how the 
getAttachmentAsByteArray() is handled, given that I'd think this would 
be called during unmarshalling - is there buffering of the input stream 
going on? Kohsuke, can you comment on this?

  - Dennis

Glen Daniels wrote:

> ...attached.  Discussed Dennis' proposal for AXIOM refactoring to 
> enable lower-level MTOM support for DB frameworks.  We got to 
> consensus on the receiving side, but didn't have a chance to finish 
> discussing the sending side, which we hope to do via email ASAP.
>
> --Glen