Posted to java-dev@axis.apache.org by Mark Pimentel <ma...@gmail.com> on 2005/08/16 20:55:05 UTC

[Axis2] Google SoC - DFDL Advice

Hi everyone,

I have been looking into DFDL for the second half of the project, but
am not too clear on the steps necessary to move forward.  Hopefully
someone can offer advice or guidance.

From the examples I've seen, DFDL seems to work by taking a non-XML
file and generating two files from it: The first is an XML file that
tags the strings, numbers, and other parts of the non-XML file. The
second is a DFDL description file (.xsd) that contains the structure
and representation of the information in the XML file. This may seem
like doing a lot, but by specifying structure in this external file,
it is easier to break down large, repetitive areas of information into
concise XML chunks.

(See example DFDL .xml and .xsd files posted at
http://wiki.apache.org/ws/SummerOfCode/2005/binarySerialization/14)

However, I am not quite clear on what I should be coding to implement
this.  Should I be taking non-structured content and generating these
two DFDL files from them?  Also, what kind of test data should I be
working with?  I am not too familiar with AXIOM, but my understanding
was that the information to be serialized would already be in XML.  Is
this the right approach?  Any advice?

Thanks for your help!
Mark
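
To make the description-driven idea in the message above concrete, here is a minimal Java sketch. It is not DFDL syntax and uses no DFDL library: the LAYOUT table, the field names, and the class name are hypothetical stand-ins for what a real description file would carry. It only shows how an external description of structure (name, kind, byte length) can drive the conversion of a non-XML record into tagged XML. In this sketch the description is written by hand and the XML is derived from it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class DescriptionDrivenDemo {
    // Hypothetical field description: element name, kind ("int" or "text"),
    // and length in bytes. A real DFDL description would live in an .xsd file.
    static final String[][] LAYOUT = {
        {"id", "int", "4"},
        {"name", "text", "5"},
    };

    // Builds a 9-byte sample record: a big-endian int followed by 5 ASCII bytes.
    static byte[] sampleRecord() {
        return ByteBuffer.allocate(9)
            .putInt(42)
            .put("Alice".getBytes(StandardCharsets.US_ASCII))
            .array();
    }

    // Walks the binary record field by field, as dictated by LAYOUT,
    // and wraps each decoded value in an element named after the field.
    // Values are not XML-escaped; this is an illustration only.
    static String toXml(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record); // big-endian by default
        StringBuilder xml = new StringBuilder("<record>");
        for (String[] field : LAYOUT) {
            String name = field[0];
            int length = Integer.parseInt(field[2]);
            String value;
            if ("int".equals(field[1])) {
                value = Integer.toString(buf.getInt()); // assumes a 4-byte int
            } else {
                byte[] raw = new byte[length];
                buf.get(raw);
                value = new String(raw, StandardCharsets.US_ASCII);
            }
            xml.append('<').append(name).append('>')
               .append(value)
               .append("</").append(name).append('>');
        }
        return xml.append("</record>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleRecord()));
    }
}
```

The point of keeping LAYOUT separate from the data is that the same description can be reused for every record of that shape, which is the benefit the message attributes to the external description file.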

Re: [Axis2] Google SoC - DFDL Advice

Posted by Mark Pimentel <ma...@gmail.com>.
Chinthaka, Dennis,

Sure, got it.  I'll continue looking at the XBIS code and let you know
what I can do.

Thanks,
Mark

On 8/25/05, Eran Chinthaka <ch...@opensource.lk> wrote:
> Oops, Dennis, sorry for the confusion. My fault.
> 
> Mark, here we go....
> 
> -- EC
> 
> Dennis Sosnoski wrote:
> 
> > I think there was some confusion over this, Chinthaka. I said in the
> > chat that I'd be implementing StAX support for JiBX, and that I'd try
> > JibxSoap support for XBIS - not that I'd add StAX support for XBIS.
> > I'll still probably add StAX support for XBIS at some point if Mark
> > isn't able to do it, but it's definitely not a priority for me at this
> > time.
> >
> >  - Dennis
> >
> > Eran Chinthaka wrote:
> >
> >> Hi Mark,
> >>
> >> Today, in our weekly chat, Dennis expressed his willingness to
> >> integrate XBIS binary stuff with Axis2, so I think you don't need to
> >> do that. Anyway, Dennis is the one who wrote it, so he knows the ins
> >> and outs of XBIS better than anyone.
> >> But at the same time, the Axis2 team wanted to have some other
> >> implementation of binary stuff in Axis2, so you still have some room
> >> to contribute.
> >>
> >> -- Chinthaka
> >>
> >> Mark Pimentel wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> Chinthaka and I will likely move forward with trying to implement a
> >>> StAX parser for Dennis' XBIS format for Axis2, instead of sticking
> >>> with DFDL.  Does that sound reasonable to everyone?
> >>>
> >>> Thanks,
> >>> Mark
> >>>
> >>> On 8/16/05, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
> >>>
> >>>
> >>>> On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
> >>>>
> >>>>
> >>>>> On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
> >>>>>
> >>>>>
> >>>>>> I'll just refer back to our earlier discussion on this topic:
> >>>>>> http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
> >>>>>>
> >>>>>>  - Dennis
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> +1 to using some XML Infoset binary serialization format rather than
> >>>>> DFDL.
> >>>>>
> >>>>
> >>>> I forgot to mention .. I would like to introduce a typed pull API
> >>>> (extending StAX) into Axis2 (and implement it the obvious way). When
> >>>> data binding is in place and if the data is being serialized, for
> >>>> non-string data this can make a big difference. A common use case is
> >>>> using Axis2 to deal with large numerical data sets .. if we go with a
> >>>> binary typed StAX approach, there's no intrinsic reason that should be
> >>>> any slower or more memory intensive than any other binary protocol
> >>>> approach.
> >>>>
> >>>> After v1.0, of course ;-).
> >>>>
> >>>> Sanjiva.

Re: [Axis2] Google SoC - DFDL Advice

Posted by Eran Chinthaka <ch...@opensource.lk>.
Oops, Dennis, sorry for the confusion. My fault.

Mark, here we go....

-- EC

Dennis Sosnoski wrote:

> I think there was some confusion over this, Chinthaka. I said in the 
> chat that I'd be implementing StAX support for JiBX, and that I'd try 
> JibxSoap support for XBIS - not that I'd add StAX support for XBIS. 
> I'll still probably add StAX support for XBIS at some point if Mark 
> isn't able to do it, but it's definitely not a priority for me at this 
> time.
>
>  - Dennis
>
> Eran Chinthaka wrote:
>
>> Hi Mark,
>>
>> Today, in our weekly chat, Dennis expressed his willingness to
>> integrate XBIS binary stuff with Axis2, so I think you don't need to
>> do that. Anyway, Dennis is the one who wrote it, so he knows the ins
>> and outs of XBIS better than anyone.
>> But at the same time, the Axis2 team wanted to have some other
>> implementation of binary stuff in Axis2, so you still have some room
>> to contribute.
>>
>> -- Chinthaka
>>
>> Mark Pimentel wrote:
>>
>>> Hi everyone,
>>>
>>> Chinthaka and I will likely move forward with trying to implement a
>>> StAX parser for Dennis' XBIS format for Axis2, instead of sticking
>>> with DFDL.  Does that sound reasonable to everyone?
>>>
>>> Thanks,
>>> Mark
>>>
>>> On 8/16/05, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
>>>  
>>>
>>>> On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
>>>>   
>>>>
>>>>> On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
>>>>>     
>>>>>
>>>>>> I'll just refer back to our earlier discussion on this topic:
>>>>>> http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
>>>>>>
>>>>>>  - Dennis
>>>>>>
>>>>>>       
>>>>>
>>>>> +1 to using some XML Infoset binary serialization format rather than
>>>>> DFDL.
>>>>>     
>>>>
>>>> I forgot to mention .. I would like to introduce a typed pull API
>>>> (extending StAX) into Axis2 (and implement it the obvious way). When
>>>> data binding is in place and if the data is being serialized, for
>>>> non-string data this can make a big difference. A common use case is
>>>> using Axis2 to deal with large numerical data sets .. if we go with a
>>>> binary typed StAX approach, there's no intrinsic reason that should be
>>>> any slower or more memory intensive than any other binary protocol
>>>> approach.
>>>>
>>>> After v1.0, of course ;-).
>>>>
>>>> Sanjiva.


Re: [Axis2] Google SoC - DFDL Advice

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
I think there was some confusion over this, Chinthaka. I said in the 
chat that I'd be implementing StAX support for JiBX, and that I'd try 
JibxSoap support for XBIS - not that I'd add StAX support for XBIS. I'll 
still probably add StAX support for XBIS at some point if Mark isn't 
able to do it, but it's definitely not a priority for me at this time.

  - Dennis

Eran Chinthaka wrote:

> Hi Mark,
>
> Today, in our weekly chat, Dennis expressed his willingness to
> integrate XBIS binary stuff with Axis2, so I think you don't need to
> do that. Anyway, Dennis is the one who wrote it, so he knows the ins
> and outs of XBIS better than anyone.
> But at the same time, the Axis2 team wanted to have some other
> implementation of binary stuff in Axis2, so you still have some room
> to contribute.
>
> -- Chinthaka
>
> Mark Pimentel wrote:
>
>>Hi everyone,
>>
>>Chinthaka and I will likely move forward with trying to implement a
>>StAX parser for Dennis' XBIS format for Axis2, instead of sticking
>>with DFDL.  Does that sound reasonable to everyone?
>>
>>Thanks,
>>Mark
>>
>>On 8/16/05, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
>>  
>>
>>>On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
>>>    
>>>
>>>>On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
>>>>      
>>>>
>>>>>I'll just refer back to our earlier discussion on this topic:
>>>>>http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
>>>>>
>>>>>  - Dennis
>>>>>
>>>>>        
>>>>>
>>>>+1 to using some XML Infoset binary serialization format rather than
>>>>DFDL.
>>>>      
>>>>
>>>I forgot to mention .. I would like to introduce a typed pull API
>>>(extending StAX) into Axis2 (and implement it the obvious way). When
>>>data binding is in place and if the data is being serialized, for
>>>non-string data this can make a big difference. A common use case is
>>>using Axis2 to deal with large numerical data sets .. if we go with a
>>>binary typed StAX approach, there's no intrinsic reason that should be
>>>any slower or more memory intensive than any other binary protocol approach.
>>>
>>>After v1.0, of course ;-).
>>>
>>>Sanjiva.

Re: [Axis2] Google SoC - DFDL Advice

Posted by Eran Chinthaka <ch...@opensource.lk>.
Hi Mark,

Today, in our weekly chat, Dennis expressed his willingness to integrate
XBIS binary stuff with Axis2, so I think you don't need to do that. Anyway,
Dennis is the one who wrote it, so he knows the ins and outs of XBIS better
than anyone.
But at the same time, the Axis2 team wanted to have some other
implementation of binary stuff in Axis2, so you still have some room to
contribute.

-- Chinthaka

Mark Pimentel wrote:

>Hi everyone,
>
>Chinthaka and I will likely move forward with trying to implement a
>StAX parser for Dennis' XBIS format for Axis2, instead of sticking
>with DFDL.  Does that sound reasonable to everyone?
>
>Thanks,
>Mark
>
>On 8/16/05, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
>  
>
>>On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
>>    
>>
>>>On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
>>>      
>>>
>>>>I'll just refer back to our earlier discussion on this topic:
>>>>http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
>>>>
>>>>  - Dennis
>>>>
>>>>        
>>>>
>>>+1 to using some XML Infoset binary serialization format rather than
>>>DFDL.
>>>      
>>>
>>I forgot to mention .. I would like to introduce a typed pull API
>>(extending StAX) into Axis2 (and implement it the obvious way). When
>>data binding is in place and if the data is being serialized, for
>>non-string data this can make a big difference. A common use case is
>>using Axis2 to deal with large numerical data sets .. if we go with a
>>binary typed StAX approach, there's no intrinsic reason that should be
>>any slower or more memory intensive than any other binary protocol approach.
>>
>>After v1.0, of course ;-).
>>
>>Sanjiva.

Re: [Axis2] Google SoC - DFDL Advice

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
Sounds good to me.  :-)  I just checked in changes I've had lying
around on my local system, which fixed some obscure namespace issues
found by a company that was implementing XBIS for use with their XML
product (they've since been bought by Sun, so I suppose they'll switch
to the ASN.1 Infoset encoding...).

I can provide some limited help when you run into problems - you can just
email me directly to discuss.

  - Dennis

Mark Pimentel wrote:

>Hi everyone,
>
>Chinthaka and I will likely move forward with trying to implement a
>StAX parser for Dennis' XBIS format for Axis2, instead of sticking
>with DFDL.  Does that sound reasonable to everyone?
>
>Thanks,
>Mark
>
>On 8/16/05, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
>  
>
>>On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
>>    
>>
>>>On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
>>>      
>>>
>>>>I'll just refer back to our earlier discussion on this topic:
>>>>http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
>>>>
>>>>  - Dennis
>>>>
>>>>        
>>>>
>>>+1 to using some XML Infoset binary serialization format rather than
>>>DFDL.
>>>      
>>>
>>I forgot to mention .. I would like to introduce a typed pull API
>>(extending StAX) into Axis2 (and implement it the obvious way). When
>>data binding is in place and if the data is being serialized, for
>>non-string data this can make a big difference. A common use case is
>>using Axis2 to deal with large numerical data sets .. if we go with a
>>binary typed StAX approach, there's no intrinsic reason that should be
>>any slower or more memory intensive than any other binary protocol approach.
>>
>>After v1.0, of course ;-).
>>
>>Sanjiva.

Re: [Axis2] Google SoC - DFDL Advice

Posted by Mark Pimentel <ma...@gmail.com>.
Hi everyone,

Chinthaka and I will likely move forward with trying to implement a
StAX parser for Dennis' XBIS format for Axis2, instead of sticking
with DFDL.  Does that sound reasonable to everyone?

Thanks,
Mark
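
As a rough picture of what "a StAX parser for a binary format" involves, here is a minimal Java sketch of a pull reader over a toy token stream. The encoding is invented purely for illustration (1 = start tag, 2 = text, 3 = end tag, strings prefixed by a one-byte length) and has nothing to do with XBIS's actual format; ToyBinaryPullReader and its methods are hypothetical and implement only a small subset of what javax.xml.stream.XMLStreamReader requires. The event codes come from the real XMLStreamConstants interface.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLStreamConstants;

class ToyBinaryPullDemo {
    // Builds the toy encoding of <a>hi</a>.
    static byte[] sample() {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put((byte) 1); putString(buf, "a");  // start tag
        buf.put((byte) 2); putString(buf, "hi"); // text
        buf.put((byte) 3);                       // end tag
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    // Writes a one-byte length followed by the UTF-8 bytes (strings under 256 bytes).
    private static void putString(ByteBuffer buf, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        buf.put((byte) utf8.length).put(utf8);
    }

    // Pulls events one at a time and rebuilds a textual form of the document.
    static String describe(byte[] data) {
        ToyBinaryPullReader reader = new ToyBinaryPullReader(data);
        StringBuilder log = new StringBuilder();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                log.append('<').append(reader.getLocalName()).append('>');
            } else if (event == XMLStreamConstants.CHARACTERS) {
                log.append(reader.getText());
            } else {
                log.append("</").append(reader.getLocalName()).append('>');
            }
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(sample()));
    }
}

// Hypothetical pull reader: decodes one token per call to next().
class ToyBinaryPullReader {
    private final ByteBuffer in;
    private final Deque<String> open = new ArrayDeque<>(); // currently open elements
    private String localName;
    private String text;

    ToyBinaryPullReader(byte[] data) { in = ByteBuffer.wrap(data); }

    boolean hasNext() { return in.hasRemaining(); }

    int next() {
        byte token = in.get();
        switch (token) {
            case 1:
                localName = readString();
                open.push(localName);
                return XMLStreamConstants.START_ELEMENT;
            case 2:
                text = readString();
                return XMLStreamConstants.CHARACTERS;
            case 3:
                localName = open.pop(); // end tags carry no name in this toy format
                return XMLStreamConstants.END_ELEMENT;
            default:
                throw new IllegalStateException("unknown token " + token);
        }
    }

    String getLocalName() { return localName; }
    String getText() { return text; }

    private String readString() {
        byte[] utf8 = new byte[in.get() & 0xFF];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

A real implementation would also have to cover namespaces, attributes, and the rest of the XMLStreamReader contract, so this should be read as a shape of the work rather than a plan for it.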

On 8/16/05, Sanjiva Weerawarana <sa...@opensource.lk> wrote:
> On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
> > On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
> > > I'll just refer back to our earlier discussion on this topic:
> > > http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
> > >
> > >   - Dennis
> > >
> >
> > +1 to using some XML Infoset binary serialization format rather than
> > DFDL.
> 
> I forgot to mention .. I would like to introduce a typed pull API
> (extending StAX) into Axis2 (and implement it the obvious way). When
> data binding is in place and if the data is being serialized, for
> non-string data this can make a big difference. A common use case is
> using Axis2 to deal with large numerical data sets .. if we go with a
> binary typed StAX approach, there's no intrinsic reason that should be
> any slower or more memory intensive than any other binary protocol approach.
> 
> After v1.0, of course ;-).
> 
> Sanjiva.

Re: [Axis2] Google SoC - DFDL Advice

Posted by Sanjiva Weerawarana <sa...@opensource.lk>.
On Wed, 2005-08-17 at 07:01 +0600, Sanjiva Weerawarana wrote:
> On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
> > I'll just refer back to our earlier discussion on this topic: 
> > http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
> > 
> >   - Dennis
> > 
> 
> +1 to using some XML Infoset binary serialization format rather than
> DFDL.

I forgot to mention .. I would like to introduce a typed pull API
(extending StAX) into Axis2 (and implement it the obvious way). When
data binding is in place and if the data is being serialized, for
non-string data this can make a big difference. A common use case is
using Axis2 to deal with large numerical data sets .. if we go with a
binary typed StAX approach, there's no intrinsic reason that should be
any slower or more memory intensive than any other binary protocol approach.

After v1.0, of course ;-).

Sanjiva.
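
One possible reading of the "typed pull API (extending StAX)" idea above, as a minimal Java sketch. TypedStreamReader and its getElementTextAsInt / getElementTextAsDouble methods are hypothetical names, not part of StAX or Axis2; XMLInputFactory, XMLStreamReader, XMLStreamConstants, and StreamReaderDelegate are the standard javax.xml.stream classes. This text-backed version still parses characters into numbers; the assumption behind the proposal is that a binary-backed reader could hand back the stored value without that conversion.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

class TypedPullDemo {
    // Sums the content of every <v> element using the typed accessor.
    static double sum(String xml) {
        try {
            TypedStreamReader reader = new TypedStreamReader(
                XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml)));
            double total = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "v".equals(reader.getLocalName())) {
                    total += reader.getElementTextAsDouble();
                }
            }
            reader.close();
            return total;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sum("<data><v>1.5</v><v>2.5</v><v>4</v></data>"));
    }
}

// Hypothetical typed extension of the standard pull reader.
class TypedStreamReader extends StreamReaderDelegate {
    TypedStreamReader(XMLStreamReader reader) {
        super(reader);
    }

    // Reads the current element's text and converts it to an int.
    int getElementTextAsInt() throws XMLStreamException {
        return Integer.parseInt(getElementText().trim());
    }

    // Reads the current element's text and converts it to a double.
    double getElementTextAsDouble() throws XMLStreamException {
        return Double.parseDouble(getElementText().trim());
    }
}
```

Callers that only see the typed methods would not need to change if the underlying reader were later swapped for one that decodes a binary stream.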



Re: [Axis2] Google SoC - DFDL Advice

Posted by Sanjiva Weerawarana <sa...@opensource.lk>.
On Tue, 2005-08-16 at 12:16 -0700, Dennis Sosnoski wrote:
> I'll just refer back to our earlier discussion on this topic: 
> http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)
> 
>   - Dennis
> 

+1 to using some XML Infoset binary serialization format rather than
DFDL.

Sanjiva.



Re: [Axis2] Google SoC - DFDL Advice

Posted by Dennis Sosnoski <dm...@sosnoski.com>.
I'll just refer back to our earlier discussion on this topic: 
http://marc.theaimsgroup.com/?l=axis-dev&m=112131569306784&w=2 ;-)

  - Dennis

Mark Pimentel wrote:

>Hi everyone,
>
>I have been looking into DFDL for the second half of the project, but
>am not too clear on the steps necessary to move forward.  Hopefully
>someone can offer advice or guidance.
>
>From the examples I've seen, DFDL seems to work by taking a non-XML
>file and generating two files from it: The first is an XML file that
>tags the strings, numbers, and other parts of the non-XML file. The
>second is a DFDL description file (.xsd) that contains the structure
>and representation of the information in the XML file. This may seem
>like doing a lot, but by specifying structure in this external file,
>it is easier to break down large, repetitive areas of information into
>concise XML chunks.
>
>(See example DFDL .xml and .xsd files posted at
>http://wiki.apache.org/ws/SummerOfCode/2005/binarySerialization/14)
>
>However, I am not quite clear on what I should be coding to implement
>this.  Should I be taking non-structured content and generating these
>two DFDL files from them?  Also, what kind of test data should I be
>working with?  I am not too familiar with AXIOM, but my understanding
>was that the information to be serialized would already be in XML.  Is
>this the right approach?  Any advice?
>
>Thanks for your help!
>Mark