Posted to j-users@xerces.apache.org by Jacob Kjome <ho...@visi.com> on 2006/12/13 21:05:57 UTC

VTD-XML thoughts?


I noticed an article on TheServerSide [1] about VTD-XML [2].  I'm curious what
the Xerces team thinks about it.  It looks like performance and a small memory
footprint are priority #1 [3][4].  I'm not sure how good they are about
correctness.  Is VTD-XML a niche product, or could its approach provide the
way forward for XML parsers?  Are they competition for Xerces or a performant,
but limited, alternative?

Jake

[1] http://www.theserverside.com/news/thread.tss?thread_id=43432
[2] http://vtd-xml.sourceforge.net/
[3] http://vtd-xml.sourceforge.net/benchmark.html
[4] http://vtd-xml.sourceforge.net/benchmark2.html


---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org


Re: VTD-XML thoughts?

Posted by ke...@us.ibm.com.
How does this differ from all the (many!) other attempts to come up with a
less verbose mapping of XML content? When that's been tried in the past, it
has generally turned out to be not much more efficient than just processing
compressed XML, sometimes less so... except in those cases where the
authors subsetted XML, which further breaks the ability to claim that it is
still XML rather than a custom data representation inspired by XML.

______________________________________
"... Three things see no end: A loop with exit code done wrong,
A semaphore untested, And the change that comes along. ..."
  -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish
(http://www.ovff.org/pegasus/songs/threes-rev-11.html)

Re: VTD-XML thoughts?

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
"Eric J. Schwarzenbach" <Er...@wrycan.com> wrote on 12/27/2006 02:03:28 PM:

> Isn't a billion characters far more than you could ever load into a DOM
> in any practical environment?

In general, yes. However, if you know in advance which regions of the
document actually require random access and which parts of the document
you will never visit, it may be possible to keep the memory usage low by
using a filter [1].
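
A minimal sketch of that idea, using only the DOM Level 3 Load and Save
interfaces that ship with the JDK. The class name FilteredLoad, the filter
RejectByName and the element name passed in are made up for illustration;
they are assumptions, not anything from Xerces itself. The filter rejects a
named element in startElement, so that element and its subtree are never
added to the tree. Note that the parser still has to read the rejected
region, so this saves DOM memory rather than parsing work.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

class FilteredLoad {
    /** Hypothetical filter: rejects every element with the given name. */
    static final class RejectByName implements LSParserFilter {
        private final String rejectedName;

        RejectByName(String rejectedName) {
            this.rejectedName = rejectedName;
        }

        @Override
        public short startElement(Element element) {
            // Called when the start tag is seen, before any children are built.
            return rejectedName.equals(element.getNodeName()) ? FILTER_REJECT : FILTER_ACCEPT;
        }

        @Override
        public short acceptNode(Node node) {
            // Called for completed nodes; everything not rejected earlier is kept.
            return FILTER_ACCEPT;
        }

        @Override
        public int getWhatToShow() {
            return NodeFilter.SHOW_ELEMENT;
        }
    }

    /** Parses xml into a DOM that omits every subtree rooted at rejectedName. */
    static Document parse(String xml, String rejectedName) {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls = (DOMImplementationLS) registry.getDOMImplementation("LS");
            LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
            parser.setFilter(new RejectByName(rejectedName));
            LSInput input = ls.createLSInput();
            input.setStringData(xml);
            return parser.parse(input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no DOM Level 3 LS implementation available", e);
        }
    }
}
```

Calling FilteredLoad.parse(xml, "bulk") on a document with a large <bulk>
section would return a tree that contains the rest of the document but no
<bulk> element or descendants.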

> The question I find more interesting than that of the feasibility of VTD as a
> general DOM / Xerces competitor is whether some of the underlying
> techniques [1] they use in processing the XML could be used to make a more
> performant and scalable DOM parser. In their FAQ they suggest no [2],
> but they seem to assume that the whole DOM object structure needs to be
> created up front, rather than just parts of it lazily as they are actually used.

I understand Apache AXIOM [2] (which is an XML object model used by Axis2)
will lazily construct its object tree by pulling document events from the
parser as needed. There's no reason you couldn't do the same (or even
better) with DOM.
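
To illustrate the general technique (this is not AXIOM's API, and not how
Xerces builds a DOM), here is a rough sketch on top of the JDK's StAX pull
parser. LazyTree, Node, child and eventsPulled are hypothetical names chosen
for the example. The tree only reads further into the stream when a caller
asks for a child that has not been materialized yet. Text, attributes and
namespaces are ignored to keep it short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class LazyTree {
    /** An element whose children are built only when somebody asks for them. */
    final class Node {
        final String name;
        final Node parent;
        final List<Node> children = new ArrayList<>();
        boolean complete; // true once this element's end tag has been read

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        /** Returns the child element at index, or null if there is none. */
        Node child(int index) {
            // Pull events only until the requested child exists or this
            // element is known to have no more children.
            while (children.size() <= index && !complete) {
                pull();
            }
            return index < children.size() ? children.get(index) : null;
        }
    }

    private XMLStreamReader reader;
    private Node open;  // innermost element whose end tag has not been read
    private Node root;
    int eventsPulled;   // parser events consumed after the root start tag

    LazyTree(String xml) {
        try {
            reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip anything before the root element
            }
            root = new Node(reader.getLocalName(), null);
            open = root;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    Node root() {
        return root;
    }

    /** Consumes exactly one parser event and updates the partial tree. */
    private void pull() {
        try {
            int event = reader.next();
            eventsPulled++;
            if (event == XMLStreamConstants.START_ELEMENT) {
                Node node = new Node(reader.getLocalName(), open);
                open.children.add(node);
                open = node;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                open.complete = true;
                open = open.parent;
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this, new LazyTree(xml).root().child(0) reads only as far as the first
child's start tag; the rest of the document stays unparsed until some later
call needs it. A lazy DOM would have to do the same bookkeeping behind the
org.w3c.dom interfaces.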

> Eric

[1] http://www.w3.org/TR/2004/REC-DOM-Level-3-LS-20040407/load-save.html#LS-LSParserFilter
[2] http://ws.apache.org/commons/axiom/index.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org



Re: VTD-XML thoughts?

Posted by "Eric J. Schwarzenbach" <Er...@wrycan.com>.
Isn't a billion characters far more than you could ever load into a DOM
in any practical environment?

The question I find more interesting than that of the feasibility of VTD as a
general DOM / Xerces competitor is whether some of the underlying
techniques [1] they use in processing the XML could be used to make a more
performant and scalable DOM parser. In their FAQ they suggest no [2],
but they seem to assume that the whole DOM object structure needs to be
created up front, rather than just parts of it lazily as they are actually used.

Eric

[1] http://vtd-xml.sourceforge.net/technical/0.html
[2] http://vtd-xml.sourceforge.net/faq.html#Is%20there%20a%20plan%20to%20offer%20DOM%20interface%20over%20VTD



Michael Glavassevich wrote:
> From what I've read [1] about it, it isn't a conforming XML parser. It 
> apparently doesn't support DTDs, external entities, documents larger than 
> a billion characters and who knows what else. Sounds like a niche product 
> to me.
>
> [1] http://www.cafeconleche.org/oldnews/news2006April13.html


Re: VTD-XML thoughts?

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
From what I've read [1] about it, it isn't a conforming XML parser. It 
apparently doesn't support DTDs, external entities, documents larger than 
a billion characters and who knows what else. Sounds like a niche product 
to me.

[1] http://www.cafeconleche.org/oldnews/news2006April13.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

Jacob Kjome <ho...@visi.com> wrote on 12/13/2006 03:05:57 PM:

> I noticed an article on TheServerSide [1] about VTD-XML [2].  I'm curious what
> the Xerces team thinks about it.  It looks like performance and a small memory
> footprint are priority #1 [3][4].  I'm not sure how good they are about
> correctness.  Is VTD-XML a niche product, or could its approach provide the
> way forward for XML parsers?  Are they competition for Xerces or a performant,
> but limited, alternative?
> 
> Jake
> 
> [1] http://www.theserverside.com/news/thread.tss?thread_id=43432
> [2] http://vtd-xml.sourceforge.net/
> [3] http://vtd-xml.sourceforge.net/benchmark.html
> [4] http://vtd-xml.sourceforge.net/benchmark2.html