Posted to c-users@xerces.apache.org by Rob Cameron <ro...@international-characters.com> on 2013/01/27 17:18:33 UTC

Xerces Performance Acceleration Project: icXML

icXML is the name of our project to dramatically accelerate
Xerces performance on modern commodity processors by taking
advantage of SIMD and multicore capabilities and parallel bit
stream technology.
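
For readers unfamiliar with the term, the idea behind parallel bit streams can be sketched roughly as follows. This is an illustrative Python toy and not icXML code: the function names (basis_bit_streams, match_byte, positions) are hypothetical, and arbitrary-precision integers stand in for SIMD registers. Each of the 8 bit positions of the input bytes is gathered into its own stream, and a character such as '<' is then located with bitwise logic applied to all input positions at once.

```python
def basis_bit_streams(data: bytes) -> list:
    """Transpose bytes into 8 bit streams; bit i of stream k is bit k of data[i]."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_byte(streams: list, length: int, target: int) -> int:
    """Return a bit stream with bit i set wherever data[i] == target."""
    all_ones = (1 << length) - 1
    result = all_ones
    for k in range(8):
        if (target >> k) & 1:
            result &= streams[k]              # this bit must be 1
        else:
            result &= ~streams[k] & all_ones  # this bit must be 0
    return result


def positions(stream: int) -> list:
    """List the indices of the set bits in a bit stream."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


if __name__ == "__main__":
    text = b"<a><b/></a>"
    streams = basis_bit_streams(text)
    left_angle = match_byte(streams, len(text), ord("<"))
    print(positions(left_angle))  # positions of '<' in the input
```

The point of the sketch is only that one bitwise operation covers every input position in a stream; a real implementation would work on fixed-width SIMD registers rather than Python integers.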

We would welcome feedback and expressions of interest from developers
and potential users.   We are also looking for a potential
Champion who could help put us on track to become an official
Xerces subproject.

Version 0.8 of icXML has been released and is available together
with a development version on the costar.sfu.ca server.
svn co http://parabix.costar.sfu.ca/svn/icXML/icXML-0.8
svn co http://parabix.costar.sfu.ca/svn/icXML/icXML-devel
Trac browser: http://parabix.costar.sfu.ca/browser/icXML

To get an idea of the performance prospects, here are end-to-end
figures using Xerces-C 3.1.1 and icXML with a GML-to-SVG conversion
application.

Xerces-C 3.1.1

 Performance counter stats for './gml2svg_3_1_1 ../../data/layer/gml-10 out_3':

    24,444,713,630 instructions:u            #    1.83  insns per cycle         [83.35%]
    13,344,529,298 cycles:u                  #    0.000 GHz                     [83.33%]
        41,915,991 branch-misses:u           #    0.70% of all branches         [83.33%]
     6,013,112,976 branches:u                                                   [83.34%]
        81,290,233 L1-dcache-misses:u                                           [83.33%]
       153,198,046 L1-icache-misses:u                                           [66.73%]

       3.764054961 seconds time elapsed

icXML

 Performance counter stats for './gml2svg_icx ../../data/layer/gml-10 out_3':

    16,470,263,948 instructions:u            #    1.89  insns per cycle         [83.33%]
     8,707,613,130 cycles:u                  #    0.000 GHz                     [83.33%]
        13,912,341 branch-misses:u           #    0.43% of all branches         [83.35%]
     3,244,282,034 branches:u                                                   [83.33%]
        67,380,609 L1-dcache-misses:u                                           [83.33%]
        32,141,837 L1-icache-misses:u                                           [66.66%]

       2.554010404 seconds time elapsed

icXML experimental version with 2-thread pipeline parallelism

 Performance counter stats for './gml2svg_icx_pipeline ../../data/layer/gml-10 out_3':

    16,544,368,151 instructions:u            #    1.37  insns per cycle         [84.11%]
    12,060,226,476 cycles:u                  #    0.000 GHz                     [83.93%]
        13,212,826 branch-misses:u           #    0.39% of all branches         [83.92%]
     3,357,152,226 branches:u                                                   [83.78%]
        77,941,092 L1-dcache-misses:u                                           [83.10%]
        25,757,287 L1-icache-misses:u                                           [67.36%]

       2.180680680 seconds time elapsed
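
To put the three runs side by side, the elapsed times and instruction counts reported above can be turned into ratios. The short Python sketch below does only that arithmetic; the dictionary layout and the helper names (speedup, instruction_reduction) are made up for illustration, and the numbers are copied from the perf output in this message.

```python
# User-mode counts and elapsed times copied from the perf output above.
runs = {
    "Xerces-C 3.1.1": {
        "instructions": 24_444_713_630,
        "cycles": 13_344_529_298,
        "seconds": 3.764054961,
    },
    "icXML": {
        "instructions": 16_470_263_948,
        "cycles": 8_707_613_130,
        "seconds": 2.554010404,
    },
    "icXML 2-thread pipeline": {
        "instructions": 16_544_368_151,
        "cycles": 12_060_226_476,
        "seconds": 2.180680680,
    },
}


def speedup(baseline: dict, candidate: dict) -> float:
    """Ratio of elapsed times; values above 1 mean the candidate is faster."""
    return baseline["seconds"] / candidate["seconds"]


def instruction_reduction(baseline: dict, candidate: dict) -> float:
    """Fraction of the baseline's instructions that the candidate avoids."""
    return 1 - candidate["instructions"] / baseline["instructions"]


if __name__ == "__main__":
    base = runs["Xerces-C 3.1.1"]
    for name, run in runs.items():
        if run is base:
            continue
        print(f"{name}: {speedup(base, run):.2f}x faster end to end, "
              f"{instruction_reduction(base, run):.1%} fewer instructions")
```

On these figures the single-threaded icXML run comes out at roughly 1.47x the speed of Xerces-C 3.1.1 and the experimental pipeline run at roughly 1.73x, with about a third fewer instructions retired in both cases.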



Rob Cameron,
CTO, International Characters, Inc
Professor of Computing Science, Simon Fraser University
http://www.international-characters.com/
http://parabix.costar.sfu.ca/

Re: Xerces Performance Acceleration Project: icXML

Posted by Rob Cameron <ro...@international-characters.com>.
That's great, Gareth.   I do think these lists are the right starting
point, and that creating a community of developers/users is key.    I think
I won't approach the general@incubator.apache.org list immediately because
it just seems premature.    (But thanks anyway, Michael, it seems like a
good route if we aren't able to find a champion through the process of
community building.)

We'll certainly be continuing the work within both ICI and the research
lab at SFU.   Incidentally, for someone with the interest,  doing open-source
project work could be combined with graduate studies ...

On Fri, Feb 1, 2013 at 4:20 AM, Gareth Reakes <ga...@we7.com> wrote:
> Hey Rob,
>
>>>
>>> We are interested in feedback and interest from developers and
>>> potential users.   We are also interested in identifying a potential
>>> Champion who could help put us on track to become an official
>>> Xerces subproject.
>>
>> If you're actively looking for a champion you may want to post to the
>> general@incubator.apache.org list to get the attention of a larger
>> audience of ASF members who might be interested in that role.
>>
>
> I would be happy to help out here although posting to that list still makes sense. The key thing we need to do is work on creating a community that will support the ongoing development and maintenance. There are not many people that contribute to Xerces these days and any reticence you may sense is because we would want to make sure that there was enough interest to support over the medium/long term. It's not going to be code we know about or can support easily so the worst thing for us would be to accept the (very generous) code gift but then be unable to support the users that come with it.
>
> On a technical level I think what you have done is really cool and useful.
>
> I would be happy to chat more about this if you want.
>
>
> Gareth
>
>
>
> --
> Gareth Reakes, CTO         we7 - Great Free Music
> +44-20-7117-0809                   http://www.we7.com
>
> "The music business is a cruel and shallow money trench, a long plastic hallway where thieves and pimps run free, and good men die like dogs. There's also a negative side." - Hunter S. Thompson
>
>
>
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
> For additional commands, e-mail: c-dev-help@xerces.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: Xerces Performance Acceleration Project: icXML

Posted by Gareth Reakes <ga...@we7.com>.
Hey Rob,

>> 
>> We are interested in feedback and interest from developers and
>> potential users.   We are also interested in identifying a potential
>> Champion who could help put us on track to become an official
>> Xerces subproject.
> 
> If you're actively looking for a champion you may want to post to the 
> general@incubator.apache.org list to get the attention of a larger 
> audience of ASF members who might be interested in that role.
> 

I would be happy to help out here although posting to that list still makes sense. The key thing we need to do is work on creating a community that will support the ongoing development and maintenance. There are not many people that contribute to Xerces these days and any reticence you may sense is because we would want to make sure that there was enough interest to support over the medium/long term. It's not going to be code we know about or can support easily so the worst thing for us would be to accept the (very generous) code gift but then be unable to support the users that come with it.

On a technical level I think what you have done is really cool and useful. 

I would be happy to chat more about this if you want.


Gareth



-- 
Gareth Reakes, CTO         we7 - Great Free Music
+44-20-7117-0809                   http://www.we7.com

"The music business is a cruel and shallow money trench, a long plastic hallway where thieves and pimps run free, and good men die like dogs. There's also a negative side." - Hunter S. Thompson










Re: Xerces Performance Acceleration Project: icXML

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Rob Cameron <ro...@international-characters.com> wrote on 27/01/2013 
11:18:33 AM:

> From: Rob Cameron <ro...@international-characters.com>
> To: c-dev@xerces.apache.org, c-users@xerces.apache.org, 
> Date: 27/01/2013 11:19 AM
> Subject: Xerces Performance Acceleration Project: icXML
> 
> icXML is the name of our project to dramatically accelerate
> Xerces performance on modern commodity processors by taking
> advantage of SIMD and multicore capabilities and parallel bit
> stream technology.
> 
> We are interested in feedback and interest from developers and
> potential users.   We are also interested in identifying a potential
> Champion who could help put us on track to become an official
> Xerces subproject.

If you're actively looking for a champion you may want to post to the 
general@incubator.apache.org list to get the attention of a larger 
audience of ASF members who might be interested in that role.

> Version 0.8 of icXML has been released and is available together
> with a development version on the costar.sfu.ca server.
> svn co http://parabix.costar.sfu.ca/svn/icXML/icXML-0.8
> svn co http://parabix.costar.sfu.ca/svn/icXML/icXML-devel
> Trac browser: http://parabix.costar.sfu.ca/browser/icXML
> 
> To get an idea of the performance prospects, here are end-to-end
> figures using Xerces-C 3.1.1 and icXML with a GML-to-SVG conversion
> application.

<snip/>

> Rob Cameron,
> CTO, International Characters, Inc
> Professor of Computing Science, Simon Fraser University
> http://www.international-characters.com/
> http://parabix.costar.sfu.ca/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
> For additional commands, e-mail: c-dev-help@xerces.apache.org

Thanks.

Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org




Re: Xerces Performance Acceleration Project: icXML

Posted by Rob Cameron <ro...@international-characters.com>.
Hi, Boris.

On Sun, Jan 27, 2013 at 10:18 AM, Boris Kolpackov
<bo...@codesynthesis.com> wrote:
>
> I wanted to try icXML with CodeSynthesis XSD[1] for some time
> now. Just haven't been able to find the time.
>
> I have a few questions:
>
> 1. It is my understanding that icXML is interface-compatible
>    with Xerces-C++ 3-series. Is that correct?
>

Yes, this is correct.

> 2. Have you done any parallelization of the XML Schema validation
>    engine?
>

This is on our roadmap.    We have two forms of parallelization in mind:
assigning validation to separate threads (there is engineering required,
but it is quite feasible with our model), and SIMD parallelization of
data type and grammar validation (research required).

> 3. You've shown results for icXML in two configurations, single-
>    threaded and with 2 threads. Is there any documentation that
>    describes these extra parameters/options/etc. In other words,
>    how would I go about specifying the number of threads?
>

The current icXML release is single-threaded.   The experimental two-thread
version was a proof of concept; we are presently redesigning it to be able
to use multiple pipeline stages.
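
As a rough picture of what two-thread pipeline parallelism means in general, the Python sketch below runs two stages on separate threads joined by a bounded queue. It is a generic illustration under stated assumptions, not icXML's design: run_pipeline, stage1, stage2 and max_pending are hypothetical names, the stages are stand-in lambdas, and error handling is omitted.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the stream


def run_pipeline(chunks, stage1, stage2, max_pending=4):
    """Run stage1 and stage2 on separate threads, linked by a bounded queue."""
    handoff = queue.Queue(maxsize=max_pending)
    results = []

    def producer():
        # First stage: process each chunk and hand the result downstream.
        for chunk in chunks:
            handoff.put(stage1(chunk))
        handoff.put(_DONE)

    def consumer():
        # Second stage: consume results in order until the sentinel arrives.
        while True:
            item = handoff.get()
            if item is _DONE:
                break
            results.append(stage2(item))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    chunks = ["<a/>", "<b></b>", "<c/>"]
    out = run_pipeline(chunks,
                       stage1=lambda s: s.count("<"),  # stand-in for scanning
                       stage2=lambda n: n * 2)         # stand-in for later work
    print(out)
```

The bounded queue keeps the first stage from running arbitrarily far ahead of the second. In CPython the global interpreter lock means this shows only the structure, not an actual speedup; extending it to more stages would mean chaining further queues and threads in the same way.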

> [1] http://www.codesynthesis.com/products/xsd/
>
> Boris
>
> --
> Boris Kolpackov, Code Synthesis        http://codesynthesis.com/~boris/blog
> Compiler-based ORM system for C++      http://codesynthesis.com/products/odb
> Open-source XML data binding for C++   http://codesynthesis.com/products/xsd
> XML data binding for embedded systems  http://codesynthesis.com/products/xsde



Re: Xerces Performance Acceleration Project: icXML

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi Rob,

Rob Cameron <ro...@international-characters.com> writes:

> icXML is the name of our project to dramatically accelerate
> Xerces performance on modern commodity processors by taking
> advantage of SIMD and multicore capabilities and parallel bit
> stream technology.

I have wanted to try icXML with CodeSynthesis XSD[1] for some time
now. I just haven't been able to find the time.

I have a few questions:

1. It is my understanding that icXML is interface-compatible
   with Xerces-C++ 3-series. Is that correct?

2. Have you done any parallelization of the XML Schema validation
   engine?

3. You've shown results for icXML in two configurations,
   single-threaded and with 2 threads. Is there any documentation
   that describes these extra parameters/options/etc.? In other words,
   how would I go about specifying the number of threads?

[1] http://www.codesynthesis.com/products/xsd/

Boris

-- 
Boris Kolpackov, Code Synthesis        http://codesynthesis.com/~boris/blog
Compiler-based ORM system for C++      http://codesynthesis.com/products/odb
Open-source XML data binding for C++   http://codesynthesis.com/products/xsd
XML data binding for embedded systems  http://codesynthesis.com/products/xsde


