Posted to c-dev@xerces.apache.org by John Snelson <jo...@oracle.com> on 2008/04/11 17:57:26 UTC
Xerces-C API changes for XQilla
Hi everyone,
Boris Kolpackov and I were just having a discussion over on the XQilla
mailing list. He suggested that now was the time to make a few changes
in the Xerces-C API to benefit XQilla - so I've written a proposal for
what I'd like to change.
I'll give as much help as I can in making these changes, but at the
moment I'm going through the process of getting permission to sign the
contributor agreement.
Please let me know your opinions on making these changes.
John
---------------------------------------------------------
1) Problem:
XPath 2.0 is just different to XPath 1.0. We've therefore got our own
version of DOMXPathResult (XPath2Result) which makes more sense in this
context:
http://xqilla.sourceforge.net/docs/dom3-api/classXPath2Result.html
Solution:
It's probably simple enough to either extend DOMXPathResult to include
the extra functionality in XPath2Result, or to include it as a new class
called DOMXPath2Result.
---------------------------------------------------------
2) Problem:
It's necessary to get access to DOMDocumentImpl, which isn't in the
public API, in order to implement the DOM3 XPath API. Needing access to
the Xerces-C source code to compile XQilla is a big problem for our
maintainers. We need DOMDocumentImpl for a number of reasons:
a) In order to implement DOMXPathNamespace, we need to allocate using
the DOMDocumentImpl's memory manager and string pool.
b) We need to access DOMDocumentImpl::changes() to invalidate the xpath
result iterator if the DOMDocument is changed
(DOMXPathResult::getInvalidIteratorState()).
c) XQuery allows a document node to contain multiple document elements.
We derive from DOMDocumentImpl to allow that to happen.
d) In order to implement the DOMXPathEvaluator interface on the
DOMDocument object we need to derive from DOMDocumentImpl.
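Point (b) can be sketched as follows. This is only an illustration of the idea, not the Xerces-C code: Document, ResultIterator and all of their members are hypothetical names, and the only thing taken from the description above is that the document exposes a change counter which the result iterator compares against.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a document that counts its modifications,
// in the spirit of the changes() counter described above.
class Document {
public:
    void appendNode(const std::string& name) {
        nodes_.push_back(name);
        ++changes_;  // every mutation bumps the counter
    }
    std::size_t changes() const { return changes_; }
    const std::vector<std::string>& nodes() const { return nodes_; }

private:
    std::vector<std::string> nodes_;
    std::size_t changes_ = 0;
};

// Hypothetical result iterator that remembers the counter value
// it saw when it was created.
class ResultIterator {
public:
    explicit ResultIterator(const Document& doc)
        : doc_(doc), changesAtCreation_(doc.changes()) {}

    // True once the document has been modified since creation.
    bool invalidIteratorState() const {
        return doc_.changes() != changesAtCreation_;
    }

    // Returns the next node name, or nullptr if the iterator is
    // exhausted or has been invalidated by a document change.
    const std::string* next() {
        if (invalidIteratorState() || pos_ >= doc_.nodes().size())
            return nullptr;
        return &doc_.nodes()[pos_++];
    }

private:
    const Document& doc_;
    std::size_t changesAtCreation_;
    std::size_t pos_ = 0;
};
```

The iterator never has to be notified of changes; it just compares two integers on each call, which is why read access to the counter is all that is needed from the document class.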
Solution:
Put DOMDocumentImpl in the public API.
---------------------------------------------------------
3) Problem:
We need access to DOMWriterImpl in order to override it to know how to
write namespace nodes. DOMWriterImpl isn't in the public API.
Solution:
Put DOMWriterImpl in the public API, or implement namespace node
handling in it.
---------------------------------------------------------
4) Problem:
XQilla can construct typed DOMDocuments, so we need a way to set the
type information on these nodes. Currently we use DOMTypeInfoImpl,
DOMAttrImpl and DOMElementNSImpl to do this, which aren't in the public API.
Solution:
Implement a method of setting the type information on an element or
attribute, or put DOMTypeInfoImpl, DOMAttrImpl and DOMElementNSImpl in
the public API.
---------------------------------------------------------
5) Problem:
RegularExpression is not thread safe or consistent with its use of
MemoryManager. It's also not quite flexible enough to implement XSLT
2.0's analyze-string, and it has bugs in the replace() methods.
http://www.w3.org/TR/xslt20/#analyze-string
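To show the kind of flexibility analyze-string needs, here is a rough sketch that splits an input into alternating matching and non-matching substrings. It uses std::regex from the C++ standard library rather than the Xerces-C RegularExpression class, and Segment and analyzeString are hypothetical names invented for this example. Zero-length matches are simply skipped here, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One piece of the input: either a regex match or the text between matches.
struct Segment {
    bool matching;
    std::string text;
};

// Split `input` into matching and non-matching substrings, in order.
// Zero-length matches are skipped in this sketch.
std::vector<Segment> analyzeString(const std::string& input,
                                   const std::regex& re) {
    std::vector<Segment> out;
    std::size_t last = 0;  // end of the previous match
    auto end = std::sregex_iterator();
    for (auto it = std::sregex_iterator(input.begin(), input.end(), re);
         it != end; ++it) {
        const std::smatch& m = *it;
        if (m.length(0) == 0) continue;
        std::size_t pos = static_cast<std::size_t>(m.position(0));
        // Text between the previous match and this one is non-matching.
        if (pos > last) out.push_back({false, input.substr(last, pos - last)});
        out.push_back({true, m.str(0)});
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    // Trailing text after the final match.
    if (last < input.size()) out.push_back({false, input.substr(last)});
    return out;
}
```

The point is that the caller needs the position and extent of every match, not just a yes/no answer or a replaced string, so that the gaps between matches can be reported as well.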
Solution:
I have a patch that fixes all of this in Xerces-C 2.8, and I can update
it to apply to 3.0. I'm in the process of getting permission to sign the
contributor agreement.
---------------------------------------------------------
6) Problem:
The socket and WinSock HTTP InputStream implementations have fixed
buffers which can result in buffer overflow. They needlessly duplicate a
whole load of code that could be shared. In addition, a lot of
algorithms need access to the HTTP "Content-Type" header, to decide how
to parse a file, or what encoding it is in - for instance see XSLT 2.0's
unparsed-text() function:
http://www.w3.org/TR/xslt20/#unparsed-text
Solution:
I have a patch that implements this functionality for
UnixHTTPURLInputStream and BinHTTPURLInputStream (WinSock) in Xerces-C
2.8. I added BinInputStream::getContentType() to get access to the
"Content-Type" header. I can update this code for Xerces-C 3.0.
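As an illustration of why a caller wants the header, the sketch below pulls the charset parameter out of a Content-Type value. The classes here are hypothetical stand-ins and not the Xerces-C ones: InputStream, HttpInputStream and charsetOf are names made up for the example, char strings are used instead of XMLCh, and returning a null pointer when no content type is available is an assumption of this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stream interface with a content-type accessor.
class InputStream {
public:
    virtual ~InputStream() = default;
    // Null means "no content type available" (for example a file URL).
    virtual const char* getContentType() const { return nullptr; }
};

// Hypothetical HTTP stream that remembers the header it received.
class HttpInputStream : public InputStream {
public:
    explicit HttpInputStream(std::string contentType)
        : contentType_(std::move(contentType)) {}
    const char* getContentType() const override {
        return contentType_.c_str();
    }

private:
    std::string contentType_;
};

// Returns the lower-cased charset parameter, or an empty string if the
// content type is unavailable or carries no charset.
std::string charsetOf(const InputStream& in) {
    const char* raw = in.getContentType();
    if (raw == nullptr) return "";
    std::string lower;
    for (unsigned char c : std::string(raw))
        lower += static_cast<char>(std::tolower(c));
    const std::string key = "charset=";
    std::size_t pos = lower.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    std::size_t end = lower.find_first_of("; \t", pos);
    std::string cs = lower.substr(
        pos, end == std::string::npos ? std::string::npos : end - pos);
    // Strip optional surrounding quotes.
    if (cs.size() >= 2 && cs.front() == '"' && cs.back() == '"')
        cs = cs.substr(1, cs.size() - 2);
    return cs;
}
```

A function like unparsed-text() could then fall back to its own default encoding whenever the result is empty.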
---------------------------------------------------------
7) Problem:
GrammarResolver has a bug where it fails to initialize its XSModel if
the XMLGrammarPool it is created with is locked.
Solution:
We hack this at the moment, but it would be great if this could be fixed.
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org
Re: Xerces-C API changes for XQilla
Posted by Gareth Reakes <ga...@we7.com>.
I have no problem with these changes John.
Gareth
--
Gareth Reakes, CTO WE7
+44-20-7117-0809 http://www.we7.com
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
John Snelson <jo...@oracle.com> writes:
> Have you got a list of the Xerces-C API changes in 3.0 anywhere?
No, unfortunately, there does not seem to be such a list. I can
try to come up with one for the DOM API if that's helpful.
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Boris Kolpackov wrote:
>> I think that some headers related to DOMDocumentImpl are also needed. An
>> exhaustive list of the ones that XQilla includes follows:
>>
>> [...]
>>
>> I imagine that you also need to include the headers that these files
>> include themselves.
>
> All headers in dom/impl/ are now installed. I also checked and none
> of them include anything from internal/ so I think there won't be any
> problems. Though it is always better to check that we are not missing
> anything on the real use-case. Are you planning to start working on
> the 3.0.0 port of XQilla before the release?
I was just thinking about that - I think that would be the best way to
go. I took a look at what would be needed a while back and I don't think
it's much. Have you got a list of the Xerces-C API changes in 3.0 anywhere?
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Gareth Reakes <ga...@we7.com>.
John Snelson wrote:
> Boris Kolpackov wrote:
>> John Snelson <jo...@oracle.com> writes:
>>
>>> I'm not so sure about that. The existing DOMXPathResult is an
>>> implementation of a W3C published DOM interface, and I think there's
>>> value in keeping it that way. There's all sorts of improvements I'd like
>>> to make to W3C DOM otherwise ;-).
>>
>> Here are some of the reasons why I believe we should merge the two
>> interfaces:
>>
>> 1. The XPath 2 interface supports requirements of XPath 1.
>>
>> 2. It is my understanding (from talking to various people and reading
>> the W3C mailing lists) that the XPath 1 DOM interface is commonly
>> believed to be a half-baked work that is full of holes and omissions
>> mainly because it was done in a hurry to get it into the DOM spec.
>> To me personally the fact that the official interface of NSResolver
>> does not allow the user to specify custom namespace-prefix mappings
>> makes it clear that the authors of the spec had no real-world
>> experience in this area.
>>
>> 3. There won't be any revisions to the DOM spec so there is no hope
>> of the official support for XPath 2.
>>
>> So I think, as far as the XPath part of the DOM spec is concerned, we
>> should try to make the interface sensible rather than trying to conform
>> to the spec. I chatted to Alberto and he also thinks that we should
>> rather generalize the interface. Any thoughts?
>
> That makes sense, and the fact that the W3C Web API group took control
> of the spec makes me think that the W3C might change it anyway. I guess
> I'm agnostic about what to do - does anyone else have any opinions?
>
We normally err on the side of supporting interfaces for the standard
reasons. In this case I don't really mind either way.
Gareth
--
Gareth Reakes, CTO WE7
+44-20-7117-0809 http://www.we7.com
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Boris Kolpackov wrote:
> John Snelson <jo...@oracle.com> writes:
>
>> I'm not so sure about that. The existing DOMXPathResult is an
>> implementation of a W3C published DOM interface, and I think there's
>> value in keeping it that way. There's all sorts of improvements I'd like
>> to make to W3C DOM otherwise ;-).
>
> Here are some of the reasons why I believe we should merge the two
> interfaces:
>
> 1. The XPath 2 interface supports requirements of XPath 1.
>
> 2. It is my understanding (from talking to various people and reading
> the W3C mailing lists) that the XPath 1 DOM interface is commonly
> believed to be a half-baked work that is full of holes and omissions
> mainly because it was done in a hurry to get it into the DOM spec.
> To me personally the fact that the official interface of NSResolver
> does not allow the user to specify custom namespace-prefix mappings
> makes it clear that the authors of the spec had no real-world
> experience in this area.
>
> 3. There won't be any revisions to the DOM spec so there is no hope
> of the official support for XPath 2.
>
> So I think, as far as the XPath part of the DOM spec is concerned, we
> should try to make the interface sensible rather than trying to conform
> to the spec. I chatted to Alberto and he also thinks that we should
> rather generalize the interface. Any thoughts?
That makes sense, and the fact that the W3C Web API group took control
of the spec makes me think that the W3C might change it anyway. I guess
I'm agnostic about what to do - does anyone else have any opinions?
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Boris Kolpackov <bo...@codesynthesis.com> wrote on 05/06/2008 03:49:23 PM:
> 3. There won't be any revisions to the DOM spec so there is no hope
> of the official support for XPath 2.
Maybe. The Web API working group [1] took ownership of the DOM Level 3
XPath spec, so there's still some hope it will be revised and perhaps at
some point become a W3C Recommendation.
[1] http://www.w3.org/2006/webapi/
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
John Snelson <jo...@oracle.com> writes:
> I'm not so sure about that. The existing DOMXPathResult is an
> implementation of a W3C published DOM interface, and I think there's
> value in keeping it that way. There's all sorts of improvements I'd like
> to make to W3C DOM otherwise ;-).
Here are some of the reasons why I believe we should merge the two
interfaces:
1. The XPath 2 interface supports requirements of XPath 1.
2. It is my understanding (from talking to various people and reading
the W3C mailing lists) that the XPath 1 DOM interface is commonly
believed to be a half-baked work that is full of holes and omissions
mainly because it was done in a hurry to get it into the DOM spec.
To me personally the fact that the official interface of NSResolver
does not allow the user to specify custom namespace-prefix mappings
makes it clear that the authors of the spec had no real-world
experience in this area.
3. There won't be any revisions to the DOM spec so there is no hope
of the official support for XPath 2.
So I think, as far as the XPath part of the DOM spec is concerned, we
should try to make the interface sensible rather than trying to conform
to the spec. I chatted to Alberto and he also thinks that we should
rather generalize the interface. Any thoughts?
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Boris Kolpackov wrote:
> John Snelson <jo...@oracle.com> writes:
>
>> I've also changed my mind (again) about the DOMXPathResult object. The
>> problem is this - DOMXPathResult::iterateNext() returns DOMNode, because
>> in XPath 1 you can only get single values or a list of nodes. XQilla's
>> XPath2Result::iterateNext() returns a bool, and simply moves the
>> iterator to look at the next item in the sequence, which in XPath 2.0
>> can be a heterogeneous sequence of nodes and values. The same is true
>> for snapshotItem(). I've come to the conclusion that it would be best
>> for XQilla to stick with the XPath2Result API, and it would be good if
>> Xerces-C incorporated it into its API.
>
> Since the XPath 2 approach is more generic and can support XPath
> 1 requirements, I think we should just change the DOMXPathResult
> interface to return bool from iterateNext() and snapshotItem().
I'm not so sure about that. The existing DOMXPathResult is an
implementation of a W3C published DOM interface, and I think there's
value in keeping it that way. There's all sorts of improvements I'd like
to make to W3C DOM otherwise ;-).
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
John Snelson <jo...@oracle.com> writes:
> I've also changed my mind (again) about the DOMXPathResult object. The
> problem is this - DOMXPathResult::iterateNext() returns DOMNode, because
> in XPath 1 you can only get single values or a list of nodes. XQilla's
> XPath2Result::iterateNext() returns a bool, and simply moves the
> iterator to look at the next item in the sequence, which in XPath 2.0
> can be a heterogeneous sequence of nodes and values. The same is true
> for snapshotItem(). I've come to the conclusion that it would be best
> for XQilla to stick with the XPath2Result API, and it would be good if
> Xerces-C incorporated it into its API.
Since the XPath 2 approach is more generic and can support XPath
1 requirements, I think we should just change the DOMXPathResult
interface to return bool from iterateNext() and snapshotItem().
Alberto, do you have any objections to this change?
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
John Snelson wrote:
> Boris Kolpackov wrote:
>>> Having given it some thought, I think that merging the two would be the
>>> best idea - it would be a breaking change for XQilla users, but it would
>>> be a move to a more standard API.
>>
>> Agree. If you can come up with a patch, I will review and commit it.
>
> I've attached a patch against the SVN trunk for the XPath API changes
> we've discussed.
I think I sent that out a bit quick. This new patch implements the new
API in the existing implementation objects, so that Xerces-C will
compile with the patch applied.
I've also changed my mind (again) about the DOMXPathResult object. The
problem is this - DOMXPathResult::iterateNext() returns DOMNode, because
in XPath 1 you can only get single values or a list of nodes. XQilla's
XPath2Result::iterateNext() returns a bool, and simply moves the
iterator to look at the next item in the sequence, which in XPath 2.0
can be a heterogeneous sequence of nodes and values. The same is true
for snapshotItem(). I've come to the conclusion that it would be best
for XQilla to stick with the XPath2Result API, and it would be good if
Xerces-C incorporated it into its API.
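The difference between the two iteration styles can be sketched roughly like this. It is a simplified illustration and not the XQilla or Xerces-C API: Item, SequenceResult, asNodeName and sumIntegers are hypothetical names, nodes are represented by a plain string, and asInt merely echoes an accessor name mentioned elsewhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical item: either a node (identified by name) or an integer value.
struct Item {
    bool isNode;
    std::string nodeName;  // meaningful when isNode is true
    int intValue;          // meaningful when isNode is false
};

// Hypothetical cursor-style result. iterateNext() only advances and
// reports whether there is a current item; typed accessors read it.
class SequenceResult {
public:
    explicit SequenceResult(std::vector<Item> items)
        : items_(std::move(items)) {}

    bool iterateNext() {
        if (next_ >= items_.size()) {
            hasCurrent_ = false;
            return false;
        }
        current_ = next_++;
        hasCurrent_ = true;
        return true;
    }

    bool isNode() const { return current().isNode; }

    const std::string& asNodeName() const {
        assert(current().isNode);
        return current().nodeName;
    }

    int asInt() const {
        assert(!current().isNode);
        return current().intValue;
    }

private:
    const Item& current() const {
        assert(hasCurrent_);  // iterateNext() must have returned true
        return items_[current_];
    }

    std::vector<Item> items_;
    std::size_t next_ = 0;
    std::size_t current_ = 0;
    bool hasCurrent_ = false;
};

// Usage sketch: walk a mixed sequence and add up the integer items.
int sumIntegers(SequenceResult& r) {
    int sum = 0;
    while (r.iterateNext())
        if (!r.isNode()) sum += r.asInt();
    return sum;
}
```

An iterateNext() that returns a node pointer has no way to hand back the integer items in such a sequence, which is the incompatibility described above.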
Any questions or opinions?
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Boris Kolpackov wrote:
>> Having given it some thought, I think that merging the two would be the
>> best idea - it would be a breaking change for XQilla users, but it would
>> be a move to a more standard API.
>
> Agree. If you can come up with a patch, I will review and commit it.
I've attached a patch against the SVN trunk for the XPath API changes
we've discussed.
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
John Snelson <jo...@oracle.com> writes:
> Having given it some thought, I think that merging the two would be the
> best idea - it would be a breaking change for XQilla users, but it would
> be a move to a more standard API.
Agree. If you can come up with a patch, I will review and commit it.
> I think that some headers related to DOMDocumentImpl are also needed. An
> exhaustive list of the ones that XQilla includes follows:
>
> [...]
>
> I imagine that you also need to include the headers that these files
> include themselves.
All headers in dom/impl/ are now installed. I also checked and none
of them include anything from internal/ so I think there won't be any
problems. Though it is always better to check that we are not missing
anything on the real use-case. Are you planning to start working on
the 3.0.0 port of XQilla before the release?
> I've got permission now - hurrah! In the next couple of days when I find
> some time I'll port the patches over to Xerces-C 3.0.
That's great news!
> I looked into the MacOS net accessors, and it seemed to be impossible to
> support this API with them. However, recent email on the list suggests
> that these implementations have other problems (a problem with fork?).
>
> Since you won't get any content type information back for file URLs
> either, I'd suggest that a null return should be the standard response
> when the content type is unavailable.
I agree. If there is no easy way to support this with non-mainstream
net accessors then we should simply return NULL.
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
I've made enquiries through Oracle about whether they have signed a CCLA
- but it may be quicker to ask someone in Apache the question, if Apache
has that information available.
As for an ICLA, sure no problem - I'll try and mail that off today.
The question still remains - do you guys just trust that I've done what
I said I'll do, or does someone need to check up on me? :-)
John
Gareth Reakes wrote:
> Have you got Oracle to sign a CCLA? For good measure you could sign an
> ICLA as well.
>
> Gareth
>
> John Snelson wrote:
>> John Snelson wrote:
>>> I've got permission now - hurrah! In the next couple of days when I
>>> find some time I'll port the patches over to Xerces-C 3.0.
>>
>> So as I said, Oracle have OKed me to contribute code to Xerces-C. To
>> pre-empt any delays, what steps should I take to make sure that you
>> guys are happy? Is there anything more I have to sign? Who can I
>> contact about this?
>>
>> John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Gareth Reakes <ga...@embracemobile.com>.
Neil Graham wrote:
> Hi all. I'm not totally up-to-date these days with Apache's contribution
> requirements, but I had been under the impression that the ICLA was
> required in all cases. The CCLA is good (probably required) too.
>
> Cheers,
> Neil
> Neil Graham
> Manager, XML Transformation and Query Development
> IBM Toronto Lab
> Phone: 905-413-3519, T/L 413-3519
> E-mail: neilg@ca.ibm.com
>
> Gareth Reakes <ga...@we7.com> wrote on 05/02/08 08:39 AM:
>
> Have you got Oracle to sign a CCLA? For good measure you could sign an
> ICLA as well.
>
> Gareth
>
> John Snelson wrote:
>> John Snelson wrote:
>>> I've got permission now - hurrah! In the next couple of days when I
>>> find some time I'll port the patches over to Xerces-C 3.0.
>> So as I said, Oracle have OKed me to contribute code to Xerces-C. To
>> pre-empt any delays, what steps should I take to make sure that you
>> guys are happy? Is there anything more I have to sign? Who can I
>> contact about this?
>>
>> John
>>
>
Re: Xerces-C API changes for XQilla
Posted by Neil Graham <ne...@ca.ibm.com>.
Hi all. I'm not totally up-to-date these days with Apache's contribution
requirements, but I had been under the impression that the ICLA was
required in all cases. The CCLA is good (probably required) too.
Cheers,
Neil
Neil Graham
Manager, XML Transformation and Query Development
IBM Toronto Lab
Phone: 905-413-3519, T/L 413-3519
E-mail: neilg@ca.ibm.com
Gareth Reakes <ga...@we7.com> wrote on 05/02/08 08:39 AM:
Have you got Oracle to sign a CCLA? For good measure you could sign an
ICLA as well.
Gareth
John Snelson wrote:
> John Snelson wrote:
>> I've got permission now - hurrah! In the next couple of days when I
>> find some time I'll port the patches over to Xerces-C 3.0.
>
> So as I said, Oracle have OKed me to contribute code to Xerces-C. To
> pre-empt any delays, what steps should I take to make sure that you guys
> are happy? Is there anything more I have to sign? Who can I contact
> about this?
>
> John
>
--
Gareth Reakes, CTO WE7
+44-20-7117-0809 http://www.we7.com
Re: Xerces-C API changes for XQilla
Posted by Gareth Reakes <ga...@we7.com>.
Have you got Oracle to sign a CCLA? For good measure you could sign an
ICLA as well.
Gareth
John Snelson wrote:
> John Snelson wrote:
>> I've got permission now - hurrah! In the next couple of days when I
>> find some time I'll port the patches over to Xerces-C 3.0.
>
> So as I said, Oracle have OKed me to contribute code to Xerces-C. To
> pre-empt any delays, what steps should I take to make sure that you guys
> are happy? Is there anything more I have to sign? Who can I contact
> about this?
>
> John
>
--
Gareth Reakes, CTO WE7
+44-20-7117-0809 http://www.we7.com
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
John Snelson wrote:
> I've got permission now - hurrah! In the next couple of days when I find
> some time I'll port the patches over to Xerces-C 3.0.
So as I said, Oracle have OKed me to contribute code to Xerces-C. To
pre-empt any delays, what steps should I take to make sure that you guys
are happy? Is there anything more I have to sign? Who can I contact
about this?
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Hi Boris,
Boris Kolpackov wrote:
> John Snelson <jo...@oracle.com> writes:
>
>> 1) Problem:
>>
>> XPath 2.0 is just different to XPath 1.0. We've therefore got our own
>> version of DOMXPathResult (XPath2Result) which makes more sense in this
>> context:
>>
>> http://xqilla.sourceforge.net/docs/dom3-api/classXPath2Result.html
>>
>> Solution:
>>
>> It's probably simple enough to either extend DOMXPathResult to include
>> the extra functionality in XPath2Result, or to include it as a new class
>> called DOMXPath2Result.
>
> I did a quick check and it appears that the DOMXPathResult is very
> similar to DOMXPath2Result. I would therefore suggest that we try
> to add the missing functionality to DOMXPathResult as non-standard
> extensions (though we should try to use names that will likely be
> used in the next version of DOM3 when it is updated to include
> support for XPath 2, for example getIntegerValue instead of asInt).
> What is your feeling on this approach? Also did you base your
> DOMXPath2Result on any draft spec (e.g., where do the asDouble,
> asInt, etc., names come from)?
XPath2Result wasn't based on any draft spec IIRC. Gareth probably
remembers more about its design, since I wasn't heavily involved in
Pathan at the time.
Having given it some thought, I think that merging the two would be the
best idea - it would be a breaking change for XQilla users, but it would
be a move to a more standard API.
>> 2) Problem:
>>
>> It's necessary to get access to DOMDocumentImpl, which isn't in the
>> public API, in order to implement the DOM3 XPath API. Needing access to
>> the Xerces-C source code to compile XQilla is a big problem for our
>> maintainers. We need DOMDocumentImpl for a number of reasons:
>>
>> [...]
>>
>> Solution:
>>
>> Put DOMDocumentImpl in the public API.
>
> The DOMDocumentImpl.hpp is now installed with the rest of the headers.
> I've also changed all private data members and functions to be protected
> in all DOM*Impl classes. Is there anything else we need to do?
I think that some headers related to DOMDocumentImpl are also needed. The
exhaustive list of the ones that XQilla includes is:
xercesc/dom/impl/DOMAttrImpl.hpp
xercesc/dom/impl/DOMCasts.hpp
xercesc/dom/impl/DOMDocumentImpl.hpp
xercesc/dom/impl/DOMDocumentTypeImpl.hpp
xercesc/dom/impl/DOMElementNSImpl.hpp
xercesc/dom/impl/DOMNodeImpl.hpp
xercesc/dom/impl/DOMRangeImpl.hpp
xercesc/dom/impl/DOMTypeInfoImpl.hpp
xercesc/dom/impl/DOMWriterImpl.hpp
I imagine that you also need to include the headers that these files
include themselves.
>> 5) Problem:
>>
>> RegularExpression is not thread safe or consistent in its use of
>> MemoryManager. It's also not quite flexible enough to implement XSLT
>> 2.0's analyze-string, and it has bugs in the replace() methods.
>>
>> http://www.w3.org/TR/xslt20/#analyze-string
>>
>> Solution:
>>
>> I have a patch that fixes all of this in Xerces-C 2.8, and I can update
>> it to apply to 3.0. I'm in the process of getting permission to sign the
>> contributor agreement.
>
> Sounds good.
I've got permission now - hurrah! In the next couple of days when I find
some time I'll port the patches over to Xerces-C 3.0.
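To make the analyze-string requirement quoted above a bit more concrete: the instruction needs the input split into alternating matching and non-matching substrings, visited in order. A minimal sketch of that traversal follows. It uses std::regex rather than the Xerces-C RegularExpression class, the names Segment and analyzeString are hypothetical, and it assumes the pattern never matches the empty string. It is not the patch itself.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One piece of the input: either a regex match or the text between matches.
struct Segment {
    bool matched;
    std::string text;
};

// Split `input` into non-matching and matching substrings, in input order.
// Assumption: `re` never matches the empty string.
std::vector<Segment> analyzeString(const std::string& input, const std::regex& re) {
    std::vector<Segment> out;
    std::size_t pos = 0;  // end of the previous match
    for (std::sregex_iterator it(input.begin(), input.end(), re), end; it != end; ++it) {
        const std::size_t start = static_cast<std::size_t>(it->position());
        if (start > pos)  // text between the previous match and this one
            out.push_back({false, input.substr(pos, start - pos)});
        out.push_back({true, it->str()});
        pos = start + static_cast<std::size_t>(it->length());
    }
    if (pos < input.size())  // trailing non-matching text
        out.push_back({false, input.substr(pos)});
    return out;
}
```

A caller would walk the returned vector and dispatch on `matched`, which corresponds to the matching-substring and non-matching-substring branches of analyze-string.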
>> 6) Problem:
>>
>> The socket and WinSock HTTP InputStream implementations have fixed
>> buffers which can result in buffer overflow. They needlessly duplicate a
>> whole load of code that could be shared. In addition, a lot of
>> algorithms need access to the HTTP "Content-Type" header, to decide how
>> to parse a file, or what encoding it is in - for instance see XSLT 2.0's
>> unparsed-text() function:
>>
>> http://www.w3.org/TR/xslt20/#unparsed-text
>>
>> Solution:
>>
>> I have a patch that implements this functionality for
>> UnixHTTPURLInputStream and BinHTTPURLInputStream (WinSock) in Xerces-C
>> 2.8. I added BinInputStream::getContentType() to get access to the
>> "Content-Type" header. I can update this code for Xerces-C 3.0.
>
> Sounds good. There are also Curl, MacOS, and libWWW net accessors.
> Hopefully it will be easy to implement getContentType() for them.
For Curl the option to use seems to be CURLOPT_HEADERFUNCTION, although
I haven't investigated more than that.
http://curl.rtin.bz/libcurl/c/curl_easy_setopt.html
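As a rough sketch of what the Curl side might look like: libcurl calls the CURLOPT_HEADERFUNCTION callback once per header line, so the callback only has to recognise the Content-Type line and store its value. The function name captureContentType and the use of a std::string as the user data are assumptions for illustration; only the callback signature comes from libcurl.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Signature as libcurl expects for CURLOPT_HEADERFUNCTION. The callback
// receives one header line of size * nitems bytes, not NUL-terminated.
// Assumption: `userdata` (set via CURLOPT_HEADERDATA) points at a
// std::string that receives the Content-Type value.
std::size_t captureContentType(char* buffer, std::size_t size,
                               std::size_t nitems, void* userdata) {
    const std::size_t total = size * nitems;
    const std::string line(buffer, total);
    static const std::string name = "content-type:";

    bool same = line.size() >= name.size();
    for (std::size_t i = 0; same && i < name.size(); ++i)  // header names are case-insensitive
        same = std::tolower(static_cast<unsigned char>(line[i])) == name[i];

    if (same) {
        std::string* out = static_cast<std::string*>(userdata);
        const std::size_t b = line.find_first_not_of(" \t", name.size());
        const std::size_t e = line.find_last_not_of(" \t\r\n");
        if (b != std::string::npos && e != std::string::npos && e >= b)
            *out = line.substr(b, e - b + 1);  // value without surrounding whitespace
        else
            out->clear();                      // header present but empty
    }
    return total;  // any other return value makes libcurl abort the transfer
}
```

It would be registered with curl_easy_setopt(handle, CURLOPT_HEADERFUNCTION, captureContentType) and curl_easy_setopt(handle, CURLOPT_HEADERDATA, &someString) before the transfer starts.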
For libWWW I haven't looked further but I imagine it should be possible.
I looked into the MacOS net accessors, and it seemed to be impossible to
support this API with them. However, recent email on the list suggests
that these implementations have other problems (a problem with fork?).
Since you won't get any content type information back for file URLs
either, I'd suggest that a null return should be the standard response
when the content type is unavailable.
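A small sketch of the null-return convention, from both sides of the interface. StreamSketch and HttpStreamSketch are hypothetical stand-ins for BinInputStream and an HTTP stream, char strings are used instead of XMLCh so it compiles without Xerces-C, and the charset parsing is deliberately simplified (case-sensitive, no quoted values).

```cpp
#include <string>

// Hypothetical stand-in for BinInputStream; only the proposed accessor is modelled.
class StreamSketch {
public:
    virtual ~StreamSketch() {}
    // Null means "no content type available" (file URLs, unsupported accessors).
    virtual const char* getContentType() const { return 0; }
};

// Hypothetical HTTP stream that remembers the header value it was given.
class HttpStreamSketch : public StreamSketch {
public:
    explicit HttpStreamSketch(const std::string& contentType)
        : fContentType(contentType) {}
    virtual const char* getContentType() const {
        return fContentType.empty() ? 0 : fContentType.c_str();
    }
private:
    std::string fContentType;
};

// Caller-side pattern: use the charset parameter if present, else a fallback.
std::string encodingOrDefault(const StreamSketch& in, const std::string& fallback) {
    const char* ct = in.getContentType();
    if (!ct) return fallback;                       // content type unavailable
    const std::string value(ct);
    const std::string::size_type p = value.find("charset=");
    if (p == std::string::npos) return fallback;    // no charset parameter
    const std::string::size_type b = p + 8;         // skip "charset="
    const std::string::size_type e = value.find_first_of("; \t", b);
    return value.substr(b, e == std::string::npos ? std::string::npos : e - b);
}
```

With this convention a caller such as an unparsed-text() implementation needs only one null check, whichever net accessor produced the stream.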
>> 7) Problem:
>>
>> GrammarResolver has a bug where it fails to initialize its XSModel if
>> the XMLGrammarPool it is created with is locked.
>>
>> Solution:
>>
>> We hack this at the moment, but it would be great if this could be fixed.
>
> Would you be willing to work on a patch? Also I hit a bug in this area
> once that may be related. This code works:
>
> auto_ptr<GrammarResolver> gr (new GrammarResolver (0));
>
> // load some schemas into gr
>
> XMLGrammarPool* gp = gr->getGrammarPool ();
> gp->lockPool ();
> XSModel* xsm = gp->getXSModel ();
>
> However, if I remove lockPool(), the returned XSModel is invalid. Or
> maybe this is how it is supposed to work.
I don't know if your problem is related. I'll try to work out a fix for
the problem I'm seeing, at least.
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Boris Kolpackov wrote:
> John Snelson <jo...@oracle.com> writes:
>> 5) Problem:
>>
>> RegularExpression is not thread safe or consistent in its use of
>> MemoryManager. It's also not quite flexible enough to implement XSLT
>> 2.0's analyze-string, and it has bugs in the replace() methods.
>>
>> http://www.w3.org/TR/xslt20/#analyze-string
>>
>> Solution:
>>
>> I have a patch that fixes all of this in Xerces-C 2.8, and I can update
>> it to apply to 3.0. I'm in the process of getting permission to sign the
>> contributor agreement.
>
> Sounds good.
I've updated and attached my RegularExpression patch for Xerces-C 3.0.
Is it best to post these patches to the mailing list, or would it be
better to open JIRA items for them?
John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
John Snelson <jo...@oracle.com> writes:
> I've updated and attached my RegularExpression patch for Xerces-C 3.0.
>
> Is it best to post these patches to the mailing list, or would it be
> better to open JIRA items for them?
For any significant amount of code it is better to create a bug and
attach the patch to it. While attaching the patch, you are presented
with the option to assign the rights to the patch to ASF which you
need to select.
When creating the bug, add 3.0.0 to the Fix Version list. This will
add it to the list of issues that need to be resolved before 3.0.0
can be released.
Thanks,
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
John Snelson <jo...@oracle.com> writes:
> 1) Problem:
>
> XPath 2.0 is just different to XPath 1.0. We've therefore got our own
> version of DOMXPathResult (XPath2Result) which makes more sense in this
> context:
>
> http://xqilla.sourceforge.net/docs/dom3-api/classXPath2Result.html
>
> Solution:
>
> It's probably simple enough to either extend DOMXPathResult to include
> the extra functionality in XPath2Result, or to include it as a new class
> called DOMXPath2Result.
I did a quick check and it appears that the DOMXPathResult is very
similar to DOMXPath2Result. I would therefore suggest that we try
to add the missing functionality to DOMXPathResult as non-standard
extensions (though we should try to use names that will likely be
used in the next version of DOM3 when it is updated to include
support for XPath 2, for example getIntegerValue instead of asInt).
What is your feeling on this approach? Also did you base your
DOMXPath2Result on any draft spec (e.g., where do the asDouble,
asInt, etc., names come from)?
> 2) Problem:
>
> It's necessary to get access to DOMDocumentImpl, which isn't in the
> public API, in order to implement the DOM3 XPath API. Needing access to
> the Xerces-C source code to compile XQilla is a big problem for our
> maintainers. We need DOMDocumentImpl for a number of reasons:
>
> [...]
>
> Solution:
>
> Put DOMDocumentImpl in the public API.
The DOMDocumentImpl.hpp is now installed with the rest of the headers.
I've also changed all private data members and functions to be protected
in all DOM*Impl classes. Is there anything else we need to do?
> 3) Problem:
>
> We need access to DOMWriterImpl in order to override it to know how to
> write namespace nodes. DOMWriterImpl isn't in the public API.
>
> Solution:
>
> Put DOMWriterImpl in the public API, or implement namespace node
> handling in it.
The same situation as with DOMDocumentImpl.hpp. I tend to prefer to
leave the namespace node implementation in XQilla for now since it
is XPath-specific and is not used by the limited built-in XPath
support in Xerces-C++.
> 4) Problem:
>
> XQilla can construct typed DOMDocuments, so we need a way to set the
> type information on these nodes. Currently we use DOMTypeInfoImpl,
> DOMAttrImpl and DOMElementNSImpl to do this, which aren't in the public API.
>
> Solution:
>
> Implement a method of setting the type information on an element or
> attribute, or put DOMTypeInfoImpl, DOMAttrImpl and DOMElementNSImpl in
> the public API.
I don't think an end user would ever need this functionality since the
type information is tied to the grammar being used and can only be set
by a parser that can control both. The *Impl headers are available for
XQilla to use.
> 5) Problem:
>
> RegularExpression is not thread safe or consistent in its use of
> MemoryManager. It's also not quite flexible enough to implement XSLT
> 2.0's analyze-string, and it has bugs in the replace() methods.
>
> http://www.w3.org/TR/xslt20/#analyze-string
>
> Solution:
>
> I have a patch that fixes all of this in Xerces-C 2.8, and I can update
> it to apply to 3.0. I'm in the process of getting permission to sign the
> contributor agreement.
Sounds good.
> 6) Problem:
>
> The socket and WinSock HTTP InputStream implementations have fixed
> buffers which can result in buffer overflow. They needlessly duplicate a
> whole load of code that could be shared. In addition, a lot of
> algorithms need access to the HTTP "Content-Type" header, to decide how
> to parse a file, or what encoding it is in - for instance see XSLT 2.0's
> unparsed-text() function:
>
> http://www.w3.org/TR/xslt20/#unparsed-text
>
> Solution:
>
> I have a patch that implements this functionality for
> UnixHTTPURLInputStream and BinHTTPURLInputStream (WinSock) in Xerces-C
> 2.8. I added BinInputStream::getContentType() to get access to the
> "Content-Type" header. I can update this code for Xerces-C 3.0.
Sounds good. There are also Curl, MacOS, and libWWW net accessors.
Hopefully it will be easy to implement getContentType() for them.
> 7) Problem:
>
> GrammarResolver has a bug where it fails to initialize its XSModel if
> the XMLGrammarPool it is created with is locked.
>
> Solution:
>
> We hack this at the moment, but it would be great if this could be fixed.
Would you be willing to work on a patch? Also I hit a bug in this area
once that may be related. This code works:
    auto_ptr<GrammarResolver> gr (new GrammarResolver (0));

    // load some schemas into gr

    XMLGrammarPool* gp = gr->getGrammarPool ();
    gp->lockPool ();
    XSModel* xsm = gp->getXSModel ();

However, if I remove lockPool(), the returned XSModel is invalid. Or
maybe this is how it is supposed to work.
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
Re: Xerces-C API changes for XQilla
Posted by John Snelson <jo...@oracle.com>.
Hi Boris,
That sounds perfectly reasonable. I think that XQilla already does all
of those things, but it would be good to get it into Xerces-C too.
John
Boris Kolpackov wrote:
> Hi John,
>
> Another area where we can improve the interface alignment
> is DOMXPathNSResolver. I just had a quick chat with Alberto
> and here are the changes that we both agree would make sense
> (I hope the motivation is self-evident, if not -- let me know):
>
> 1. Add the ability to provide custom namespace-prefix mappings
> in the DOMXPathNSResolver interface. Something along these
> lines:
>
>     virtual void
>     setNamespacePrefix (const XMLCh* prefix, const XMLCh* uri);
>
> The new mapping will override any existing one. If NULL or
> an empty string is passed as uri, the existing mapping, if
> any, is removed.
>
> 2. Change the createNSResolver() function in DOMXPathEvaluator
> to return a read-write DOMXPathNSResolver (it is const right now).
>
> 3. Allow NULL to be passed to createNSResolver(), in which case
> an empty DOMXPathNSResolver will be created (it can then be
> populated with setNamespacePrefix).
>
>
> Let me know what you think.
>
> Boris
>
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net
Re: Xerces-C API changes for XQilla
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi John,
Another area where we can improve the interface alignment
is DOMXPathNSResolver. I just had a quick chat with Alberto
and here are the changes that we both agree would make sense
(I hope the motivation is self-evident, if not -- let me know):
1. Add the ability to provide custom namespace-prefix mappings
in the DOMXPathNSResolver interface. Something along these
lines:
    virtual void
    setNamespacePrefix (const XMLCh* prefix, const XMLCh* uri);
The new mapping will override any existing one. If NULL or
an empty string is passed as uri, the existing mapping, if
any, is removed.
2. Change the createNSResolver() function in DOMXPathEvaluator
to return a read-write DOMXPathNSResolver (it is const right now).
3. Allow NULL to be passed to createNSResolver(), in which case
an empty DOMXPathNSResolver will be created (it can then be
populated with setNamespacePrefix).
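The semantics in points 1 and 3 can be sketched roughly as follows. NSResolverSketch is a hypothetical stand-in, not the Xerces-C class, and it uses char strings rather than XMLCh so that it compiles on its own. A default-constructed object plays the role of the empty resolver that createNSResolver(NULL) would return.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a read-write DOMXPathNSResolver.
class NSResolverSketch {
public:
    // Returns the URI bound to `prefix`, or null if there is no binding.
    const char* lookupNamespaceURI(const char* prefix) const {
        std::map<std::string, std::string>::const_iterator it =
            fBindings.find(prefix ? prefix : "");
        return it == fBindings.end() ? 0 : it->second.c_str();
    }

    // A new mapping overrides any existing one; a null or empty uri
    // removes the existing mapping, if any.
    void setNamespacePrefix(const char* prefix, const char* uri) {
        const std::string key(prefix ? prefix : "");
        if (!uri || !*uri)
            fBindings.erase(key);
        else
            fBindings[key] = uri;
    }

private:
    std::map<std::string, std::string> fBindings;  // prefix -> namespace URI
};
```

A real implementation created from a DOM node would presumably consult these explicit bindings first and fall back to the node's in-scope namespaces, but that lookup order is an assumption and is not settled above.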
Let me know what you think.
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde