Posted to c-users@xerces.apache.org by Sandeep Shahane <sa...@yahoo.com> on 2006/07/25 14:57:24 UTC

some queries related to xerces usage.

Hi,
    I am planning to use the Xerces library for XML
parsing.  I have a few queries related to it before
going ahead:
My application supports different platforms as below
(with minimum versions)
Sun 5.8
HP-UX 11.00
Red Hat Linux 8.0
SUSE Linux 8.0
AIX 5.1
Does Xerces support all of the above OSes at these minimum versions?
What encoding is supported by default? Is there any way to set Xerces to 
use a specific encoding, say UTF-8 or UTF-16?
Is it possible to convert XML strings from one encoding format to another?
While parsing an XML document, can I parse partial data for a given tag? For 
example, if some node value is very large and I want to parse it in chunks, is
that possible? How can I do it?
I have used MSXML for creating XML documents; is there a similar way (APIs) to 
build XML, or do I need to write it out myself using some Xerces APIs internally?

Thanks in advance,
Sandeep





Re: some queries related to xerces usage.

Posted by Alberto Massari <am...@datadirect.com>.
Hi Sandeep,

At 12.57 25/07/2006 +0000, Sandeep Shahane wrote:
>Hi,
>     I am planning to use xerces library for XML
>parsing.  I have few queries related to it before
>going ahead:
>My application supports different platforms as below
>(with minimum versions)
>Sun 5.8
>HP-UX 11.00
>Red Hat Linux 8.0
>SUSE Linux 8.0
>AIX 5.1
>Does xerces supports all above OS with these minimum versions?

Yes; the list of platforms accepted by the runConfigure script is 
'aix', 'beos', 'linux', 'freebsd', 'netbsd', 'solaris', 'hp-10', 
'hp-11', 'openserver', 'unixware', 'os400', 'os390', 'irix', 'ptx', 
'tru64', 'macosx', 'cygwin', 'qnx', 'interix', 'mingw-msys'.
But it also depends on the compiler that you plan to use on those 
platforms. In any case, there is no binary distribution for all of 
them, so you will have to build it from the sources.

>What encoding is supported by default, is there any way to set xerces to be
>used in a specific encoding say UTF-8 or UTF-16?

Xerces uses UTF-16 internally, and has native support for UTF-8, 
UCS-2, UCS-4, ISO-8859-1, Windows-1252, EBCDIC, IBM-1047 and IBM-1140. 
Depending on the platform you are running on, it supports more encodings 
by using either ICU or iconv.

>Is it possible to do XML string conversions from one encoding format to other?

Yes, using UTF-16 as a bridge between them.
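
To illustrate the "bridge" idea, here is a minimal sketch that uses only the C++ standard library, not the Xerces transcoding API. It converts ISO-8859-1 to UTF-8 by decoding into UTF-16 first and encoding from there. The function names (latin1ToUtf16, utf16ToUtf8, latin1ToUtf8) are made up for this example; with Xerces the two steps would be done by its own transcoders.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Step 1: source encoding -> UTF-16.
// ISO-8859-1 is the easy case: every byte is the code point itself.
std::u16string latin1ToUtf16(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (unsigned char byte : in) {
        out.push_back(static_cast<char16_t>(byte));
    }
    return out;
}

// Step 2: UTF-16 -> target encoding (UTF-8 here).
// Assumes well-formed UTF-16; unpaired surrogates trip the assertions.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with the following low surrogate.
            assert(i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF);
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else {
            assert(!(cp >= 0xDC00 && cp <= 0xDFFF));  // stray low surrogate
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// The bridge: neither side needs to know about the other encoding.
std::string latin1ToUtf8(const std::string& in) {
    return utf16ToUtf8(latin1ToUtf16(in));
}
```

The point of going through UTF-16 is that each encoding only needs one decoder and one encoder, rather than a converter for every pair of encodings.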

>During parsing an XML, can I parse a partial data for a given tag say for
>example some node value is very huge and if I want to parse it in 
>chunks, is it
>possible? how can I do it?

Yes, using SAX; if, for example, you have a long text (or CDATA) 
fragment, the characters() callback will be invoked multiple times 
with small chunks of data (less than 50 KB each).
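
As a rough sketch of what that means for your handler: it should process each chunk as it arrives instead of assuming one call per element. The class below is a hypothetical stand-in written with the standard library only; it is not the Xerces handler interface, and char16_t merely stands in for the parser's UTF-16 character type. It keeps a running length and checksum for one element without ever buffering the whole value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical SAX-style handler (illustration only, not a Xerces class).
class ChunkedTextHandler {
public:
    explicit ChunkedTextHandler(std::string target) : target_(std::move(target)) {}

    void startElement(const std::string& name) {
        if (name == target_) {
            inside_ = true;
            total_ = 0;
            chunks_ = 0;
            checksum_ = 0;
        }
    }

    // May be called many times for a single text node; each call carries
    // only a piece of the content.
    void characters(const char16_t* chars, std::size_t length) {
        if (!inside_) return;  // text outside the element of interest
        assert(chars != nullptr || length == 0);
        for (std::size_t i = 0; i < length; ++i) {
            checksum_ = checksum_ * 31u + chars[i];  // consume, do not store
        }
        total_ += length;
        ++chunks_;
    }

    void endElement(const std::string& name) {
        if (name == target_) inside_ = false;
    }

    std::size_t totalLength() const { return total_; }
    std::size_t chunkCount() const { return chunks_; }
    std::uint32_t checksum() const { return checksum_; }

private:
    std::string target_;
    bool inside_ = false;
    std::size_t total_ = 0;
    std::size_t chunks_ = 0;
    std::uint32_t checksum_ = 0;
};
```

Because the state lives in the handler, the result is the same whether the text arrives in one call or in many, which is the property you want when the chunk boundaries are chosen by the parser.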

>I have used MSXML for creating XMLs, is there similiar way (APIs) to 
>build XMLs
>or do I need to write ones using some xerces APIs internally?

There is a COM wrapper that mimics the MSXML API, but it is available 
(being COM-based) only on Windows. You will have to use the Xerces 
APIs in order to build a multiplatform application.

As for the question in the other e-mail (being able to write CDATA 
sections with big contents), you have two options:
1) you create a DOM tree in memory and serialize it using DOMWriter 
-> this means you will have the entire XML structure in memory before 
writing, so it could run out of memory
2) you instantiate an XMLFormatter and manually stream the XML; 
tedious, and prone to coding errors that generate invalid XML, but 
you will need only a minimal quantity of memory in order to do so.
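
One example of the coding errors mentioned in option 2: a CDATA section cannot contain the sequence "]]>", and when you stream in chunks that sequence can straddle a chunk boundary. The sketch below shows one way to handle it, using only the standard library and writing to a std::ostream rather than an XMLFormatter. CDataChunkWriter and wrapInCData are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Streams one CDATA section chunk by chunk (illustration only).
// A "]]>" in the data is split across two sections: "]]" closes the
// first one and ">" starts the next.
class CDataChunkWriter {
public:
    explicit CDataChunkWriter(std::ostream& out) : out_(out) {}

    void begin() {
        assert(!open_ && "section already open");
        out_ << "<![CDATA[";
        open_ = true;
        brackets_ = 0;
    }

    void write(const std::string& chunk) {
        assert(open_ && "begin() must be called before write()");
        for (char c : chunk) {
            if (c == ']') {
                if (brackets_ < 2) ++brackets_;  // remembered across chunks
                out_ << c;
            } else if (c == '>' && brackets_ == 2) {
                // "]]" is already written: close the section, reopen it,
                // then emit the '>' inside the new section.
                out_ << "]]><![CDATA[>";
                brackets_ = 0;
            } else {
                out_ << c;
                brackets_ = 0;
            }
        }
    }

    void end() {
        assert(open_ && "no section open");
        out_ << "]]>";
        open_ = false;
    }

private:
    std::ostream& out_;
    bool open_ = false;
    int brackets_ = 0;  // trailing ']' characters seen so far, capped at 2
};

// Convenience wrapper showing the call sequence.
std::string wrapInCData(const std::vector<std::string>& chunks) {
    std::ostringstream out;
    CDataChunkWriter writer(out);
    writer.begin();
    for (const std::string& chunk : chunks) writer.write(chunk);
    writer.end();
    return out.str();
}
```

Only the bracket counter is kept between calls, so memory use does not grow with the size of the data.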

Hope this helps,
Alberto


>Thanks in advance,
>Sandeep


Re: some queries related to xerces usage.

Posted by Sandeep Shahane <sa...@yahoo.com>.
> During parsing an XML, can I parse a partial data for a given tag say for 
> example some node value is very huge and if I want to parse it in chunks, is it
> possible? how can I do it?

Also, my application needs to write CDATA sections where the data is huge; can I 
write them in chunks?



Re: some queries related to xerces usage.

Posted by 4p...@sneakemail.com.
Hi Sandeep,

As apparently everybody else who'd know a lot better than me is on 
holiday, I (a non-Xerces-C developer) will try to answer a few questions:

Sandeep Shahane sandipshahane-at-yahoo.com |xerces-c-users mailing list| 
wrote:

> Hi,
>     I am planning to use xerces library for XML
> parsing.  I have few queries related to it before
> going ahead:
> My application supports different platforms as below
> (with minimum versions)
> Sun 5.8
> HP-UX 11.00
> Red Hat Linux 8.0
> SUSE Linux 8.0
> AIX 5.1
> Does xerces supports all above OS with these minimum versions?
>   
I would suppose the binary distributions won't run on older OS versions, 
but you can build Xerces yourself:
http://xml.apache.org/xerces-c/build-winunix.html#UNIX
> What encoding is supported by default, is there any way to set xerces to be 
> used in a specific encoding say UTF-8 or UTF-16?
> Is it possible to do XML string conversions from one encoding format to other?
>   
No clue if you can reset the default, but what I've seen so far is that 
Xerces uses wide characters throughout its API, and it has string 
conversion functions (see XMLString::transcode() in the API docs).
> During parsing an XML, can I parse a partial data for a given tag say for 
> example some node value is very huge and if I want to parse it in chunks, is it
> possible? how can I do it?
>   
I've seen API calls for incremental parsing, but I don't know details 
about that... At least for SAX there seems to be support for it (see 
SAXParser::parseFirst() and SAX2XMLReader::parseFirst() etc. in the API 
docs: http://xml.apache.org/xerces-c/api.html)
> I have used MSXML for creating XMLs, is there similiar way (APIs) to build XMLs 
> or do I need to write ones using some xerces APIs internally?
>
>   
I'm not sure I know what you mean here - are you asking if there's a 
wrapper for XercesC to mimic MSXML's API? (If so, I wouldn't know.) 
However, the APIs will be similar in certain cases, as I'd expect MSXML to 
use a DOM for representing document trees and SAX for stream parsing. 
Xerces does that.
> Thanks in advance,
> Sandeep
>   
Hope that helped a bit!

Cheers,

Uwe