Posted to c-users@xerces.apache.org by Dan Ribe <da...@gmail.com> on 2008/12/08 12:21:15 UTC

Xerces fails to parse XML which has extended characters on Mac.

Hello everyone,
I am new to Xerces and am using it in my project to parse an XML file.
Parsing fails whenever the XML contains an extended (non-ASCII) character
in any field, for example å. XML that contains only plain ASCII characters
parses fine.

Can anyone tell me how to fix this problem? Are there any specific project
settings needed to make Xerces work? I am building Xerces with Xcode 3.0.

Thanks a lot in advance for your help. Let me know if you need any other
information from my side.

Cheers!

Re: Xerces fails to parse XML which has extended characters on Mac.

Posted by Dan Ribe <da...@gmail.com>.
Hi Pranav,
I am not getting any exception; the code runs normally. The only issue is
that elems (see the last line of the code snippet below) is an empty DOM
node list.

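// Convert the file name to Xerces's XMLCh string type.
XMLCh* docName = XMLString::transcode(fileName);

// Create a synchronous builder (impl is obtained earlier, not shown).
DOMBuilder * domBuilder = impl->createDOMBuilder(
    DOMImplementationLS::MODE_SYNCHRONOUS, NULL);

DOMDocument * domDoc = domBuilder->parseURI(docName);
XMLString::release(&docName);

// Look up all <names> elements; this list comes back empty.
XMLCh* tagname = XMLString::transcode("names");
DOMNodeList * elems = domDoc->getElementsByTagName(tagname);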


Thanks much,
Cheers!

On Tue, Dec 9, 2008 at 1:41 PM, Pranav Savkur <pr...@gmail.com> wrote:

> What exception are you getting?

Re: Xerces fails to parse XML which has extended characters on Mac.

Posted by Pranav Savkur <pr...@gmail.com>.
What exception are you getting?

On Tue, Dec 9, 2008 at 12:36 PM, Dan Ribe <da...@gmail.com> wrote:

> Hi Boris,
> Thanks for the reply.
>
> I am creating the XML file like :
>
> char fileName[L_tmpnam];
>
> tmpnam(fileName);
>
> FILE * tempFile = fopen(fileName, "w+");
>
> fwrite(xml.c_str(), 1, xml.length(), tempFile);
>
> fclose(tempFile);
>
>
> Where xml is a string which has xml contents. so by default it has Mac OS
> Roman encoding.
>
> I think this problem has to do with encoding only. When I change the
> encoding of file to UTF-8 (by opening and saving in TextEdit on Mac with
> different encoding) Xerces is able to parse it properly and converting the
> string using the UTF-8 encoding solves the problem.
>
> Now I am only looking for a way to create a UTF-8 encoded file
> programmatically. I think I can do that by using Mac file creation APIs.
> Just wondering if there is a way to specify the encoding in the standard
> C/C++ API directly.
>
> Thanks for your help.
> Cheers!

Re: Xerces fails to parse XML which has extended characters on Mac.

Posted by Dan Ribe <da...@gmail.com>.
Hi Boris,
Thanks for the reply.

I am creating the XML file like this:

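// Generate a temporary file name.
char fileName[L_tmpnam];
tmpnam(fileName);

// Write the string's bytes as-is; no encoding conversion happens here.
FILE * tempFile = fopen(fileName, "w+");
if (tempFile != NULL) {
    fwrite(xml.c_str(), 1, xml.length(), tempFile);
    fclose(tempFile);
}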

Here xml is a string holding the XML contents, so by default the file ends
up in Mac OS Roman encoding.

I think the problem is purely an encoding issue. When I change the file's
encoding to UTF-8 (by opening it in TextEdit on the Mac and saving it with
a different encoding), Xerces parses it properly, so converting the string
to UTF-8 solves the problem.
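
To illustrate why the bytes matter: an XML document with no encoding
declaration (or one that declares UTF-8) must contain valid UTF-8, and a
lone Mac OS Roman byte such as 0x8C (å) is not a valid UTF-8 sequence. The
following is only a rough sketch of a structural check; isValidUtf8 is a
hypothetical helper name, not part of Xerces, and it does not reject every
ill-formed case (overlong forms and surrogates are not checked).

```cpp
#include <cstddef>
#include <string>

// Simplified structural UTF-8 check (hypothetical helper, not a Xerces API).
// It verifies lead bytes and continuation bytes only.
bool isValidUtf8(const std::string& bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) {
            extra = 0;                        // plain ASCII
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            extra = 1;                        // 2-byte sequence
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            extra = 2;                        // 3-byte sequence
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            extra = 3;                        // 4-byte sequence
        } else {
            return false;                     // stray continuation or bad lead
        }
        if (n - i - 1 < extra) {
            return false;                     // sequence truncated at the end
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) {
                return false;                 // not a continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}
```

With this check, the single byte "\x8C" is reported as invalid, while the
UTF-8 form of å, "\xC3\xA5", passes.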

Now I am only looking for a way to create a UTF-8 encoded file
programmatically. I think I can do that with the Mac file creation APIs; I
am just wondering whether there is a way to specify the encoding directly
in the standard C/C++ API.
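
One possible route, sketched here under assumptions: standard C/C++ file
I/O writes bytes unchanged, so the conversion has to happen on the string
before fwrite. The POSIX iconv API can do that. convertToUtf8 is a
hypothetical helper name, the encoding name passed in must be one the local
iconv implementation recognizes, and some iconv headers declare the input
pointer as const char** rather than char**, so the cast may need adjusting.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert 'input' from 'fromEncoding' to UTF-8 using POSIX iconv.
// Hypothetical helper for illustration; throws on failure.
std::string convertToUtf8(const std::string& input, const char* fromEncoding)
{
    iconv_t cd = iconv_open("UTF-8", fromEncoding);
    if (cd == (iconv_t)-1) {
        throw std::runtime_error("iconv_open failed: unsupported encoding");
    }

    std::string output;
    char buffer[256];
    // iconv does not modify the input bytes; the cast only satisfies the
    // char** signature.
    char* in = const_cast<char*>(input.data());
    std::size_t inLeft = input.size();

    while (inLeft > 0) {
        char* out = buffer;
        std::size_t outLeft = sizeof buffer;
        const std::size_t rc = iconv(cd, &in, &inLeft, &out, &outLeft);
        const int err = (rc == (std::size_t)-1) ? errno : 0;

        // Keep whatever was converted in this round.
        output.append(buffer, sizeof buffer - outLeft);

        // E2BIG only means the scratch buffer filled up; loop again.
        if (err != 0 && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed: input cannot be converted");
        }
    }

    iconv_close(cd);
    return output;
}
```

The result can then be written with the same fopen/fwrite calls as before.
For Mac OS Roman input, the source encoding name is typically something
like "MACINTOSH", but the exact spelling depends on the iconv implementation
and should be verified on the target system.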

Thanks for your help.
Cheers!

On Mon, Dec 8, 2008 at 7:01 PM, Boris Kolpackov <bo...@codesynthesis.com> wrote:

> Hi Dan,
>
> Dan Ribe <da...@gmail.com> writes:
>
> > let me know if you need any other information from my side on this.
>
> Knowing the actual error that you get as well as the encoding
> specified in the XML document you are trying to parse would be
> helpful.
>
> Boris

Re: Xerces fails to parse XML which has extended characters on Mac.

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi Dan,

Dan Ribe <da...@gmail.com> writes:

> let me know if you need any other information from my side on this.

Knowing the actual error that you get as well as the encoding
specified in the XML document you are trying to parse would be
helpful.

Boris

-- 
Boris Kolpackov, Code Synthesis Tools   http://codesynthesis.com/~boris/blog
Open source XML data binding for C++:   http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde