Posted to c-dev@xerces.apache.org by Andy Heninger <an...@jtcsv.com> on 2000/12/02 02:40:24 UTC

Re: Multilanguage filename

This is weird.  I believe you when you say that your change fixed the
problem, but I don't understand why it would make a difference.  The
backslash vs. yen conflict has been the source of troubles before, as you
can probably guess by the existence of this code at all.  (Both share the
same code, 0x5C, in Shift-JIS.)
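
For reference, the substitution in the quoted patch can be sketched as a
standalone function. This is only an illustration under stated assumptions:
it uses std::u16string in place of XMLCh buffers, the constant names are
stand-ins for chYenSign, chWonSign, chBackSlash and chForwardSlash with
their Unicode code points filled in by hand, and normalizeSeparators is a
hypothetical helper name, not something in the Xerces sources.

```cpp
#include <string>

// Stand-ins for the Xerces-C constants used in the quoted patch. The values
// are the Unicode code points, assumed here rather than copied from the
// Xerces headers.
constexpr char16_t kYenSign      = 0x00A5;
constexpr char16_t kWonSign      = 0x20A9;
constexpr char16_t kBackSlash    = 0x005C;
constexpr char16_t kForwardSlash = 0x002F;

// Hypothetical helper: returns a copy of fileName in which every yen sign,
// won sign, or backslash has been replaced by a forward slash.
std::u16string normalizeSeparators(const std::u16string& fileName)
{
    std::u16string result(fileName);
    for (char16_t& ch : result)
    {
        if (ch == kYenSign || ch == kWonSign || ch == kBackSlash)
            ch = kForwardSlash;
    }
    return result;
}
```

Unlike the patch, this always copies the name; the original scans first and
only replicates the string when a character actually needs replacing.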

Andy Heninger
IBM XML Technology Group, Cupertino, CA
heninger@us.ibm.com


----- Original Message -----
From: <kb...@informatica.com>
To: <xe...@xml.apache.org>
Sent: Wednesday, November 29, 2000 4:44 PM
Subject: RE: Multilanguage filename


> I added the backSlash code also to the file name conversion logic. So it
> looks like
>
>     XMLCh *tmpUName = 0;
>     const XMLCh *nameToOpen = fileName;
>
>     const XMLCh* srcPtr = fileName;
>     while (*srcPtr)
>     {
>         if (*srcPtr == chYenSign ||
>             *srcPtr == chWonSign ||
>             *srcPtr == chBackSlash) //ADDED
>             break;
>         srcPtr++;
>     }
>
>     //
>     //  If we found a yen, then we have to create a temp file name. Else
>     //  go with the file name as is and save the overhead.
>     //
>     if (*srcPtr)
>     {
>         tmpUName = XMLString::replicate(fileName);
>
>         XMLCh* tmpPtr = tmpUName;
>         while (*tmpPtr)
>         {
>             if (*tmpPtr == chYenSign ||
>                 *tmpPtr == chWonSign ||
>                 *tmpPtr == chBackSlash) //ADDED
>                 *tmpPtr = chForwardSlash;
>             tmpPtr++;
>         }
>         nameToOpen = tmpUName;
>     }
>
>
> Now it seems to work fine.
>
>
> Kiran
>
>
> -----Original Message-----
> From: Dean Roddey [mailto:droddey@charmedquark.com]
> Sent: Wednesday, November 29, 2000 1:55 PM
> To: xerces-c-dev@xml.apache.org
> Subject: Re: Multilanguage filename
>
>
> If you are using Win98, then that Unicode will be transcoded back to the LCP
> before being passed to the system, but it would remain untouched until it
> got there. So there aren't any other places you have to look to see where
> it's getting changed. It should get to that point exactly as you passed it
> in, then get transcoded if not on NT.
>
> --------------------------
> Dean Roddey
> The CIDLib C++ Frameworks
> Charmed Quark Software
> droddey@charmedquark.com
> http://www.charmedquark.com
>
> "It takes two buttocks to make friction"
>     - African Proverb
>
>
> ----- Original Message -----
> From: <kb...@informatica.com>
> To: <xe...@xml.apache.org>
> Cc: <dr...@portal.com>
> Sent: Wednesday, November 29, 2000 12:54 PM
> Subject: RE: Multilanguage filename
>
>
> > The system I am using is a Shift-JIS Japanese Win 98 machine. I am passing
> > the file name in Unicode which seems to be ok. I did have a look at
> > Win32PlatformUtils::openFile and I am trying to debug this problem. It looks
> > as if the file name got changed when the transcoding happened.
> > Any hints will be of great help.
> >
> > Kiran
> >
> > -----Original Message-----
> > From: Dean Roddey [mailto:droddey@portal.com]
> > Sent: Wednesday, November 29, 2000 12:04 PM
> > To: 'xerces-c-dev@xml.apache.org'
> > Subject: RE: Multilanguage filename
> >
> >
> > Two possible issues.
> >
> > 1. What is your local code page? If it's not an encoding in which Japanese
> > can be dealt with, then you'll have to transcode the name to Unicode and
> > pass that as the file name. Any "" type strings you pass to the parser are
> > assumed to be in the local code page.
> >
> > 2. Are you on NT and using a version before the most recent one? Some older
> > versions had problems because Shift_JIS is not a very good encoding due to
> > its use of the slash and the yen sign for the same thing. On previous
> > versions, the slash would get converted to a Yen sign in Unicode. On NT,
> > since we pass Unicode straight to the system API, it never got transcoded
> > back and so we passed a literal Yen sign in the file name which caused it to
> > fail.
> >
> > I believe that we put in a hack to deal with this problem by just converting
> > all Unicode yen signs to slashes in the Win32 platform utils file open
> > support.
> >
> > --------------
> > Dean Roddey
> > Software Geek Extraordinaire
> > Portal, Inc
> > droddey@portal.com
> >
> >
> >
> > -----Original Message-----
> > From: kbagepalli@informatica.com [mailto:kbagepalli@informatica.com]
> > Sent: Wednesday, November 29, 2000 11:09 AM
> > To: xerces-c-dev@xml.apache.org
> > Subject: Multilanguage filename
> >
> >
> > We have a problem with parsing Japanese XML documents. The problem is with
> > the file name, not the data itself!
> > If the file name is in English (with the data in Japanese - we are using
> > icu) it parses fine. However, if the file name is in Japanese it says that
> > the primary document could not be opened.
> >
> > Any clues on what is going wrong?
> >
> > Kiran
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
> > For additional commands, e-mail: xerces-c-dev-help@xml.apache.org