Posted to c-users@xalan.apache.org by "Caputo, Steve" <St...@cognos.com> on 2003/03/13 22:03:17 UTC

Maximum Multi-byte string (Japanese) Length

Hi xAlan Experts,

Version of Apache xAlan used:  1.2 
We have a xFunction defined (i.e. named JavaScriptEncode) with the
interface:

XObjectPtr  FunctionJavascriptEncode::execute(
	XPathExecutionContext&			executionContext,
	XalanNode*						context,
	int								/* opPos */,
	const XObjectArgVectorType&		args)
{
   // expect one argument.
   XalanDOMString theString = args[0]->str();
  .
  .
  .
}

The problem is that on Japanese UNIX (tested on HP and Solaris), we lose the
message string passed through to args, and as a result, our generated web
page displays no string.

The problem does not occur on Japanese windows, or using ascii strings on
English UNIX.

I figured out that the problem also only occurs on Japanese UNIX when the
size of the string passed (that is, a string x.c_str()) equals or exceeds 31
multi byte characters!

Is this a limitation to XObjectArgVectorType?  Any ideas to solve this
problem?  I'm new to XSLT and could not find anything useful in the xAlan
class documentation.

Thanks for any help.

Steve Caputo
Software Engineer
Cognos, Inc.



This message may contain privileged and/or confidential information.  If you
have received this e-mail in error or are not the intended recipient, you
may not use, copy, disseminate or distribute it; do not open any
attachments, delete it immediately from your system and notify the sender
promptly by e-mail that you have done so.  Thank you.

Re: Maximum Multi-byte string (Japanese) Length

Posted by David N Bertoni/Cambridge/IBM <da...@us.ibm.com>.



Hi Steve,

This is likely a problem with the local code page conversion, although I'm
just guessing because you didn't really provide enough information to
figure out what's going wrong.

Something I don't understand:

> I figured out that the problem also only occurs on Japanese UNIX when the
> size of the string passed (that is, a string x.c_str()) equals or exceeds
> 31 multi byte characters!

This makes no sense.  Xalan operates only on UTF-16 strings, so I don't
understand what you mean when you say "31 multi-byte characters."  Do you
mean 31 UTF-16 code points?
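
To make the distinction concrete, here is a minimal sketch showing that the same text has a different length depending on whether you count bytes of a multi-byte encoding or UTF-16 code units. It uses UTF-8 purely as a stand-in for whatever multi-byte encoding the Japanese locale uses (the thread does not say which), and utf16CodeUnitCount is a hypothetical helper, not part of Xalan.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xalan API): counts the UTF-16 code units
// needed to represent a UTF-8 byte sequence.
// Assumption: the input is well-formed UTF-8; no validation is done.
std::size_t utf16CodeUnitCount(const std::string& utf8)
{
    std::size_t units = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i)
    {
        const unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) == 0x80)
            continue;  // continuation byte, part of the previous character
        // A lead byte of 0xF0 or above starts a 4-byte sequence, i.e. a
        // character outside the BMP, which takes a surrogate pair in UTF-16.
        units += (byte >= 0xF0) ? 2 : 1;
    }
    // Sanity check that holds for well-formed input: UTF-16 never needs
    // more code units than UTF-8 needs bytes.
    assert(units <= utf8.size());
    return units;
}
```

For example, the three characters of "Japanese" written in Japanese occupy 9 bytes in UTF-8 but only 3 UTF-16 code units, so a limit of "31" means something quite different in each view.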

Also, what is the source of this string?  Is it in a source document?  If
so, then Xalan gets the strings already transcoded to UTF-16 by the parser,
so the encoding of the instance document is irrelevant.  Is it hard-coded
in your executable?  That's guaranteed to be a problem.  In general, you
should be suspicious of embedded string constants and character constants
because they may not behave as you would expect.  Also, parameters passed
in from the command line, or coming from a non-Unicode source may also cause
problems. Windows supports Unicode intrinsically, while HP and Solaris do
not, so it's very likely things will work on Windows, but not on HP or
Solaris.
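
One way to check whether the string is already damaged before any local code page conversion takes place is to log its UTF-16 code units as plain ASCII hex. The sketch below assumes the text is available as a std::u16string, which is only a stand-in for Xalan's own string type, and hexDumpUtf16 is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical diagnostic helper (not a Xalan API): renders each UTF-16
// code unit as four uppercase hex digits, separated by spaces. The output
// is pure ASCII, so it does not depend on the local code page.
std::string hexDumpUtf16(const std::u16string& text)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const unsigned unit = text[i];
        if (i != 0)
            out += ' ';
        out += digits[(unit >> 12) & 0xF];
        out += digits[(unit >> 8) & 0xF];
        out += digits[(unit >> 4) & 0xF];
        out += digits[unit & 0xF];
    }
    // Four digits per code unit plus one separator between neighbours.
    assert(out.size() == (text.empty() ? 0 : text.size() * 5 - 1));
    return out;
}
```

If the dump taken inside the extension function already shows missing or unexpected code units, the string was corrupted before it reached the function. If the dump looks right, the loss is more likely in a later conversion to the local encoding.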

It would also help to see the implementation of your function.  Or, if you
can put together a minimal sample program and minimal inputs, we can try to
debug it.  However, we don't have access to a Japanese HP or Solaris box,
so we may not be able to reproduce the problem.

Dave
