Posted to c-dev@xerces.apache.org by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org> on 2012/10/15 20:04:04 UTC

[jira] [Created] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Ratnesh Nath created XERCESC-2002:
-------------------------------------

             Summary: XMLString::transcode for multi-byte characters are returning null on Linux 
                 Key: XERCESC-2002
                 URL: https://issues.apache.org/jira/browse/XERCESC-2002
             Project: Xerces-C++
          Issue Type: Bug
          Components: Miscellaneous
    Affects Versions: 2.7.0
            Reporter: Ratnesh Nath
            Priority: Critical


XMLCh *tag1 = XMLString::transcode("test-in-english");
Printing CsEws::XmlChToString(tag1).c_str() outputs: test-in-english

XMLCh *tag2 = XMLString::transcode("ã«ã¯");
Printing CsEws::XmlChToString(tag2).c_str() outputs: NULL

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479195#comment-13479195 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------

As I understand it, the database manager checks the following items, in the order presented, to determine the active code page:

    - LC_ALL
    - LC_CTYPE
    - LANG

Is there any way to determine which code page is currently in use on a Linux system?
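
One way to answer this from inside a program, offered as a minimal sketch rather than anything Xerces-specific: POSIX exposes the encoding of the active locale through nl_langinfo(CODESET). A C or C++ program starts in the "C" locale until it calls setlocale, so the sketch first adopts the environment's settings (LC_ALL, then LC_CTYPE, then LANG). The function name activeCodeset is hypothetical.

```cpp
#include <clocale>
#include <langinfo.h>
#include <string>

// Returns the character encoding (codeset) of the locale chosen by the
// environment, for example "UTF-8".
// Side effect: switches the process-wide LC_CTYPE category to the
// environment's locale, because a program otherwise starts in the "C" locale.
std::string activeCodeset() {
    std::setlocale(LC_CTYPE, "");  // "" means: take the value from the environment
    return nl_langinfo(CODESET);   // POSIX query for the codeset name
}
```

From a shell, `locale charmap` reports the same value. On a system where LANG=en_US.UTF-8 is in effect and that locale is installed, the expected result is UTF-8. What Xerces itself treats as the default encoding can still differ, since that depends on the transcoder selected when Xerces was built.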
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479212#comment-13479212 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------

Regarding a): are you saying that I should pass "\x80\ ã«ã¯ \x80" as the input to XMLString::transcode?
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Alberto Massari (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480227#comment-13480227 ] 

Alberto Massari commented on XERCESC-2002:
------------------------------------------

I am still waiting for an answer to the "If that fails, check which transcoder has been selected when building Xerces, because that changes what is considered the default locale" part of my comment.
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479287#comment-13479287 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------


=====
Code:
=====
CCsString abc = "text-in-english";
diag(RESPONSE_XML, "In english, before transcode: %s", abc.GetString());
XMLCh *tag1 = XMLString::transcode("text-in-english");
diag(RESPONSE_XML, "In english,after transcode: %s", (char*)tag1);

CCsString abc1 = "\xC3\xA3\xC2\xAB\xC3\xA3\xC2\xAF";
diag(RESPONSE_XML, "MultiByte Char, before transcode is: %s", abc1.GetString());
XMLCh *tag2 = XMLString::transcode("\xC3\xA3\xC2\xAB\xC3\xA3\xC2\xAF");
diag(RESPONSE_XML, "MultiByte Char, after transcode: %s", (char*)tag2);

=======
Output:
=======

In english, before transcode: text-in-english
In english,after transcode: t
MultiByte Char, before transcode is: ãëãï
MultiByte Char, after transcode: 
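
A note on the output above, as a hedged sketch rather than anything taken from the Xerces sources: XMLCh is a 16-bit UTF-16 code unit, so casting an XMLCh* to char* and printing it with %s reads the buffer byte by byte and stops at the first zero byte. On a little-endian machine the second byte of any ASCII character is zero, which would explain the lone "t". In the code below char16_t stands in for XMLCh, and both helper names are hypothetical.

```cpp
#include <cstdio>
#include <string>

// char16_t stands in for Xerces' XMLCh here (an assumption for illustration).

// Reproduces the effect of printing (char*)tag1 with %s: the bytes of the
// UTF-16 buffer are read as a NUL-terminated narrow string.
std::string misreadAsCString(const char16_t* text) {
    return std::string(reinterpret_cast<const char*>(text));
}

// Formats each 16-bit code unit as four hex digits, separated by spaces,
// so the transcoded data can be inspected without any further conversion.
std::string dumpCodeUnits(const std::u16string& text) {
    std::string out;
    char buf[8];
    for (char16_t unit : text) {
        std::snprintf(buf, sizeof buf, "%04X", static_cast<unsigned>(unit));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For example, dumpCodeUnits(u"te") yields "0074 0065", while misreadAsCString(u"text-in-english") yields "t" on a little-endian host. Dumping the code units this way shows what the transcoder actually produced, independently of how the result is later converted for display.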

                
[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479231#comment-13479231 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------

OK, got it.

Thanks, Alberto, for helping me out on this issue.

I will get back to you shortly. Thank you!
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479185#comment-13479185 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------

My current locale settings are as follows:
=========================
[root@ucbu-aricent-srv19 default]# locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
========================

Do I need to change anything in them?
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480158#comment-13480158 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------

Alberto, did you get a chance to look at this?
                

[jira] [Closed] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Alberto Massari (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alberto Massari closed XERCESC-2002.
------------------------------------

    Resolution: Invalid

XMLString::transcode converts to/from the current locale. It's up to the user to guarantee that the char* specified as argument is a valid string for the current locale. I guess your locale is not UTF-8, and so the conversion fails. 
You should create a transcoder for the encoding of the input string, and use XMLString::transcode only for data coming from stdin or being printed to stdout.
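
To illustrate the idea of converting from a named encoding instead of the current locale, here is a minimal sketch that uses only the C++ standard library. It is not Xerces code: in Xerces this role would be played by a transcoder object created for the input's encoding. The function name utf8ToUtf16 is hypothetical, and char16_t is used in place of XMLCh.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes UTF-8 bytes into UTF-16 code units without consulting the process
// locale. Throws std::invalid_argument if the input is not valid UTF-8.
std::u16string utf8ToUtf16(const std::string& bytes) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const char32_t kMinimum[4] = {0x0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t codePoint = 0;
        std::size_t extra = 0;  // continuation bytes that must follow the lead byte
        if (lead < 0x80)                { codePoint = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { codePoint = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { codePoint = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { codePoint = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= bytes.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            codePoint = (codePoint << 6) | (next & 0x3F);
        }
        if (codePoint < kMinimum[extra] || codePoint > 0x10FFFF ||
            (codePoint >= 0xD800 && codePoint <= 0xDFFF))
            throw std::invalid_argument("overlong or out-of-range UTF-8 sequence");
        if (codePoint < 0x10000) {
            out.push_back(static_cast<char16_t>(codePoint));
        } else {
            // Code points above the BMP become a surrogate pair in UTF-16.
            const char32_t offset = codePoint - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF)));
        }
        i += extra + 1;
    }
    return out;
}
```

With the byte string from this issue, utf8ToUtf16("\xC3\xA3\xC2\xAB\xC3\xA3\xC2\xAF") produces the four code units 00E3 00AB 00E3 00AF whatever LANG or LC_ALL say. Malformed input is reported as an error instead of silently yielding an empty result.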
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Alberto Massari (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479199#comment-13479199 ] 

Alberto Massari commented on XERCESC-2002:
------------------------------------------

Try a) writing the string to transcode using escape characters ("\x80") and b) printing the individual XMLCh characters instead of going through CsEws::XmlChToString, so that we can rule out other possibilities. If that fails, check which transcoder was selected when building Xerces, because that changes what is considered the default locale.
                

[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Ratnesh Nath (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480237#comment-13480237 ] 

Ratnesh Nath commented on XERCESC-2002:
---------------------------------------

So does it make sense if I replace the Xerces build I am using with the original Xerces (downloaded from the official Xerces site) and run this test again?

                
[jira] [Commented] (XERCESC-2002) XMLString::transcode for multi-byte characters are returning null on Linux

Posted by "Alberto Massari (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479224#comment-13479224 ] 

Alberto Massari commented on XERCESC-2002:
------------------------------------------

No, I am saying not to use non-ASCII characters in a C++ source file, as you never know how they are read by the compiler; in your case, it should be something like \xC3\xA3\xC2\xAB\xC3\xA3\xC2\xAF (assuming you want to transcode the string ã«ã¯).
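
As a sketch of how such an escaped literal could be produced from raw bytes (the helper name toAsciiLiteral is hypothetical and not part of Xerces), the function below rewrites every byte that is not plain printable ASCII as a \xNN escape. Hex escapes in C++ consume every hex digit that follows, so a hex digit directly after an escape is escaped as well.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Renders a byte string as the body of a C++ string literal that contains
// only ASCII characters, so the compiler's source encoding no longer matters.
std::string toAsciiLiteral(const std::string& bytes) {
    std::string out;
    bool previousWasHexEscape = false;
    for (unsigned char byte : bytes) {
        bool printable = byte >= 0x20 && byte < 0x7F && byte != '"' && byte != '\\';
        // A hex digit right after "\xNN" would be absorbed into that escape,
        // so it has to be written as an escape too.
        if (printable && previousWasHexEscape && std::isxdigit(byte)) {
            printable = false;
        }
        if (printable) {
            out += static_cast<char>(byte);
        } else {
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\x%02X", static_cast<unsigned>(byte));
            out += buf;
        }
        previousWasHexEscape = !printable;
    }
    return out;
}
```

Feeding it the UTF-8 bytes of the test string gives back exactly the escaped form quoted above, and plain ASCII such as "abc" passes through unchanged.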
                