Posted to dev@tuscany.apache.org by Simon Laws <si...@googlemail.com> on 2010/11/17 16:54:38 UTC
Multi-byte character support?
Anyone know if there is any support for or any Tuscany tests for
multi-byte character set support in any of the bindings/databindings?
Simon
--
Apache Tuscany committer: tuscany.apache.org
Co-author of a book about Tuscany and SCA: tuscanyinaction.com
Re: Multi-byte character support?
Posted by Simon Laws <si...@googlemail.com>.
On Wed, Nov 17, 2010 at 5:58 PM, Simon Laws <si...@googlemail.com> wrote:
> On Wed, Nov 17, 2010 at 5:31 PM, Raymond Feng <cy...@gmail.com> wrote:
>> Hi,
>> Java Strings are Unicode (UTF-16 internally). The tricky part is when we
>> create Strings from byte[] and vice versa (sometimes through streaming
>> APIs). We need to make sure we use the correct encoding, such as UTF-8,
>> instead of the default one, which is platform dependent.
>> Thanks,
>> Raymond
>> ________________________________________________________________
>> Raymond Feng
>> rfeng@apache.org
>> Apache Tuscany PMC member and committer: tuscany.apache.org
>> Co-author of Tuscany SCA In Action book: www.tuscanyinaction.com
>> Personal Web Site: www.enjoyjava.com
>> ________________________________________________________________
>> On Nov 17, 2010, at 7:54 AM, Simon Laws wrote:
>>
>> Anyone know if there is any support for or any Tuscany tests for
>> multi-byte character set support in any of the bindings/databindings?
>>
>> Simon
>>
>> --
>> Apache Tuscany committer: tuscany.apache.org
>> Co-author of a book about Tuscany and SCA: tuscanyinaction.com
>>
>>
> Right, there is some questionable code in some places. E.g.
>
> public class String2OMElement extends BaseTransformer<String, OMElement>
>         implements PullTransformer<String, OMElement> {
>
>     @SuppressWarnings("unchecked")
>     public OMElement transform(String source, TransformationContext context) {
>         try {
>             StAXOMBuilder builder =
>                 new StAXOMBuilder(new ByteArrayInputStream(source.getBytes()));
>             OMElement element = builder.getDocumentElement();
>             AxiomHelper.adjustElementName(context, element);
>             return element;
>         } catch (Exception e) {
>             throw new TransformationException(e);
>         }
>     }
>
> The problem is that it calls source.getBytes() with no encoding, so the
> platform default charset is used. I'm assuming that we don't test with
> various encodings to find issues like this, but I wanted to check.
>
> Simon
>
>
> --
> Apache Tuscany committer: tuscany.apache.org
> Co-author of a book about Tuscany and SCA: tuscanyinaction.com
>
I raised TUSCANY-3790 to track this.
Simon
--
Apache Tuscany committer: tuscany.apache.org
Co-author of a book about Tuscany and SCA: tuscanyinaction.com
Re: Multi-byte character support?
Posted by Simon Laws <si...@googlemail.com>.
On Wed, Nov 17, 2010 at 5:31 PM, Raymond Feng <cy...@gmail.com> wrote:
> Hi,
> Java Strings are Unicode (UTF-16 internally). The tricky part is when we
> create Strings from byte[] and vice versa (sometimes through streaming
> APIs). We need to make sure we use the correct encoding, such as UTF-8,
> instead of the default one, which is platform dependent.
> Thanks,
> Raymond
> ________________________________________________________________
> Raymond Feng
> rfeng@apache.org
> Apache Tuscany PMC member and committer: tuscany.apache.org
> Co-author of Tuscany SCA In Action book: www.tuscanyinaction.com
> Personal Web Site: www.enjoyjava.com
> ________________________________________________________________
> On Nov 17, 2010, at 7:54 AM, Simon Laws wrote:
>
> Anyone know if there is any support for or any Tuscany tests for
> multi-byte character set support in any of the bindings/databindings?
>
> Simon
>
> --
> Apache Tuscany committer: tuscany.apache.org
> Co-author of a book about Tuscany and SCA: tuscanyinaction.com
>
>
Right, there is some questionable code in some places. E.g.
public class String2OMElement extends BaseTransformer<String, OMElement>
        implements PullTransformer<String, OMElement> {

    @SuppressWarnings("unchecked")
    public OMElement transform(String source, TransformationContext context) {
        try {
            StAXOMBuilder builder =
                new StAXOMBuilder(new ByteArrayInputStream(source.getBytes()));
            OMElement element = builder.getDocumentElement();
            AxiomHelper.adjustElementName(context, element);
            return element;
        } catch (Exception e) {
            throw new TransformationException(e);
        }
    }
The problem is that it calls source.getBytes() with no encoding, so the
platform default charset is used. I'm assuming that we don't test with
various encodings to find issues like this, but I wanted to check.
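
As a rough illustration of how the getBytes() call above could be made
independent of the platform default, here is a minimal sketch using only the
JDK's StAX API. It is not the Tuscany fix: the class XmlStringParsing and its
method names are hypothetical, and it reads the element text rather than
building an Axiom OMElement. Option 1 parses from a Reader so that no byte
encoding is involved; option 2 keeps the byte stream but names UTF-8 on both
sides. I believe StAXOMBuilder can also be constructed from an
XMLStreamReader, but that should be checked against the Axiom version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only.
final class XmlStringParsing {

    // Option 1: parse characters directly, so no byte encoding is involved.
    static String textViaReader(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            return firstElementText(reader);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Option 2: if a byte stream is needed, name the charset when encoding
    // the String and again when handing the bytes to the parser.
    static String textViaUtf8Bytes(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(bytes), "UTF-8");
            return firstElementText(reader);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the text content of the first element in the document.
    private static String firstElementText(XMLStreamReader reader)
            throws XMLStreamException {
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getElementText();
                }
            }
            throw new XMLStreamException("no element found");
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) {
        String xml = "<greeting>\u3053\u3093\u306b\u3061\u306f</greeting>";
        System.out.println(textViaReader(xml).equals(textViaUtf8Bytes(xml)));
    }
}
```

Either variant gives the same result regardless of the platform's file.encoding
setting, which is what a test with multi-byte input would want to verify.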
Simon
--
Apache Tuscany committer: tuscany.apache.org
Co-author of a book about Tuscany and SCA: tuscanyinaction.com
Re: Multi-byte character support?
Posted by Raymond Feng <cy...@gmail.com>.
Hi,
Java Strings are Unicode (UTF-16 internally). The tricky part is when we create Strings from byte[] and vice versa (sometimes through streaming APIs). We need to make sure we use the correct encoding, such as UTF-8, instead of the default one, which is platform dependent.
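
To make the point above concrete, here is a minimal sketch of a String to
byte[] round trip. The class EncodingRoundTrip and its method names are
hypothetical, and it uses StandardCharsets, which assumes Java 7 or later.
It shows that naming the same charset on both sides preserves multi-byte
text, while decoding with a different charset does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class EncodingRoundTrip {

    // Encode and decode with the same explicit charset.
    static String roundTripUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode as UTF-8 but decode as ISO-8859-1, simulating two sides that
    // disagree about the encoding (as can happen with platform defaults).
    static String mismatched(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u65e5\u672c\u8a9e"; // three CJK characters
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("same charset preserved text: "
                + roundTripUtf8(text).equals(text));
        System.out.println("mismatched charset preserved text: "
                + mismatched(text).equals(text));
    }
}
```

The no-argument getBytes() and new String(byte[]) behave like the mismatched
case whenever the two ends run with different default charsets.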
Thanks,
Raymond
________________________________________________________________
Raymond Feng
rfeng@apache.org
Apache Tuscany PMC member and committer: tuscany.apache.org
Co-author of Tuscany SCA In Action book: www.tuscanyinaction.com
Personal Web Site: www.enjoyjava.com
________________________________________________________________
On Nov 17, 2010, at 7:54 AM, Simon Laws wrote:
> Anyone know if there is any support for or any Tuscany tests for
> multi-byte character set support in any of the bindings/databindings?
>
> Simon
>
> --
> Apache Tuscany committer: tuscany.apache.org
> Co-author of a book about Tuscany and SCA: tuscanyinaction.com