Posted to dev@directory.apache.org by "Klemen Zagar (JIRA)" <ji...@apache.org> on 2007/07/25 19:59:31 UTC

[jira] Created: (DIRSERVER-1009) ASN.1 decoder does not use UTF-8 encoding for strings

ASN.1 decoder does not use UTF-8 encoding for strings
-----------------------------------------------------

                 Key: DIRSERVER-1009
                 URL: https://issues.apache.org/jira/browse/DIRSERVER-1009
             Project: Directory ApacheDS
          Issue Type: Bug
          Components: asn1
    Affects Versions: 1.5.0
         Environment: JDK 1.6, WinXP
            Reporter: Klemen Zagar


Currently, the LDAP ASN.1 decoder converts byte[] to String via calls such as:

  String any = new String( tlv.getValue().getData() );
  (source: /ldap/src/main/java/org/apache/directory/shared/ldap/codec/actions/StoreAnyAction.java, line 74)

This uses the platform's default charset, but according to the LDAP RFC, strings are always encoded as UTF-8 (http://www.rfc-editor.org/rfc/rfc2251.txt, section 4.1.2).

I recommend that the following code be used for converting byte[] to String:

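  // requires java.nio.ByteBuffer and java.nio.charset.Charset
  // static constant holding the UTF-8 charset
  private static final Charset UTF8 = Charset.forName("UTF-8");

  // wherever byte[]->String conversion is needed
  // (Charset.decode takes a ByteBuffer, so the byte[] is wrapped first)
  String any = UTF8.decode(ByteBuffer.wrap(tlv.getValue().getData())).toString();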

From the user's perspective, the problem is that non-ASCII characters in search filters are garbled, so the search doesn't return any results.
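
To illustrate the mismatch, here is a minimal, self-contained sketch. It is not the ApacheDS code: the class and method names are made up for illustration, and ISO-8859-1 stands in for a typical Western-European Windows default charset, which is only an assumption about the environment. It decodes the same UTF-8 bytes both ways and shows that the wrongly decoded value no longer matches the intended string.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of ApacheDS.
class Utf8DecodeSketch {
    // UTF-8 is guaranteed to be available on every Java platform.
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Decodes the bytes as UTF-8, as the LDAP RFC expects.
    static String decodeUtf8(byte[] data) {
        return UTF8.decode(ByteBuffer.wrap(data)).toString();
    }

    // Decodes the bytes with an explicitly named charset, to simulate
    // what new String(byte[]) does under a non-UTF-8 platform default.
    static String decodeWith(byte[] data, String charsetName) {
        return Charset.forName(charsetName).decode(ByteBuffer.wrap(data)).toString();
    }

    public static void main(String[] args) {
        // "cn=" followed by U+00E9 (e with acute accent), which UTF-8
        // encodes as the two bytes 0xC3 0xA9.
        byte[] wire = { 'c', 'n', '=', (byte) 0xC3, (byte) 0xA9 };

        String correct = decodeUtf8(wire);
        String garbled = decodeWith(wire, "ISO-8859-1");

        System.out.println(correct.length());              // 4
        System.out.println(garbled.length());              // 5
        System.out.println(correct.equals("cn=\u00E9"));   // true
        System.out.println(garbled.equals("cn=\u00E9"));   // false
    }
}
```

With the single-byte charset, the two-byte sequence becomes two separate characters, so a filter value decoded this way cannot match the stored entry.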

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DIRSERVER-1009) ASN.1 decoder does not use UTF-8 encoding for strings

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSERVER-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515451 ] 

Emmanuel Lecharny commented on DIRSERVER-1009:
----------------------------------------------

Fixed in 1.0-trunk:
http://svn.apache.org/viewvc?view=rev&rev=559645

I still have to fix it in the 1.5 trunk.


[jira] Assigned: (DIRSERVER-1009) ASN.1 decoder does not use UTF-8 encoding for strings

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSERVER-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Lecharny reassigned DIRSERVER-1009:
--------------------------------------------

    Assignee: Emmanuel Lecharny


[jira] Commented: (DIRSERVER-1009) ASN.1 decoder does not use UTF-8 encoding for strings

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSERVER-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515377 ] 

Emmanuel Lecharny commented on DIRSERVER-1009:
----------------------------------------------

Very good catch, Klemen!

There are other places in the decoder where we have the same issue. I will fix them today.

I bet the problem also occurs in 1.0, so I will fix it there too.

Thanks for the report!


[jira] Resolved: (DIRSERVER-1009) ASN.1 decoder does not use UTF-8 encoding for strings

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSERVER-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Lecharny resolved DIRSERVER-1009.
------------------------------------------

    Resolution: Fixed

Fixed in 1.5 too:

http://svn.apache.org/viewvc?view=rev&rev=559790


[jira] Closed: (DIRSERVER-1009) ASN.1 decoder does not use UTF-8 encoding for strings

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSERVER-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Lecharny closed DIRSERVER-1009.
----------------------------------------


closed
