Posted to issues@hbase.apache.org by "Aditya Kishore (JIRA)" <ji...@apache.org> on 2012/10/13 12:29:02 UTC

[jira] [Created] (HBASE-6991) Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary() are not always consistent

Aditya Kishore created HBASE-6991:
-------------------------------------

             Summary: Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary() are not always consistent
                 Key: HBASE-6991
                 URL: https://issues.apache.org/jira/browse/HBASE-6991
             Project: HBase
          Issue Type: Bug
          Components: util
    Affects Versions: 0.96.0
            Reporter: Aditya Kishore
            Assignee: Aditya Kishore


Because "\" is used to escape non-printable characters but is not itself treated as a special character during conversion, a round trip through the two functions can produce unexpected results.

For example, consider the following code snippet.

{code}
// Requires java.util.Arrays and org.apache.hadoop.hbase.util.Bytes.
public void testConversion() {
  // Four literal bytes that happen to spell the escape sequence "\xAD".
  byte[] original = {
      '\\', 'x', 'A', 'D'
  };
  // Encode to the printable form, then decode it again.
  String stringFromBytes = Bytes.toStringBinary(original);
  byte[] converted = Bytes.toBytesBinary(stringFromBytes);
  System.out.println("Original: " + Arrays.toString(original));
  System.out.println("Converted: " + Arrays.toString(converted));
  System.out.println("Reversible?: " + (Bytes.compareTo(original, converted) == 0));
}

Output:
-------
Original: [92, 120, 65, 68]
Converted: [-83]
Reversible?: false
{code}

The "\" character needs to be treated as special and must be encoded as a non-printable character ("\x5C") to avoid any kind of unambiguity during conversion.
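
The failure can be reproduced without HBase on the classpath. The following is a minimal sketch, not the real Bytes class: LegacyEscapeDemo, encode() and decode() are hypothetical names, the printable set is copied from the pre-patch line quoted later in this thread, and the decoder is a simplified assumption of how toBytesBinary() treats "\xHH".

```java
import java.util.Arrays;

// Simplified stand-in for the pre-patch behavior; NOT the real HBase Bytes class.
class LegacyEscapeDemo {
  // Printable non-alphanumeric set that still contains the backslash.
  private static final String PRINTABLE = " `~!@#$%^&*()-_=+[]{}\\|;:'\",.<>/?";

  // Emits printable bytes as-is and everything else as \xHH.
  static String encode(byte[] b) {
    StringBuilder sb = new StringBuilder();
    for (byte value : b) {
      int ch = value & 0xFF;
      boolean alnum = (ch >= '0' && ch <= '9')
          || (ch >= 'A' && ch <= 'Z')
          || (ch >= 'a' && ch <= 'z');
      if (alnum || PRINTABLE.indexOf(ch) >= 0) {
        sb.append((char) ch);
      } else {
        sb.append(String.format("\\x%02X", ch));
      }
    }
    return sb.toString();
  }

  // Turns \xHH back into one byte; any other character is copied through.
  static byte[] decode(String in) {
    byte[] out = new byte[in.length()];
    int size = 0;
    for (int i = 0; i < in.length(); i++) {
      char ch = in.charAt(i);
      if (ch == '\\' && i + 3 < in.length() && in.charAt(i + 1) == 'x'
          && Character.digit(in.charAt(i + 2), 16) >= 0
          && Character.digit(in.charAt(i + 3), 16) >= 0) {
        out[size++] = (byte) Integer.parseInt(in.substring(i + 2, i + 4), 16);
        i += 3; // skip the 'x' and the two hex digits
      } else {
        out[size++] = (byte) ch;
      }
    }
    return Arrays.copyOf(out, size);
  }

  public static void main(String[] args) {
    byte[] original = {'\\', 'x', 'A', 'D'};
    String text = encode(original);       // the backslash is emitted unescaped
    byte[] roundTrip = decode(text);      // so the text is read back as one escape
    System.out.println(text);                                // prints \xAD
    System.out.println(Arrays.toString(roundTrip));          // prints [-83]
    System.out.println(Arrays.equals(original, roundTrip));  // prints false
  }
}
```

In this sketch the encoder writes the four input bytes unchanged, and the decoder cannot tell that output apart from the escape for the single byte 0xAD.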

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6991:
-------------------------

      Resolution: Fixed
    Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
          Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch, Aditya.  Nice one.  The fact that already-encoded data is still decoded properly should protect us against pre-0.96 data looking different in 0.96.
                

[jira] [Commented] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491808#comment-13491808 ] 

Hudson commented on HBASE-6991:
-------------------------------

Integrated in HBase-TRUNK #3514 (See [https://builds.apache.org/job/HBase-TRUNK/3514/])
    HBASE-6991 Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary() (Revision 1406297)

     Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Bytes.java
* /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java

                

[jira] [Updated] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Aditya Kishore (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Kishore updated HBASE-6991:
----------------------------------

    Attachment: HBASE-6991_trunk.patch

Attaching the patch, which modifies toStringBinary() to treat "\" as a non-printable character and translate it to "\x5C".
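
A minimal sketch of what the patched encoding rule produces, assuming the only change is that "\" is no longer in the printable set. FixedEscapeSketch and its encode() method are hypothetical names for illustration and do not reproduce the actual Bytes.toStringBinary() source.

```java
// Illustrative encoder with the backslash removed from the printable set.
// NOT the real HBase implementation.
class FixedEscapeSketch {
  // Printable non-alphanumeric characters; note that '\\' is absent.
  private static final String PRINTABLE = " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?";

  static String encode(byte[] b) {
    StringBuilder sb = new StringBuilder();
    for (byte value : b) {
      int ch = value & 0xFF;
      boolean alnum = (ch >= '0' && ch <= '9')
          || (ch >= 'A' && ch <= 'Z')
          || (ch >= 'a' && ch <= 'z');
      if (alnum || PRINTABLE.indexOf(ch) >= 0) {
        sb.append((char) ch);
      } else {
        // The backslash (0x5C) now takes this branch like any other special byte.
        sb.append(String.format("\\x%02X", ch));
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(encode(new byte[] {'\\', 'x', 'A', 'D'}));        // prints \x5CxAD
    System.out.println(encode(new byte[] {'\\', 'x', 'A', 'D', '\\'}));  // prints \x5CxAD\x5C
  }
}
```

Because the leading backslash is now written as "\x5C", the "xAD" that follows it is plain text rather than part of an escape sequence.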
                

[jira] [Commented] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Aditya Kishore (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477706#comment-13477706 ] 

Aditya Kishore commented on HBASE-6991:
---------------------------------------

Note that any previously encoded StringBinary containing "\" will still be decoded correctly by the unchanged toBytesBinary() function. The change to toStringBinary() ensures that any new encoding of a byte array containing "\" is fully reversible, without ambiguity.
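
To illustrate the compatibility point, here is a small sketch of a decoder that accepts both forms. It assumes the decoder only treats "\" as special when it is followed by "x" and two hex digits; LenientDecodeSketch and decode() are hypothetical names, not the actual toBytesBinary() code. The example uses the "\s" versus "\x5Cs" pair from the test fix in this patch.

```java
import java.util.Arrays;

// Illustrative decoder; NOT the real HBase implementation.
class LenientDecodeSketch {
  static byte[] decode(String in) {
    byte[] out = new byte[in.length()];
    int size = 0;
    for (int i = 0; i < in.length(); i++) {
      char ch = in.charAt(i);
      boolean isEscape = ch == '\\' && i + 3 < in.length()
          && in.charAt(i + 1) == 'x'
          && Character.digit(in.charAt(i + 2), 16) >= 0
          && Character.digit(in.charAt(i + 3), 16) >= 0;
      if (isEscape) {
        out[size++] = (byte) Integer.parseInt(in.substring(i + 2, i + 4), 16);
        i += 3; // skip the 'x' and the two hex digits
      } else {
        // A backslash that does not start \xHH is kept as a literal byte.
        out[size++] = (byte) ch;
      }
    }
    return Arrays.copyOf(out, size);
  }

  public static void main(String[] args) {
    // Old-style text with a bare backslash, and the new escaped form.
    System.out.println(Arrays.toString(decode("\\s")));     // prints [92, 115]
    System.out.println(Arrays.toString(decode("\\x5Cs")));  // prints [92, 115]
  }
}
```

Under that assumption, text written before the change and text written after it decode to the same bytes, as long as the old backslash was not followed by "x" and two hex digits.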
                

[jira] [Commented] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477688#comment-13477688 ] 

Hadoop QA commented on HBASE-6991:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12549444/HBASE-6991_trunk.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 9 new or modified tests.

    {color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 82 warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3060//console

This message is automatically generated.
                

[jira] [Updated] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Aditya Kishore (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Kishore updated HBASE-6991:
----------------------------------

    Summary: Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()  (was: Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary() are not always consistent)
    

[jira] [Updated] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Aditya Kishore (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Kishore updated HBASE-6991:
----------------------------------

    Fix Version/s: 0.96.0
     Hadoop Flags: Incompatible change
           Status: Patch Available  (was: Open)

The patch includes the following changes:

1. Gets rid of the unnecessary byte[] to String conversion. The "ISO-8859-1" charset maps each byte to the char with the same value, so the conversion changes nothing. This also removes the need for the try-catch block.
{code}
-    String first = new String(b, off, len, "ISO-8859-1");
-    for (int i = 0; i < first.length() ; ++i ) {
-      int ch = first.charAt(i) & 0xFF;

+    for (int i = off; i < off + len ; ++i ) {
+      int ch = b[i] & 0xFF;
{code}

2. Removes "\" from the set of printable non-alphanumeric characters so that it can be escaped using the "\xXX" format.
{code}
-          || " `~!@#$%^&*()-_=+[]{}\\|;:'\",.<>/?".indexOf(ch) >= 0 ) {

+          || " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?".indexOf(ch) >= 0 ) {
{code}

3. Adds a new test case to verify that the conversion is reversible for random arrays of bytes. Without the change to toStringBinary(), the test always fails. The test adds 1 extra second to the test run.

{code:title=hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java}
+  public void testToStringBytesBinaryReversible() {
+    //  let's run test with 1000 randomly generated byte arrays
+    Random rand = new Random(System.currentTimeMillis());
+    byte[] randomBytes = new byte[1000];
+    for (int i = 0; i < 1000; i++) {
+      rand.nextBytes(randomBytes);
+      verifyReversibleForBytes(randomBytes);
+    }
+
+    //  some specific cases
+    verifyReversibleForBytes(new  byte[] {});
+    verifyReversibleForBytes(new  byte[] {'\\', 'x', 'A', 'D'});
+    verifyReversibleForBytes(new  byte[] {'\\', 'x', 'A', 'D', '\\'});
+  }
+
+  private void verifyReversibleForBytes(byte[] originalBytes) {
+    String convertedString = Bytes.toStringBinary(originalBytes);
+    byte[] convertedBytes = Bytes.toBytesBinary(convertedString);
+    if (Bytes.compareTo(originalBytes, convertedBytes) != 0) {
+      fail("Not reversible for\nbyte[]: " + Arrays.toString(originalBytes) +
+          ",\nStringBinary: " + convertedString);
+    }
+  }
{code}

4. Finally, fixes the two test cases that were breaking because they assumed that "\" is encoded as a literal "\".
{code}
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java

-            + "\\xD46\\xEA5\\xEA3\\xEA7\\xE7\\x00LI\\s\\xA0\\x0F\\x00\\x00"
+            + "\\xD46\\xEA5\\xEA3\\xEA7\\xE7\\x00LI\\x5Cs\\xA0\\x0F\\x00\\x00"
{code}

Setting the "Incompatible change" flag, since any other code that makes the same assumption as the two test cases will need to be fixed.
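
Change 1 above relies on ISO-8859-1 decoding being a no-op for this purpose. A quick standalone check of that claim, using only the JDK; Latin1IdentityCheck and latin1IsIdentity() are hypothetical names for illustration.

```java
import java.nio.charset.StandardCharsets;

// Checks that decoding with ISO-8859-1 and masking with 0xFF yields the same
// value as masking the raw byte directly, for every possible byte value.
class Latin1IdentityCheck {
  static boolean latin1IsIdentity() {
    byte[] all = new byte[256];
    for (int i = 0; i < all.length; i++) {
      all[i] = (byte) i;
    }
    String decoded = new String(all, StandardCharsets.ISO_8859_1);
    if (decoded.length() != all.length) {
      return false;
    }
    for (int i = 0; i < all.length; i++) {
      // Old code path: charAt(i) & 0xFF. New code path: b[i] & 0xFF.
      if ((decoded.charAt(i) & 0xFF) != (all[i] & 0xFF)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(latin1IsIdentity()); // prints true
  }
}
```

Since both code paths give the same value for all 256 byte values, reading b[i] directly does not change what the encoder sees.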
                

[jira] [Commented] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Aditya Kishore (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491751#comment-13491751 ] 

Aditya Kishore commented on HBASE-6991:
---------------------------------------

[~yuzhihong@gmail.com], [~stack] Could you please review this?

https://reviews.apache.org/r/7632/
                

[jira] [Commented] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491955#comment-13491955 ] 

Hudson commented on HBASE-6991:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #250 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/250/])
    HBASE-6991 Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary() (Revision 1406297)

     Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Bytes.java
* /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java

                

[jira] [Updated] (HBASE-6991) Escape "\" in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

Posted by "Aditya Kishore (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Kishore updated HBASE-6991:
----------------------------------

    Description: 
Only the final sentence of the description differs between the two versions. It now reads:

The "\" character needs to be treated as special and must be encoded as a non-printable character ("\x5C") to avoid any kind of ambiguity during conversion.

  was:
The "\" character needs to be treated as special and must be encoded as a non-printable character ("\x5C") to avoid any kind of unambiguity during conversion.

    