Posted to notifications@netbeans.apache.org by "matthiasblaesing (via GitHub)" <gi...@apache.org> on 2023/01/29 22:42:04 UTC
[GitHub] [netbeans] matthiasblaesing commented on a diff in pull request #5299: [NETBEANS-4123] Initial implementation of handling large strings

matthiasblaesing commented on code in PR #5299:
URL: https://github.com/apache/netbeans/pull/5299#discussion_r1081855724


##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -130,7 +132,12 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
         try {
             ReferenceType st = ObjectReferenceWrapper.referenceType(sr);
             ArrayReference sa = null;
+            //only applicable if the string implementation uses a byte[] instead
+            //of a char[]
+            boolean isUTF16 = false;
+            boolean isCompressedImpl = false;

Review Comment:
   This is JEP 254, "Compact Strings". No compression is involved, just different character encodings (ISO-8859-1 vs. UTF-16).
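
To illustrate the point about encodings, here is a minimal sketch that uses only public APIs. It does not read the private fields of `java.lang.String`; the class and method names (`CompactStringsSketch`, `fitsLatin1`, `backingBytes`) are made up for this example. It only shows that the same characters need one byte each in ISO-8859-1 and two bytes each in UTF-16, which is a choice of encoding and not compression.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: mimics the idea behind JEP 254 with public APIs.
// It does not inspect java.lang.String's private "value"/"coder" fields.
final class CompactStringsSketch {

    // True if every char fits into a single ISO-8859-1 (Latin-1) byte.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Number of bytes a backing array would need under the chosen encoding.
    static int backingBytes(String s) {
        return fitsLatin1(s)
                ? s.getBytes(StandardCharsets.ISO_8859_1).length  // 1 byte per char
                : s.getBytes(StandardCharsets.UTF_16LE).length;   // 2 bytes per char
    }

    public static void main(String[] args) {
        System.out.println(backingBytes("NetBeans"));       // 8 chars, all Latin-1
        System.out.println(backingBytes("Net\u20ACBeans")); // 9 chars, one outside Latin-1
    }
}
```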



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -141,25 +148,59 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
                             continue;
                         }
                         Type type = f.type();
-                        if (type instanceof ArrayType &&
-                            "char".equals(((ArrayType) type).componentTypeName())) {
-                            valuesField = f;
+                        if (type instanceof ArrayType) {
+                            String componentType = ((ArrayType)type).componentTypeName();
+                            if ("byte".equals(componentType)){
+                                isCompressedImpl = true;
+                                valuesField = f;
+                            }
+                            else if ("char".equals(componentType)){
+                                valuesField = f;
+                            }
+                            else{
+                                continue;
+                            }
                             break;
                         }
                     }
                 }
+                else if (valuesField.type() instanceof ArrayType &&
+                        "byte".equals(((ArrayType)valuesField.type()).
+                                componentTypeName())){
+                    isCompressedImpl = true;
+                }
                 if (valuesField == null) {
                     isShort = true; // We did not find the values field.
                 } else {
+                    if (isCompressedImpl){
+                        //is it compressed?
+                        final int LATIN1 = 0;
+                        Field coderField = ReferenceTypeWrapper.fieldByName(st,
+                                "coder");
+                        Value coderValue;
+                        if (coderField != null){
+                            coderValue = ObjectReferenceWrapper.getValue(sr,
+                                    coderField);
+                            if (coderValue instanceof IntegerValue &&
+                                    ((IntegerValue)coderValue).value() != LATIN1){

Review Comment:
   This check failed for me on JDK 17: I got a `ByteValue` at this point. Instead of checking too narrowly, just switch to `PrimitiveValue`, which has helpful accessors.
   
   ```suggestion
                               if (coderValue instanceof PrimitiveValue &&
                                       ((PrimitiveValue)coderValue).intValue() != LATIN1){
   ```
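
The same pitfall can be reproduced without a debugger session. The sketch below is an analogy in plain Java, not JDI code: `Byte` and `Integer` share the supertype `Number`, much as `ByteValue` and `IntegerValue` share `PrimitiveValue`. The class and method names are hypothetical.

```java
// Analogy only: java.lang boxes stand in for the JDI value types.
final class CoderCheckSketch {

    static final int LATIN1 = 0; // same constant as in the diff above

    // Too narrow: only recognises a boxed int, so a Byte coder is ignored.
    static boolean isUtf16Narrow(Object coder) {
        return coder instanceof Integer && ((Integer) coder) != LATIN1;
    }

    // Wide enough: any numeric box is accepted and read through intValue().
    static boolean isUtf16Wide(Object coder) {
        return coder instanceof Number && ((Number) coder).intValue() != LATIN1;
    }

    public static void main(String[] args) {
        Object coder = Byte.valueOf((byte) 1); // stand-in for a non-Latin-1 coder
        System.out.println(isUtf16Narrow(coder)); // the Byte is silently missed
        System.out.println(isUtf16Wide(coder));   // the Byte is handled
    }
}
```

With the narrow check, a coder delivered as a byte is never seen as UTF-16, so the string would be decoded as if it were Latin-1.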



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
             } else {
                 assert sa != null;
                 int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+                if (isCompressedImpl && isUTF16){
+                    l *= 2;
+                }

Review Comment:
   See below.
   
   ```suggestion
                   List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, isUTF16 ? (l * 2) : l);
   ```



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
             } else {
                 assert sa != null;
                 int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+                if (isCompressedImpl && isUTF16){
+                    l *= 2;
+                }
                 List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, l);
                 char[] characters = new char[l + 3];
-                for (int i = 0; i < l; i++) {
-                    Value v = values.get(i);
-                    if (!(v instanceof CharValue)) {
-                        return "<Unreadable>";
+                if (isCompressedImpl) {
+                    //java compressed string
+                    if (!isUTF16) {
+                        //we can just cast to char
+                        for (int i = 0; i < l; i++) {
+                            Value v = values.get(i);
+                            if (!(v instanceof ByteValue)) {
+                                return ERROR_RESULT;
+                            }
+                            char c = (char)((ByteValue) v).byteValue();
+                            //remove the extended sign
+                            c &= 0xFF;
+                            characters[i] = c;
+                        }
+                    }
+                    else {
+                        //life is pain
+                        //We can't just inline code for this since... native
+                        //jazz and big/little endian stuff... so...
+                        //see StringUTF16.java
+                        //implement later!
+                        return "<Not Implemented>";

Review Comment:
   To decode a string correctly we'd need to know the byte order of the target VM, but the following implementation should work correctly on little-endian architectures (x86, arm, riscv), which should at least cover the mainline ones.
   
   ```suggestion
                           // This assumes little endian encoding. This should work
                           // for most architectures (x86, arm, riscv), but will
                           // result in bogus data on big endian architectures
                           for (int i = 0; i < l; i++) {
                               int index = i * 2;
                               Value v = values.get(index);
                               if (!(v instanceof ByteValue)) {
                                   return ERROR_RESULT;
                               }
                               Value v2 = values.get(index + 1);
                               if (!(v2 instanceof ByteValue)) {
                                   return ERROR_RESULT;
                               }
                               char c1 = (char) ((ByteValue) v).byteValue();
                               char c2 = (char) ((ByteValue) v2).byteValue();
                               //remove the extended sign
                               c1 = (char) (0xFF & c1);
                               c2 = (char) (0xFF & c2);
                               // char bigEndianChar = (char) ((c1 << 8) | c2);
                               char littleEndianChar = (char) ((c2 << 8) | c1);
                               characters[i] = littleEndianChar;
                           }
   ```
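
The byte-pair arithmetic can be checked outside the debugger with a small standalone sketch. It works on a plain `byte[]` instead of JDI values and compares both byte orders against the JDK's own UTF-16 encoders; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Standalone check of the pairing logic; not the debugger code itself.
final class Utf16PairsSketch {

    // Combine byte pairs into chars; littleEndian selects which byte is the low one.
    static char[] decode(byte[] bytes, boolean littleEndian) {
        char[] out = new char[bytes.length / 2];
        for (int i = 0; i < out.length; i++) {
            // Mask with 0xFF so negative bytes are not sign-extended.
            int b1 = bytes[2 * i] & 0xFF;
            int b2 = bytes[2 * i + 1] & 0xFF;
            out[i] = (char) (littleEndian ? (b2 << 8) | b1 : (b1 << 8) | b2);
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Gr\u00FC\u00DFe \u20AC";
        byte[] le = text.getBytes(StandardCharsets.UTF_16LE);
        byte[] be = text.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(text.equals(new String(decode(le, true))));
        System.out.println(text.equals(new String(decode(be, false))));
        // Decoding with the wrong byte order yields different characters.
        System.out.println(text.equals(new String(decode(le, false))));
    }
}
```

The last line shows what the comment in the suggestion warns about: little-endian decoding applied to big-endian data (or the reverse) produces bogus characters instead of an error.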


