Posted to notifications@netbeans.apache.org by "matthiasblaesing (via GitHub)" <gi...@apache.org> on 2023/01/29 22:42:04 UTC
[GitHub] [netbeans] matthiasblaesing commented on a diff in pull request #5299: [NETBEANS-4123] Initial implementation of handling large strings
matthiasblaesing commented on code in PR #5299:
URL: https://github.com/apache/netbeans/pull/5299#discussion_r1081855724
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -130,7 +132,12 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
try {
ReferenceType st = ObjectReferenceWrapper.referenceType(sr);
ArrayReference sa = null;
+ //only applicable if the string implementation uses a byte[] instead
+ //of a char[]
+ boolean isUTF16 = false;
+ boolean isCompressedImpl = false;
Review Comment:
This feature is JEP 254, "Compact Strings". There is no compression involved, just two different character encodings (ISO-8859-1 vs. UTF-16), so a name like `isCompressedImpl` is misleading.
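To illustrate the distinction outside of the debugger code, here is a minimal sketch in plain Java. The class name `CompactStringsDemo` and its helper methods are hypothetical and only use `StandardCharsets`; they do not read the private `value`/`coder` fields of `java.lang.String`. It shows that the same text simply takes one byte per character in ISO-8859-1 and two in UTF-16:

```java
import java.nio.charset.StandardCharsets;

public class CompactStringsDemo {
    // Number of bytes needed when every character is stored as one ISO-8859-1 byte.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes needed when every UTF-16 code unit is stored as two bytes.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    // True if every char fits into a single byte, i.e. the text is representable in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String text = "abc";
        System.out.println(fitsLatin1(text) + " " + latin1Length(text) + " " + utf16Length(text));
    }
}
```

Nothing is compressed in either case; only the number of bytes per character differs.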
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -141,25 +148,59 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
continue;
}
Type type = f.type();
- if (type instanceof ArrayType &&
- "char".equals(((ArrayType) type).componentTypeName())) {
- valuesField = f;
+ if (type instanceof ArrayType) {
+ String componentType = ((ArrayType)type).componentTypeName();
+ if ("byte".equals(componentType)){
+ isCompressedImpl = true;
+ valuesField = f;
+ }
+ else if ("char".equals(componentType)){
+ valuesField = f;
+ }
+ else{
+ continue;
+ }
break;
}
}
}
+ else if (valuesField.type() instanceof ArrayType &&
+ "byte".equals(((ArrayType)valuesField.type()).
+ componentTypeName())){
+ isCompressedImpl = true;
+ }
if (valuesField == null) {
isShort = true; // We did not find the values field.
} else {
+ if (isCompressedImpl){
+ //is it compressed?
+ final int LATIN1 = 0;
+ Field coderField = ReferenceTypeWrapper.fieldByName(st,
+ "coder");
+ Value coderValue;
+ if (coderField != null){
+ coderValue = ObjectReferenceWrapper.getValue(sr,
+ coderField);
+ if (coderValue instanceof IntegerValue &&
+ ((IntegerValue)coderValue).value() != LATIN1){
Review Comment:
This check failed on JDK 17 for me: I got a `ByteValue` at this point. Instead of checking for too narrow a type, just switch to `PrimitiveValue`, which has helpful accessors.
```suggestion
if (coderValue instanceof PrimitiveValue &&
((PrimitiveValue)coderValue).intValue() != LATIN1){
```
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
} else {
assert sa != null;
int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+ if (isCompressedImpl && isUTF16){
+ l *= 2;
+ }
Review Comment:
See below.
```suggestion
List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, isUTF16 ? (l * 2) : l);
```
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
} else {
assert sa != null;
int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+ if (isCompressedImpl && isUTF16){
+ l *= 2;
+ }
List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, l);
char[] characters = new char[l + 3];
- for (int i = 0; i < l; i++) {
- Value v = values.get(i);
- if (!(v instanceof CharValue)) {
- return "<Unreadable>";
+ if (isCompressedImpl) {
+ //java compressed string
+ if (!isUTF16) {
+ //we can just cast to char
+ for (int i = 0; i < l; i++) {
+ Value v = values.get(i);
+ if (!(v instanceof ByteValue)) {
+ return ERROR_RESULT;
+ }
+ char c = (char)((ByteValue) v).byteValue();
+ //remove the extended sign
+ c &= 0xFF;
+ characters[i] = c;
+ }
+ }
+ else {
+ //life is pain
+ //We can't just inline code for this since... native
+ //jazz and big/little endian stuff... so...
+ //see StringUTF16.java
+ //implement later!
+ return "<Not Implemented>";
Review Comment:
To decode a string correctly we would need to know the byte order of the target VM, but the following implementation should work correctly on little-endian architectures (x86, arm, riscv), which should at least cover the mainline ones.
```suggestion
// This assumes little endian encoding. This should work
// for most architectures (x86, arm, riscv), but will
// result in bogus data on big endian architectures
for (int i = 0; i < l; i++) {
int index = i * 2;
Value v = values.get(index);
if (!(v instanceof ByteValue)) {
return ERROR_RESULT;
}
Value v2 = values.get(index + 1);
if (!(v2 instanceof ByteValue)) {
return ERROR_RESULT;
}
char c1 = (char) ((ByteValue) v).byteValue();
char c2 = (char) ((ByteValue) v2).byteValue();
//remove the extended sign
c1 = (char) (0xFF & c1);
c2 = (char) (0xFF & c2);
// char bigEndianChar = (char) ((c1 << 8) | c2);
char littleEndianChar = (char) ((c2 << 8) | c1);
characters[i] = littleEndianChar;
}
```
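If the byte order of the target VM were known, the same loop could handle both cases. The following is a minimal sketch under that assumption, written against a plain `byte[]` rather than JDI `ByteValue` objects. The class `Utf16ByteDecoder` and its `decode` method are hypothetical names, and how the byte order would be obtained from the target VM is not covered here:

```java
import java.nio.ByteOrder;

public class Utf16ByteDecoder {
    /**
     * Combines pairs of raw bytes into UTF-16 code units.
     * At most maxChars chars are produced; a trailing odd byte is ignored.
     */
    static char[] decode(byte[] raw, int maxChars, ByteOrder order) {
        int n = Math.min(maxChars, raw.length / 2);
        char[] out = new char[n];
        for (int i = 0; i < n; i++) {
            // Masking with 0xFF removes the sign extension of negative bytes.
            int b0 = raw[2 * i] & 0xFF;
            int b1 = raw[2 * i + 1] & 0xFF;
            out[i] = (order == ByteOrder.BIG_ENDIAN)
                    ? (char) ((b0 << 8) | b1)   // high byte comes first
                    : (char) ((b1 << 8) | b0);  // low byte comes first
        }
        return out;
    }

    public static void main(String[] args) {
        // 'A' (U+0041) followed by U+20AC, stored low byte first.
        byte[] littleEndian = {0x41, 0x00, (byte) 0xAC, 0x20};
        char[] chars = decode(littleEndian, 2, ByteOrder.LITTLE_ENDIAN);
        System.out.println(new String(chars));
    }
}
```

Passing the byte order as a parameter keeps the decoding logic in one place, so the little-endian assumption is confined to the call site.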
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@netbeans.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org