Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/10/08 08:17:59 UTC

[GitHub] [pulsar] codelipenghui commented on a diff in pull request #17948: [improve][java-client][issue-17931]Reduce call of hashFunction in SchemaHash

codelipenghui commented on code in PR #17948:
URL: https://github.com/apache/pulsar/pull/17948#discussion_r990609784


##########
pulsar-common/src/main/java/org/apache/pulsar/common/protocol/schema/SchemaHash.java:
##########
@@ -55,15 +82,118 @@ public static SchemaHash of(SchemaData schemaData) {
     }
 
     public static SchemaHash of(SchemaInfo schemaInfo) {
-        return of(schemaInfo == null ? new byte[0] : schemaInfo.getSchema(),
+        return of(schemaInfo == null ? null : schemaInfo.getSchema(),
                 schemaInfo == null ? null : schemaInfo.getType());
     }
 
-    public static SchemaHash of(byte[] schemaBytes, SchemaType schemaType) {
-        return new SchemaHash(hashFunction.hashBytes(schemaBytes), schemaType);
+    public static SchemaHash of() {
+        return of(null, null);
+    }
+
+    private static SchemaHash of(byte[] schemaBytes, SchemaType schemaType) {
+        SchemaHash result = null;
+        if (schemaBytes == null || schemaBytes.length == 0) {
+            result = EmptySchemaHashFactory.get(schemaType);
+        }
+
+        // This should not be a common occurrence if everything goes well.
+        if (result == null) {
+            log.warn("Could not get schemaHash from EmptySchemaHashFactory, will create by hashFunction. Might bring"
+                            + " performance regression. schemaBytes length:{}, schemaType:{}",
+                    schemaBytes == null ? "null" : schemaBytes.length, schemaType);
+            result = new SchemaHash(
+                    hashFunction.hashBytes(schemaBytes == null ? new byte[0] : schemaBytes), schemaType);
+        }
+        return result;
     }
 
     public byte[] asBytes() {
         return hash.asBytes();
     }
+
+    private static class EmptySchemaHashFactory {
+        private static final HashCode EMPTY_HASH = hashFunction.hashBytes(new byte[0]);
+        private static final SchemaHash NONE_SCHEMA_HASH = new SchemaHash(EMPTY_HASH, NONE);
+        private static final SchemaHash STRING_SCHEMA_HASH = new SchemaHash(EMPTY_HASH, STRING);
+        private static final SchemaHash JSON_SCHEMA_HASH = new SchemaHash(EMPTY_HASH, JSON);

Review Comment:
   Do we only need to handle primitive schemas here? The schema data of a struct schema is not empty, so we can't use EMPTY_HASH for it.
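
A minimal sketch of what this comment suggests, assuming the cache of empty-schema hashes is keyed by type and skips struct types. The `Type` enum, its `struct` flag, the class name, and the use of SHA-256 via `MessageDigest` are hypothetical stand-ins for illustration, not the code in this PR:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.EnumMap;
import java.util.Map;

final class EmptyHashCacheSketch {
    // Hypothetical stand-in for SchemaType; only a few values are modeled.
    enum Type {
        NONE(false), STRING(false), INT32(false), JSON(true), AVRO(true);

        final boolean struct;

        Type(boolean struct) {
            this.struct = struct;
        }
    }

    // Digest of zero bytes, computed once. SHA-256 is an assumption of this sketch.
    private static final byte[] EMPTY_HASH = sha256(new byte[0]);

    // One cached entry per non-struct type; struct types are deliberately left out
    // because their schema data is not expected to be empty.
    private static final Map<Type, byte[]> CACHE = new EnumMap<>(Type.class);

    static {
        for (Type t : Type.values()) {
            if (!t.struct) {
                CACHE.put(t, EMPTY_HASH);
            }
        }
    }

    /** Returns the cached empty-schema hash, or null when the type has no cached entry. */
    static byte[] cachedEmptyHash(Type type) {
        return type == null ? null : CACHE.get(type);
    }

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, a lookup for `Type.JSON` returns null and the caller falls through to hashing the actual schema bytes.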



##########
pulsar-common/src/main/java/org/apache/pulsar/common/protocol/schema/SchemaHash.java:
##########
@@ -55,15 +82,118 @@ public static SchemaHash of(SchemaData schemaData) {
     }
 
     public static SchemaHash of(SchemaInfo schemaInfo) {
-        return of(schemaInfo == null ? new byte[0] : schemaInfo.getSchema(),
+        return of(schemaInfo == null ? null : schemaInfo.getSchema(),
                 schemaInfo == null ? null : schemaInfo.getType());
     }
 
-    public static SchemaHash of(byte[] schemaBytes, SchemaType schemaType) {
-        return new SchemaHash(hashFunction.hashBytes(schemaBytes), schemaType);
+    public static SchemaHash of() {
+        return of(null, null);
+    }
+
+    private static SchemaHash of(byte[] schemaBytes, SchemaType schemaType) {
+        SchemaHash result = null;
+        if (schemaBytes == null || schemaBytes.length == 0) {
+            result = EmptySchemaHashFactory.get(schemaType);
+        }
+
+        // This should not be a common occurrence if everything goes well.
+        if (result == null) {
+            log.warn("Could not get schemaHash from EmptySchemaHashFactory, will create by hashFunction. Might bring"
+                            + " performance regression. schemaBytes length:{}, schemaType:{}",

Review Comment:
   For a struct schema, this is the normal case, so we don't need a warning log here.
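
A rough sketch of the control flow this comment points toward, assuming the empty case takes a precomputed fast path and every other case is hashed silently. The class and method names and the SHA-256 `MessageDigest` call are assumptions for illustration, not the PR's implementation:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class QuietHashPathSketch {
    // Digest of zero bytes, computed once. Shared instance: callers must not mutate it.
    private static final byte[] EMPTY_HASH = sha256(new byte[0]);

    /** Hashes schema bytes; null and empty input share one precomputed digest. */
    static byte[] hashOf(byte[] schemaBytes) {
        if (schemaBytes == null || schemaBytes.length == 0) {
            return EMPTY_HASH; // fast path, no hashing work
        }
        // Non-empty schema data (the expected situation for struct schemas):
        // compute the hash directly, with no warning log on this path.
        return sha256(schemaBytes);
    }

    private static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

If a log line is still wanted for diagnostics, a debug-level message on the non-empty path would avoid flooding the logs during normal operation.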


