You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/06/23 03:09:14 UTC

[GitHub] [doris] Gabriel39 commented on a diff in pull request #10296: [vectorized][udf] improvement java-udaf with group by clause

Gabriel39 commented on code in PR #10296:
URL: https://github.com/apache/doris/pull/10296#discussion_r904462502


##########
fe/java-udf/src/main/java/org/apache/doris/udf/UdafExecutor.java:
##########
@@ -132,24 +133,33 @@ protected void finalize() throws Throwable {
     /**
      * invoke add function, add row in loop [rowStart, rowEnd).
      */
-    public void add(long rowStart, long rowEnd) throws UdfRuntimeException {
+    public void add(boolean isSinglePlace, long rowStart, long rowEnd) throws UdfRuntimeException {
         try {
-            Object[] inputArgs = new Object[argTypes.length + 1];
-            inputArgs[0] = stateObj;
-            for (long row = rowStart; row < rowEnd; ++row) {
-                Object[] inputObjects = allocateInputObjects(row);
-                for (int i = 0; i < argTypes.length; ++i) {
-                    if (UdfUtils.UNSAFE.getLong(null, UdfUtils.getAddressAtOffset(inputNullsPtrs, i)) == -1
-                            || UdfUtils.UNSAFE.getByte(null,
-                                    UdfUtils.UNSAFE.getLong(null, UdfUtils.getAddressAtOffset(inputNullsPtrs, i)) + row)
-                                    == 0) {
-                        inputArgs[i + 1] = inputObjects[i];
-                    } else {
-                        inputArgs[i + 1] = null;
-                    }
+            long idx = rowStart;
+            do {
+                Long curPlace = UdfUtils.UNSAFE.getLong(null, UdfUtils.UNSAFE.getLong(null, inputPlacesPtr) + 8L * idx);
+                Object[] inputArgs = new Object[argTypes.length + 1];
+                if (!stateObjMap.containsKey(curPlace)) {
+                    Object stateObj = create();
+                    stateObjMap.put(curPlace, stateObj);
                 }
-                allMethods.get(UDAF_ADD_FUNCTION).invoke(udaf, inputArgs);
-            }
+                inputArgs[0] = stateObjMap.get(curPlace);
+                do {
+                    Object[] inputObjects = allocateInputObjects(idx);

Review Comment:
   If this row is NULL, we should avoid this unnecessary allocation.
   I think this logics is same as `evaluate` in `UdfExecutor`. Maybe you could make both `inputObjects` and `inputArgs` from these two class allocated in a common method?



##########
fe/java-udf/src/main/java/org/apache/doris/udf/UdafExecutor.java:
##########
@@ -197,15 +209,20 @@ public byte[] serialize() throws UdfRuntimeException {
      * invoke merge function and it's have done deserialze.
      * here call deserialize first, and call merge.
      */
-    public void merge(byte[] data) throws UdfRuntimeException {
+    public void merge(long place, byte[] data) throws UdfRuntimeException {
         try {
             Object[] args = new Object[2];
             ByteArrayInputStream bins = new ByteArrayInputStream(data);
             args[0] = create();
             args[1] = new DataInputStream(bins);
             allMethods.get(UDAF_DESERIALIZE_FUNCTION).invoke(udaf, args);
             args[1] = args[0];
-            args[0] = stateObj;
+            Long curPlace = place;
+            if (!stateObjMap.containsKey(curPlace)) {
+                Object stateObj = create();

Review Comment:
   change `create()` to another name. maybe `createAggState`



##########
fe/java-udf/src/main/java/org/apache/doris/udf/UdafExecutor.java:
##########
@@ -132,24 +133,33 @@ protected void finalize() throws Throwable {
     /**
      * invoke add function, add row in loop [rowStart, rowEnd).
      */
-    public void add(long rowStart, long rowEnd) throws UdfRuntimeException {
+    public void add(boolean isSinglePlace, long rowStart, long rowEnd) throws UdfRuntimeException {
         try {
-            Object[] inputArgs = new Object[argTypes.length + 1];
-            inputArgs[0] = stateObj;
-            for (long row = rowStart; row < rowEnd; ++row) {
-                Object[] inputObjects = allocateInputObjects(row);
-                for (int i = 0; i < argTypes.length; ++i) {
-                    if (UdfUtils.UNSAFE.getLong(null, UdfUtils.getAddressAtOffset(inputNullsPtrs, i)) == -1
-                            || UdfUtils.UNSAFE.getByte(null,
-                                    UdfUtils.UNSAFE.getLong(null, UdfUtils.getAddressAtOffset(inputNullsPtrs, i)) + row)
-                                    == 0) {
-                        inputArgs[i + 1] = inputObjects[i];
-                    } else {
-                        inputArgs[i + 1] = null;
-                    }
+            long idx = rowStart;
+            do {
+                Long curPlace = UdfUtils.UNSAFE.getLong(null, UdfUtils.UNSAFE.getLong(null, inputPlacesPtr) + 8L * idx);
+                Object[] inputArgs = new Object[argTypes.length + 1];
+                if (!stateObjMap.containsKey(curPlace)) {

Review Comment:
   stateObjMap.putIfAbsent



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org