Posted to github@arrow.apache.org by "doki23 (via GitHub)" <gi...@apache.org> on 2023/03/25 12:13:54 UTC

[GitHub] [arrow-datafusion] doki23 opened a new issue, #5735: Check the udf output size which should be equal to the input size

doki23 opened a new issue, #5735:
URL: https://github.com/apache/arrow-datafusion/issues/5735

   Hmmm... or should we check the result size of the UDF? I'm not sure whether it's proper for the sizes of the input and the result to differ. cc @alamb @mingmwang @tustvold
   
   _Originally posted by @doki23 in https://github.com/apache/arrow-datafusion/issues/5635#issuecomment-1475092781_
               


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Check the udf output size which should be equal to the input size [arrow-datafusion]

Posted by "ZhengLin-Li (via GitHub)" <gi...@apache.org>.
ZhengLin-Li commented on issue #5735:
URL: https://github.com/apache/arrow-datafusion/issues/5735#issuecomment-1947444069

   @alamb @doki23 it seems that this issue is fixed? 



Re: [I] Check the udf output size which should be equal to the input size [arrow-datafusion]

Posted by "doki23 (via GitHub)" <gi...@apache.org>.
doki23 commented on issue #5735:
URL: https://github.com/apache/arrow-datafusion/issues/5735#issuecomment-1951615094

   > @alamb @doki23 it seems that this issue is fixed? 
   
   I'm not sure :(



Re: [I] Check the udf output size which should be equal to the input size [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #5735:
URL: https://github.com/apache/arrow-datafusion/issues/5735#issuecomment-1951790819

   I think the idea of this ticket was to put some basic checks / asserts in place to ensure that the output of UDFs has the correct size.
   
   As I understand it, this would mean adding an assert (or checking whether one already exists) that the number of output rows from accumulators is correct.
   
   Maybe somewhere in
   
   https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-plan/src/aggregates/row_hash.rs
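For illustration only, here is a minimal, standalone sketch of what an accumulator-side check could look like. It does not use DataFusion's types, and it is not taken from `row_hash.rs`. The function name `check_aggregate_output` and the use of `Vec<f64>` columns are assumptions made for this example. The only invariant it encodes is that each aggregate output column should hold one value per group.

```rust
/// Hypothetical helper (not a DataFusion API): verifies that every aggregate
/// output column holds exactly one value per group.
pub fn check_aggregate_output(num_groups: usize, columns: &[Vec<f64>]) -> Result<(), String> {
    for (i, col) in columns.iter().enumerate() {
        if col.len() != num_groups {
            // Report which column is wrong and by how much.
            return Err(format!(
                "aggregate output column {} has {} rows, expected {} (one per group)",
                i,
                col.len(),
                num_groups
            ));
        }
    }
    Ok(())
}
```

Returning an error rather than panicking keeps a misbehaving aggregate from taking down the whole process.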



Re: [I] Check the udf output size which should be equal to the input size [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #5735:
URL: https://github.com/apache/arrow-datafusion/issues/5735#issuecomment-2053747647

   So the idea here is that we add a check after invoking a ScalarUDF that the number of rows that came out is the same as the number that went in. If this is not the case, DataFusion should raise an internal error with a clear error message.
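The check described above can be sketched in plain Rust without any DataFusion dependency. Everything here is hypothetical and chosen for illustration: the `InternalError` type, the `invoke_checked` wrapper, and the use of `&[i64]` / `Vec<i64>` in place of Arrow arrays. It is a sketch of the idea under those assumptions, not the code that ended up in DataFusion.

```rust
use std::fmt;

/// Hypothetical stand-in for an engine's "internal error" variant.
#[derive(Debug, PartialEq)]
pub struct InternalError(pub String);

impl fmt::Display for InternalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Internal error: {}", self.0)
    }
}

/// Invokes `udf` on a batch of input rows and verifies that it returned
/// exactly one output row per input row (hypothetical wrapper).
pub fn invoke_checked<F>(name: &str, udf: F, input: &[i64]) -> Result<Vec<i64>, InternalError>
where
    F: Fn(&[i64]) -> Vec<i64>,
{
    let output = udf(input);
    if output.len() != input.len() {
        // Name the offending function and both row counts in the message.
        return Err(InternalError(format!(
            "UDF '{}' returned {} rows, expected {}",
            name,
            output.len(),
            input.len()
        )));
    }
    Ok(output)
}
```

With this wrapper, a well-behaved function such as `|xs| xs.iter().map(|x| x * 2).collect()` passes through unchanged. One that drops rows produces an error naming the function and both row counts, instead of silently yielding a short column.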



Re: [I] Check the udf output size which should be equal to the input size [arrow-datafusion]

Posted by "duongcongtoai (via GitHub)" <gi...@apache.org>.
duongcongtoai commented on issue #5735:
URL: https://github.com/apache/arrow-datafusion/issues/5735#issuecomment-2053665418

   Hi, I would like to take this issue.



Re: [I] Check the udf output size which should be equal to the input size [datafusion]

Posted by "duongcongtoai (via GitHub)" <gi...@apache.org>.
duongcongtoai commented on issue #5735:
URL: https://github.com/apache/datafusion/issues/5735#issuecomment-2080369588

   This was implemented in this [PR](https://github.com/apache/datafusion/pull/10148) (and we fixed 2 existing UDFs that violated this constraint).




Re: [I] Check the udf output size which should be equal to the input size [datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #5735: Check the udf output size which should be equal to the input size
URL: https://github.com/apache/datafusion/issues/5735

