Posted to github@arrow.apache.org by "ZuoTiJia (via GitHub)" <gi...@apache.org> on 2023/02/27 11:51:18 UTC

[GitHub] [arrow-datafusion] ZuoTiJia opened a new issue, #5415: `rpad` function causes OOM

ZuoTiJia opened a new issue, #5415:
URL: https://github.com/apache/arrow-datafusion/issues/5415

   **Describe the bug**
   When I use the `rpad` function with a large length argument, it causes an OOM.
   Does DataFusion have a mechanism to limit the resources required for execution?
   
   **To Reproduce**
   ```sql
   create table test(a bigint) as values
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1),
       (1), (1), (1), (1), (1);
   select rpad('a', 1073741824, 'a') from test;
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on issue #5415: `rpad` function causes OOM

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on issue #5415:
URL: https://github.com/apache/arrow-datafusion/issues/5415#issuecomment-1447074368

   Postgres handles this in a very interesting manner:
   https://github.com/postgres/postgres/blob/d952373a987bad331c0e499463159dd142ced1ef/src/backend/utils/adt/oracle_compat.c#L282 




[GitHub] [arrow-datafusion] ZuoTiJia commented on issue #5415: `rpad` function causes OOM

Posted by "ZuoTiJia (via GitHub)" <gi...@apache.org>.
ZuoTiJia commented on issue #5415:
URL: https://github.com/apache/arrow-datafusion/issues/5415#issuecomment-1447445848

   > We probably have to put some limits in place. For example, Postgres fails fast:
   > 
   > ```
   > select rpad('a', 1073741824, 'a');
   > 
   > requested length too large
   > ```
   > 
   > My Mac spins the CPU like crazy with the same query in datafusion-cli; I had to kill the process.
   > 
   > @ZuoTiJia is this kind of rpad needed for your real use case, or are you testing DF limits? :)
   
   I'm testing DF and I think it's dangerous for users if there is no limit.




[GitHub] [arrow-datafusion] comphead commented on issue #5415: `rpad` function causes OOM

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on issue #5415:
URL: https://github.com/apache/arrow-datafusion/issues/5415#issuecomment-1446872544

   We probably have to put some limits in place. For example, Postgres fails fast:
   
   ```
   select rpad('a', 1073741824, 'a');
   
   requested length too large
   ```
   
   @ZuoTiJia is this kind of rpad needed for your real use case, or are you testing DF limits? :)




[GitHub] [arrow-datafusion] comphead commented on issue #5415: `rpad` function causes OOM

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on issue #5415:
URL: https://github.com/apache/arrow-datafusion/issues/5415#issuecomment-1448652595

   Sounds good. I think we can implement a quick win like in PG:
   https://github.com/postgres/postgres/blob/c8e1ba736b2b9e8c98d37a5b77c4ed31baf94147/src/include/utils/memutils.h#L42
   
   I'll take this later this week unless someone else volunteers.
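
For illustration, a fail-fast check of the kind discussed in this thread might look roughly like the sketch below. It is not DataFusion's actual implementation: the function name `rpad_checked` and the constant `MAX_RESULT_BYTES` are hypothetical, and the limit value is an assumption borrowed from Postgres's `MaxAllocSize` (`0x3fffffff`, 1 GiB minus one byte). The size estimate conservatively assumes up to 4 bytes per UTF-8 character.

```rust
/// Upper bound on the byte size of one result value.
/// Assumption: same value as Postgres's MaxAllocSize (1 GiB - 1).
const MAX_RESULT_BYTES: u64 = 0x3fff_ffff;

/// Hypothetical helper: right-pads `input` to `length` characters with `fill`,
/// rejecting requests whose worst-case size exceeds MAX_RESULT_BYTES
/// before any large allocation is attempted.
fn rpad_checked(input: &str, length: i64, fill: &str) -> Result<String, String> {
    // Negative lengths are treated as zero.
    let target = length.max(0) as u64;

    // A UTF-8 character takes at most 4 bytes, so target * 4 bounds the result size.
    match target.checked_mul(4) {
        Some(bytes) if bytes <= MAX_RESULT_BYTES => {}
        _ => return Err("requested length too large".to_string()),
    }
    // Safe: target is at most MAX_RESULT_BYTES / 4 after the check above.
    let target = target as usize;

    let input_len = input.chars().count();
    if target <= input_len {
        // The input is already long enough: truncate to `target` characters.
        return Ok(input.chars().take(target).collect());
    }
    if fill.is_empty() {
        // Nothing to pad with: return the input unchanged.
        return Ok(input.to_string());
    }

    // Append characters from `fill`, repeating it as often as needed.
    let mut out = String::from(input);
    out.extend(fill.chars().cycle().take(target - input_len));
    Ok(out)
}
```

With a check like this, `rpad_checked("a", 1073741824, "a")` returns the error immediately instead of trying to build a 1 GiB string for each of the 145 rows in the reproduction. A per-value limit only bounds a single result, though; it does not by itself cap the total memory used by a query.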

