Posted to issues@spark.apache.org by "Franck Tago (Jira)" <ji...@apache.org> on 2023/08/10 18:04:00 UTC

[jira] [Updated] (SPARK-44759) Do not combine multiple Generate operators in the same WholeStageCodeGen node because it can easily cause OOM failures if arrays are relatively large

     [ https://issues.apache.org/jira/browse/SPARK-44759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Franck Tago updated SPARK-44759:
--------------------------------
    Summary: Do not combine multiple Generate operators in the same WholeStageCodeGen node because it can  easily cause OOM failures if arrays are relatively large  (was: Do not combine multiple Generate nodes in the same WholeStageCodeGen node because it can  easily cause OOM failures if arrays are relatively large)

> Do not combine multiple Generate operators in the same WholeStageCodeGen node because it can  easily cause OOM failures if arrays are relatively large
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-44759
>                 URL: https://issues.apache.org/jira/browse/SPARK-44759
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0, 3.2.2, 3.3.1, 3.2.3, 3.2.4, 3.3.2, 3.4.0, 3.4.1
>            Reporter: Franck Tago
>            Priority: Major
>         Attachments: image-2023-08-10-09-27-24-124.png, image-2023-08-10-09-29-24-804.png, image-2023-08-10-09-32-46-163.png, image-2023-08-10-09-33-47-788.png, wholestagecodegen_wc1_debug_wholecodegen_passed
>
>
> This issue has existed since the WholeStageCodeGen (WSCG) implementation of the Generate node.
> The Generate node used to flatten an array generally produces significantly more output rows than it receives as input.
> The number of output rows is drastically higher still when flattening a nested array; for example, a single input row holding an array of 1,000 arrays of 1,000 elements expands into 1,000,000 output rows.
> When more than one Generate node is combined into the same WholeStageCodeGen node, we run a high risk of running out of memory, for multiple reasons:
> 1. As the attachments show, the rows created in the nested loop are buffered in the writer buffer. In my case, because the rows were large, I hit an OutOfMemoryError.
> 2. The generated WholeStageCodeGen code results in a nested loop that, for each input row, explodes the parent array and then explodes the inner array (see the sketch below). This is prone to OutOfMemory errors.
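> 
> A minimal reproduction sketch of the shape described above (the column names and array sizes are illustrative assumptions, not taken from the attached job):
> {code:scala}
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.functions.explode
> 
> // Hypothetical standalone repro; names and sizes are illustrative only.
> val spark = SparkSession.builder().appName("nested-explode-repro").getOrCreate()
> import spark.implicits._
> 
> // One input row carrying a large nested array: 1,000 inner arrays of 1,000 elements each.
> val df = Seq((1, Seq.fill(1000)(Seq.fill(1000)("payload")))).toDF("id", "outer_array")
> 
> // Two Generate (explode) operators: flatten the outer array, then the inner one.
> val flattened = df
>   .select($"id", explode($"outer_array").as("inner_array"))
>   .select($"id", explode($"inner_array").as("element"))
> 
> // In the affected versions, the physical plan shows both Generate operators fused
> // into one WholeStageCodegen stage, i.e. a generated nested loop that produces
> // 1,000,000 output rows for the single input row.
> flattened.explain()
> {code}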
>  
> Please see the attached Spark UI screenshots and Spark DAG.
> In my case, the WholeStageCodeGen node includes two explode (Generate) nodes.
> Because the array elements are large, the job ends with an Out Of Memory error.
>  
> I recommend that we do not merge multiple explode nodes into the same WholeStageCodeGen node, since doing so leads to potential memory issues.
> In our case, the job failed with an OOM error because the generated WSCG code executed a nested for loop.
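> 
> As a stopgap rather than a fix for the planner, whole-stage code generation can be disabled for the affected job so that the two Generate operators run as separate operators instead of one generated nested loop. This is only a workaround sketch and trades codegen performance for lower peak memory:
> {code:scala}
> // Workaround sketch: keep each Generate (explode) operator out of a fused,
> // generated nested loop by disabling whole-stage codegen for this session.
> spark.conf.set("spark.sql.codegen.wholeStage", "false")
> {code}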
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org