Posted to issues@spark.apache.org by "Izek Greenfield (JIRA)" <ji...@apache.org> on 2018/08/13 13:04:00 UTC
[jira] [Comment Edited] (SPARK-25094) proccesNext() failed to compile size is over 64kb

    [ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578127#comment-16578127 ] 

Izek Greenfield edited comment on SPARK-25094 at 8/13/18 1:03 PM:
------------------------------------------------------------------

The code that creates this plan is very complex.
I will try to reproduce it with simpler code. In the meantime, I have attached the generated code so you can see the problem: the code generator does not create separate functions and instead inlines the whole plan into the processNext method. [^generated_code.txt]
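
To make the "no functions, everything inlined" point concrete, here is a toy sketch in plain Python. It is not Spark's code generator: the function names, the `helper_N` naming, and the `max_per_method` threshold are all hypothetical. It only contrasts emitting every statement into one method body with emitting small helper methods that processNext() calls. The JVM rejects any single method whose bytecode exceeds 64 KB, which is why the first shape fails for a large enough plan.

```python
from typing import List


def generate_inlined(statements: List[str]) -> str:
    """Emit every statement into a single method body (the failing shape)."""
    body = "\n".join(f"    {s}" for s in statements)
    return f"void processNext() {{\n{body}\n}}"


def generate_split(statements: List[str], max_per_method: int) -> str:
    """Emit helper methods holding at most max_per_method statements each,
    plus a small processNext() that only calls the helpers in order."""
    chunks = [statements[i:i + max_per_method]
              for i in range(0, len(statements), max_per_method)]
    helpers = []
    for n, chunk in enumerate(chunks):
        body = "\n".join(f"    {s}" for s in chunk)
        helpers.append(f"void helper_{n}() {{\n{body}\n}}")
    calls = "\n".join(f"    helper_{n}();" for n in range(len(chunks)))
    return "\n\n".join(helpers + [f"void processNext() {{\n{calls}\n}}"])


# Hypothetical statements standing in for generated expression code.
statements = ["evalCase0();", "evalCase1();", "evalCase2();"]
inlined = generate_inlined(statements)
split = generate_split(statements, max_per_method=2)
print(split)
```

A real generator also has to pass row state and intermediate values between the helpers, which this sketch ignores.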

The plan contains 2 DataFrames, one with 80 columns, 10 of which are built from `case when` expressions like this one:
CASE
  WHEN (`predefined_hc` IS NOT NULL) THEN '/Predefined_hc/'
  WHEN (`zero_volatility_adj_ind` = 'Y') THEN '/Zero_Haircuts_cases/'
  WHEN (`collateral_allocation_method` = 'FCSM') THEN '/FCSM_Collaterals/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) AND ((`instrument_cqs_st` <= 4) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 4)))) AND (`residual_maturity_instrument` <= 12.0D)) THEN '/Debt/Central_Government_Issuer/Eligible/res_mat_1Y/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) AND ((`instrument_cqs_st` <= 4) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 4)))) AND (`residual_maturity_instrument` <= 60.0D)) THEN '/Debt/Central_Government_Issuer/Eligible/res_mat_5Y/'
  WHEN (((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) AND ((`instrument_cqs_st` <= 4) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 4)))) THEN '/Debt/Central_Government_Issuer/Eligible/res_mat_G5/'
  WHEN ((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) THEN '/Debt/Central_Government_Issuer/Non_Eligible/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 12.0D)) THEN '/Debt/Other_Issuers/Eligible/res_mat_1Y/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 60.0D)) THEN '/Debt/Other_Issuers/Eligible/res_mat_5Y/'
  WHEN (((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) THEN '/Debt/Other_Issuers/Eligible/res_mat_G5/'
  WHEN ((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) THEN '/Debt/Other_Issuers/Non_Eligible/'
  WHEN (((`underlying_type` = 'SECURITISATION') AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 12.0D)) THEN '/Securitisation/Eligible/res_mat_1Y/'
  WHEN (((`underlying_type` = 'SECURITISATION') AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 60.0D)) THEN '/Securitisation/Eligible/res_mat_5Y/'
  WHEN ((`underlying_type` = 'SECURITISATION') AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) THEN '/Securitisation/Eligible/res_mat_G5/'
  WHEN (`underlying_type` = 'SECURITISATION') THEN '/Securitisation/Non_Eligible/'
  WHEN ((`underlying_type` IN ('EQUITY', 'MAIN_INDEX_EQUITY', 'COMMODITY', 'NON_ELIGIBLE_SECURITY')) AND (`underlying_type` = 'MAIN_INDEX_EQUITY')) THEN '/Other_Securities/Main_index/'
  WHEN (`underlying_type` IN ('EQUITY', 'MAIN_INDEX_EQUITY', 'COMMODITY', 'NON_ELIGIBLE_SECURITY')) THEN '/Other_Securities/Others/'
  WHEN (`underlying_type` = 'CASH') THEN '/Cash/'
  WHEN (`underlying_type` = 'GOLD') THEN '/Gold/'
  WHEN (`underlying_type` = 'CIU') THEN '/CIU/'
  WHEN true THEN '/Others/'
END AS `108_0___Portfolio_CRD4_Art_224_Volatility_Adjustments_Codes____path_CRD4_Art_224_Volatil`
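
The expression above has a regular shape: an ordered list of (condition, path) pairs followed by a `WHEN true` catch-all. As a rough sketch of how such a column could be assembled from a rule list (plain Python, no Spark needed), something like the following would reproduce that shape. `build_case_when`, the shortened rule list, and the alias `path` are hypothetical and are not taken from the real application; labels are assumed to contain no single quotes, so no escaping is done.

```python
from typing import List, Tuple


def build_case_when(rules: List[Tuple[str, str]], default: str, alias: str) -> str:
    """Render ordered (condition, label) rules as one SQL CASE WHEN expression.

    Rules are emitted in list order, so earlier rules win, as in SQL.
    """
    branches = [f"WHEN ({cond}) THEN '{label}'" for cond, label in rules]
    # Catch-all branch, written the same way as in the expression above.
    branches.append(f"WHEN true THEN '{default}'")
    return "CASE " + " ".join(branches) + f" END AS `{alias}`"


# Hypothetical, shortened rule list using three of the simple branches.
rules = [
    ("`underlying_type` = 'CASH'", "/Cash/"),
    ("`underlying_type` = 'GOLD'", "/Gold/"),
    ("`underlying_type` = 'CIU'", "/CIU/"),
]
expr = build_case_when(rules, default="/Others/", alias="path")
print(expr)
```

Every extra rule adds one more branch to the expression, so ten columns of this size give the code generator a lot of expression code to emit for a single Project.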


was (Author: igreenfi):
The code that creates this plan is very complex.
I will try to reproduce it with simpler code. In the meantime, I have attached the generated code so you can see the problem: the code generator does not create separate functions and instead inlines the whole plan into the processNext method. [^generated_code.txt]

> proccesNext() failed to compile size is over 64kb
> -------------------------------------------------
>
>                 Key: SPARK-25094
>                 URL: https://issues.apache.org/jira/browse/SPARK-25094
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Izek Greenfield
>            Priority: Major
>         Attachments: generated_code.txt
>
>
> I have this tree:
> 2018-08-12T07:14:31,289 WARN  [] org.apache.spark.sql.execution.WholeStageCodegenExec - Whole-stage codegen disabled for plan (id=1):
>  *(1) Project [, ... 10 more fields]
> +- *(1) Filter NOT exposure_calc_method#10141 IN (UNSETTLED_TRANSACTIONS,FREE_DELIVERIES)
>    +- InMemoryTableScan [, ... 11 more fields], [NOT exposure_calc_method#10141 IN (UNSETTLED_TRANSACTIONS,FREE_DELIVERIES)]
>          +- InMemoryRelation [, ... 80 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                +- *(5) SortMergeJoin [unique_id#8506], [unique_id#8722], Inner
>                   :- *(2) Sort [unique_id#8506 ASC NULLS FIRST], false, 0
>                   :  +- Exchange(coordinator id: 1456511137) UnknownPartitioning(9), coordinator[target post-shuffle partition size: 67108864]
>                   :     +- *(1) Project [, ... 6 more fields]
>                   :        +- *(1) Filter (((((isnotnull(v#49) && isnotnull(run_id#52)) && (asof_date#48 <=> 17531)) && (run_id#52 = DATA_REG)) && (v#49 = DATA_REG)) && isnotnull(unique_id#39))
>                   :           +- InMemoryTableScan [, ... 6 more fields], [, ... 6 more fields]
>                   :                 +- InMemoryRelation [, ... 6 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                   :                       +- *(1) FileScan csv [,... 6 more fields] , ... 6 more fields
>                   +- *(4) Sort [unique_id#8722 ASC NULLS FIRST], false, 0
>                      +- Exchange(coordinator id: 1456511137) UnknownPartitioning(9), coordinator[target post-shuffle partition size: 67108864]
>                         +- *(3) Project [, ... 74 more fields]
>                            +- *(3) Filter (((isnotnull(v#51) && (asof_date#42 <=> 17531)) && (v#51 = DATA_REG)) && isnotnull(unique_id#54))
>                               +- InMemoryTableScan [, ... 74 more fields], [, ... 4 more fields]
>                                     +- InMemoryRelation [, ... 74 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                                           +- *(1) FileScan csv [,... 74 more fields] , ... 6 more fields
> Compiling "GeneratedClass": Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1" grows beyond 64 KB
> and the generated code failed to compile.


