Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2018/01/21 14:25:00 UTC

[jira] [Comment Edited] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

    [ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333540#comment-16333540 ] 

Xiao Li edited comment on SPARK-23171 at 1/21/18 2:24 PM:
----------------------------------------------------------

cc [~maropu] This is an umbrella JIRA. Feel free to create sub-tasks.


was (Author: smilegator):
cc [~maropu]

> Reduce the time costs of the rule runs that do not change the plans 
> --------------------------------------------------------------------
>
>                 Key: SPARK-23171
>                 URL: https://issues.apache.org/jira/browse/SPARK-23171
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Priority: Major
>
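> Below are the timing statistics for the Analyzer/Optimizer rules. The goal is to improve the rules and reduce their time costs, especially for the runs that do not change the plans. The time columns appear to be in nanoseconds, which is consistent with the reported total of about 20.7 seconds.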
> {noformat}
> === Metrics of Analyzer/Optimizer Rules ===
> Total number of runs = 175827
> Total time: 20.699042877 seconds
> Rule                                                                                               Total Time             Effective Time         Total Runs             Effective Runs        
> org.apache.spark.sql.catalyst.optimizer.ColumnPruning                                              2340563794             1338268224             1875                   761                   
> org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution                                    1632672623             1625071881             788                    37                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions                          1395087131             347339931              1982                   38                    
> org.apache.spark.sql.catalyst.optimizer.PruneFilters                                               1177711364             21344174               1590                   3                     
> org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries                               1145135465             1131417128             285                    39                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences                                  1008347217             663112062              1982                   616                   
> org.apache.spark.sql.catalyst.optimizer.ReorderJoin                                                767024424              693001699              1590                   132                   
> org.apache.spark.sql.catalyst.analysis.Analyzer$FixNullability                                     598524650              40802876               742                    12                    
> org.apache.spark.sql.catalyst.analysis.DecimalPrecision                                            595384169              436153128              1982                   211                   
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery                                    548178270              459695885              1982                   49                    
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts                              423002864              139869503              1982                   86                    
> org.apache.spark.sql.catalyst.optimizer.BooleanSimplification                                      405544962              17250184               1590                   7                     
> org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin                                   383837603              284174662              1590                   708                   
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases                                     372901885              3362332                1590                   9                     
> org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints                                364628214              343815519              285                    192                   
> org.apache.spark.sql.execution.datasources.FindDataSourceTable                                     303293296              285344766              1982                   233                   
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions                                   233195019              92648171               1982                   294                   
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion                     220568919              73932736               1982                   38                    
> org.apache.spark.sql.catalyst.optimizer.NullPropagation                                            207976072              9072305                1590                   26                    
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings                                 207027618              37834145               1982                   40                    
> org.apache.spark.sql.catalyst.optimizer.PushDownPredicate                                          203382836              176482044              1590                   783                   
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion                                   192152216              15738573               1982                   1                     
> org.apache.spark.sql.catalyst.optimizer.ConstantFolding                                            191624610              58857553               1590                   126                   
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion                               183008262              78280172               1982                   29                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator                                   176935299              0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.ResolveTimeZone                                             170161002              74354990               1982                   417                   
> org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator                                 166173174              0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.OptimizeIn                                                 155410763              8197045                1590                   16                    
> org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions                          153726565              0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion                                     153013269              0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.SimplifyCasts                                              146693495              13537077               1590                   69                    
> org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison                                   144818581              0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations                             143943308              6889302                1982                   27                    
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division                                       142925142              12653147               1982                   8                     
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality                                142775965              0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals                                       141509150              0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.LikeSimplification                                         132387762              636851                 1590                   1                     
> org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions                               127412361              0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame                                 126772671              9317887                1982                   21                    
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion                                 116484407              0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion                                    115402736              0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct                                    115071447              0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder                                 113115366              4563584                1982                   14                    
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion                            107747140              0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin                                         105020607              13907906               1590                   11                    
> org.apache.spark.sql.catalyst.analysis.TimeWindowing                                               101018029              0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery                            98043747               7044358                1590                   7                     
> org.apache.spark.sql.catalyst.optimizer.ConstantPropagation                                        95173536               0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion                                  94134701               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics                           84419135               33892351               1982                   11                    
> org.apache.spark.sql.execution.datasources.DataSourceAnalysis                                      83297816               77023484               742                    24                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer                                77880196               36980636               1982                   148                   
> org.apache.spark.sql.execution.datasources.PreprocessTableCreation                                 74091407               0                      742                    0                     
> org.apache.spark.sql.catalyst.analysis.CleanupAliases                                              73837147               37105855               1086                   344                   
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantProject                                     73534618               31752937               1875                   344                   
> org.apache.spark.sql.execution.datasources.v2.PushDownOperatorsToDataSource                        70120541               0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.FoldablePropagation                                        67941776               0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions                           62917712               22092402               1982                   23                    
> org.apache.spark.sql.catalyst.optimizer.CombineFilters                                             61116313               41021442               1590                   449                   
> org.apache.spark.sql.catalyst.optimizer.CollapseProject                                            60872313               30994661               1875                   279                   
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases                                     58453489               12511798               1982                   47                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions                                    58154315               0                      750                    0                     
> org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions                               54678669               0                      285                    0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences                           53518211               7209138                1982                   8                     
> org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates                                 45840637               29436271               285                    23                    
> org.apache.spark.sql.catalyst.optimizer.CollapseRepartition                                        43321502               0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic                            42117785               0                      742                    0                     
> org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime                                         40843184               0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion                                 39997563               5899863                1590                   10                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations                                   39412748               22359409               1990                   233                   
> org.apache.spark.sql.catalyst.optimizer.CombineUnions                                              38823264               1534424                1875                   17                    
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes                         38712372               7912192                1982                   9                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF                             38281659               0                      742                    0                     
> org.apache.spark.sql.catalyst.optimizer.DecimalAggregates                                          38277381               17245272               385                    100                   
> org.apache.spark.sql.execution.datasources.ResolveSQLOnFile                                        37342019               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates                                   36958378               1207331                1982                   46                    
> org.apache.spark.sql.catalyst.optimizer.CombineLimits                                              36794793               0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.LimitPushDown                                              36378469               0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance                                 34611065               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast                                      33734785               0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.EliminateSorts                                             33731370               0                      1590                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy                  33251765               1395920                1982                   4                     
> org.apache.spark.sql.catalyst.optimizer.EliminateSerialization                                     30890996               0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.CollapseWindow                                             29512740               0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin                               29396498               1492235                300                    7                     
> org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery                                   29301037               21706110               285                    148                   
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy                           23819074               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals                                23136089               10062248               788                    4                     
> org.apache.spark.sql.execution.datasources.PreprocessTableInsertion                                20886216               0                      742                    0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot                                       20639329               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions                                 20293829               0                      1990                   0                     
> org.apache.spark.sql.catalyst.analysis.ResolveInlineTables                                         20255898               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveBroadcastHints                          20250460               0                      750                    0                     
> org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions 19990727               39271                  8280                   26                    
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate                                    19578333               0                      1982                   0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases                       19414993               0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts                                     19291402               0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions                                         18790135               0                      285                    0                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin                         18535762               0                      1982                   0                     
> org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects                                        17835919               0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation                                     15200130               1525030                288                    3                     
> org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase                                         14490778               0                      285                    0                     
> org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases                                    14021504               12790020               285                    215                   
> org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates                                  13439887               0                      285                    0                     
> org.apache.spark.sql.catalyst.analysis.EliminateBarriers                                           12336513               0                      1086                   0                     
> org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery                                           12082986               0                      285                    0                     
> org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences                                       10792280               0                      742                    0                     
> org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate                                8978897                0                      285                    0                     
> org.apache.spark.sql.catalyst.analysis.EliminateUnions                                             8886439                0                      788                    0                     
> org.apache.spark.sql.catalyst.analysis.AliasViewChild                                              8317231                0                      742                    0                     
> org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions                       7964788                184237                 286                    1                     
> org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution                                7396593                0                      788                    0                     
> org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints                                 6986385                0                      750                    0                     
> org.apache.spark.sql.catalyst.analysis.EliminateView                                               6518436                0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation                                     6452598                0                      288                    0                     
> org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions                          5510866                0                      286                    0                     
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter                                    5393429                0                      300                    0                     
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateArrayOps                                     5296187                0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateStructOps                                    5261249                0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin                                  5152594                925260                 300                    1                     
> org.apache.spark.sql.catalyst.optimizer.CombineConcats                                             4916416                0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters                                        4810314                0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateMapOps                                       4674195                0                      1590                   0                     
> org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate                               4406136                727433                 300                    15                    
> org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate                            4252456                0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.EliminateDistinct                                          1920392                0                      285                    0                     
> org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder                                       1855658                0                      285                    0                     
>      
> {noformat}
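
One way to find the rules that matter most for this umbrella is to rank them by the time spent in runs that did not change the plan, i.e. Total Time minus Effective Time. The sketch below is only an illustration, not part of Spark: it parses a few rows copied from the table above and sorts them by that difference. The names RuleStat, parse_metrics and rank_by_ineffective_time are hypothetical, and the nanosecond unit is an assumption inferred from the reported total.

```python
from typing import List, NamedTuple

# A few rows copied verbatim from the metrics table above (columns:
# rule, total time, effective time, total runs, effective runs).
SAMPLE = """\
> org.apache.spark.sql.catalyst.optimizer.ColumnPruning            2340563794  1338268224  1875  761
> org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution  1632672623  1625071881  788   37
> org.apache.spark.sql.catalyst.optimizer.PruneFilters             1177711364  21344174    1590  3
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 176935299   0           1982  0
"""


class RuleStat(NamedTuple):
    """One row of the metrics table (hypothetical helper type)."""
    rule: str
    total_ns: int        # assumed to be nanoseconds
    effective_ns: int    # time spent in runs that changed the plan
    total_runs: int
    effective_runs: int

    @property
    def ineffective_ns(self) -> int:
        # Time spent in runs that left the plan unchanged.
        return self.total_ns - self.effective_ns

    @property
    def ineffective_runs(self) -> int:
        return self.total_runs - self.effective_runs


def parse_metrics(text: str) -> List[RuleStat]:
    """Parse table rows, skipping headers and any line that is not a data row."""
    stats = []
    for line in text.splitlines():
        # Drop the leading "> " quote marker used in the e-mail.
        fields = line.lstrip("> ").split()
        if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
            continue
        stats.append(RuleStat(fields[0], *map(int, fields[1:])))
    return stats


def rank_by_ineffective_time(stats: List[RuleStat]) -> List[RuleStat]:
    """Sort rules so that the largest wasted time comes first."""
    return sorted(stats, key=lambda s: s.ineffective_ns, reverse=True)


if __name__ == "__main__":
    for s in rank_by_ineffective_time(parse_metrics(SAMPLE)):
        short_name = s.rule.rsplit(".", 1)[-1]
        print(f"{short_name:35s} {s.ineffective_ns / 1e6:10.1f} ms "
              f"over {s.ineffective_runs} no-op runs")
```

On these four rows, PruneFilters ranks first even though ColumnPruning has the larger total time, because almost all of its runs leave the plan unchanged. Feeding the full table to the same functions would give a candidate ordering for sub-tasks.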


