Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2018/01/21 14:24:00 UTC
[jira] [Created] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans
Xiao Li created SPARK-23171:
-------------------------------
Summary: Reduce the time costs of the rule runs that do not change the plans
Key: SPARK-23171
URL: https://issues.apache.org/jira/browse/SPARK-23171
Project: Spark
Issue Type: Umbrella
Components: SQL
Affects Versions: 2.3.0
Reporter: Xiao Li
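Below are the timing statistics for the Analyzer/Optimizer rules. The goal is to improve the rules and reduce their time cost, especially for the runs that do not change the plan. Times are in nanoseconds; the "Effective" columns count only the runs that changed the plan.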
{noformat}
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs = 175827
Total time: 20.699042877 seconds
Rule Total Time Effective Time Total Runs Effective Runs
org.apache.spark.sql.catalyst.optimizer.ColumnPruning 2340563794 1338268224 1875 761
org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution 1632672623 1625071881 788 37
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions 1395087131 347339931 1982 38
org.apache.spark.sql.catalyst.optimizer.PruneFilters 1177711364 21344174 1590 3
org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries 1145135465 1131417128 285 39
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences 1008347217 663112062 1982 616
org.apache.spark.sql.catalyst.optimizer.ReorderJoin 767024424 693001699 1590 132
org.apache.spark.sql.catalyst.analysis.Analyzer$FixNullability 598524650 40802876 742 12
org.apache.spark.sql.catalyst.analysis.DecimalPrecision 595384169 436153128 1982 211
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery 548178270 459695885 1982 49
org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts 423002864 139869503 1982 86
org.apache.spark.sql.catalyst.optimizer.BooleanSimplification 405544962 17250184 1590 7
org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin 383837603 284174662 1590 708
org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases 372901885 3362332 1590 9
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints 364628214 343815519 285 192
org.apache.spark.sql.execution.datasources.FindDataSourceTable 303293296 285344766 1982 233
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions 233195019 92648171 1982 294
org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion 220568919 73932736 1982 38
org.apache.spark.sql.catalyst.optimizer.NullPropagation 207976072 9072305 1590 26
org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings 207027618 37834145 1982 40
org.apache.spark.sql.catalyst.optimizer.PushDownPredicate 203382836 176482044 1590 783
org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion 192152216 15738573 1982 1
org.apache.spark.sql.catalyst.optimizer.ConstantFolding 191624610 58857553 1590 126
org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion 183008262 78280172 1982 29
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 176935299 0 1982 0
org.apache.spark.sql.catalyst.analysis.ResolveTimeZone 170161002 74354990 1982 417
org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator 166173174 0 1590 0
org.apache.spark.sql.catalyst.optimizer.OptimizeIn 155410763 8197045 1590 16
org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions 153726565 0 1590 0
org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion 153013269 0 1982 0
org.apache.spark.sql.catalyst.optimizer.SimplifyCasts 146693495 13537077 1590 69
org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison 144818581 0 1590 0
org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations 143943308 6889302 1982 27
org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division 142925142 12653147 1982 8
org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality 142775965 0 1982 0
org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals 141509150 0 1590 0
org.apache.spark.sql.catalyst.optimizer.LikeSimplification 132387762 636851 1590 1
org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions 127412361 0 1590 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame 126772671 9317887 1982 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion 116484407 0 1982 0
org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion 115402736 0 1982 0
org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct 115071447 0 1982 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder 113115366 4563584 1982 14
org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion 107747140 0 1982 0
org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin 105020607 13907906 1590 11
org.apache.spark.sql.catalyst.analysis.TimeWindowing 101018029 0 1982 0
org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery 98043747 7044358 1590 7
org.apache.spark.sql.catalyst.optimizer.ConstantPropagation 95173536 0 1590 0
org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion 94134701 0 1982 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics 84419135 33892351 1982 11
org.apache.spark.sql.execution.datasources.DataSourceAnalysis 83297816 77023484 742 24
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer 77880196 36980636 1982 148
org.apache.spark.sql.execution.datasources.PreprocessTableCreation 74091407 0 742 0
org.apache.spark.sql.catalyst.analysis.CleanupAliases 73837147 37105855 1086 344
org.apache.spark.sql.catalyst.optimizer.RemoveRedundantProject 73534618 31752937 1875 344
org.apache.spark.sql.execution.datasources.v2.PushDownOperatorsToDataSource 70120541 0 285 0
org.apache.spark.sql.catalyst.optimizer.FoldablePropagation 67941776 0 1590 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions 62917712 22092402 1982 23
org.apache.spark.sql.catalyst.optimizer.CombineFilters 61116313 41021442 1590 449
org.apache.spark.sql.catalyst.optimizer.CollapseProject 60872313 30994661 1875 279
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases 58453489 12511798 1982 47
org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions 58154315 0 750 0
org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions 54678669 0 285 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences 53518211 7209138 1982 8
org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates 45840637 29436271 285 23
org.apache.spark.sql.catalyst.optimizer.CollapseRepartition 43321502 0 1590 0
org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic 42117785 0 742 0
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime 40843184 0 285 0
org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion 39997563 5899863 1590 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations 39412748 22359409 1990 233
org.apache.spark.sql.catalyst.optimizer.CombineUnions 38823264 1534424 1875 17
org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes 38712372 7912192 1982 9
org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF 38281659 0 742 0
org.apache.spark.sql.catalyst.optimizer.DecimalAggregates 38277381 17245272 385 100
org.apache.spark.sql.execution.datasources.ResolveSQLOnFile 37342019 0 1982 0
org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates 36958378 1207331 1982 46
org.apache.spark.sql.catalyst.optimizer.CombineLimits 36794793 0 1590 0
org.apache.spark.sql.catalyst.optimizer.LimitPushDown 36378469 0 1590 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance 34611065 0 1982 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast 33734785 0 1982 0
org.apache.spark.sql.catalyst.optimizer.EliminateSorts 33731370 0 1590 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy 33251765 1395920 1982 4
org.apache.spark.sql.catalyst.optimizer.EliminateSerialization 30890996 0 1590 0
org.apache.spark.sql.catalyst.optimizer.CollapseWindow 29512740 0 1590 0
org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin 29396498 1492235 300 7
org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery 29301037 21706110 285 148
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy 23819074 0 1982 0
org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals 23136089 10062248 788 4
org.apache.spark.sql.execution.datasources.PreprocessTableInsertion 20886216 0 742 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot 20639329 0 1982 0
org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions 20293829 0 1990 0
org.apache.spark.sql.catalyst.analysis.ResolveInlineTables 20255898 0 1982 0
org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveBroadcastHints 20250460 0 750 0
org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions 19990727 39271 8280 26
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate 19578333 0 1982 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 19414993 0 1982 0
org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts 19291402 0 285 0
org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions 18790135 0 285 0
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin 18535762 0 1982 0
org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects 17835919 0 285 0
org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation 15200130 1525030 288 3
org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase 14490778 0 285 0
org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases 14021504 12790020 285 215
org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates 13439887 0 285 0
org.apache.spark.sql.catalyst.analysis.EliminateBarriers 12336513 0 1086 0
org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery 12082986 0 285 0
org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences 10792280 0 742 0
org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate 8978897 0 285 0
org.apache.spark.sql.catalyst.analysis.EliminateUnions 8886439 0 788 0
org.apache.spark.sql.catalyst.analysis.AliasViewChild 8317231 0 742 0
org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 7964788 184237 286 1
org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution 7396593 0 788 0
org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints 6986385 0 750 0
org.apache.spark.sql.catalyst.analysis.EliminateView 6518436 0 285 0
org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation 6452598 0 288 0
org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions 5510866 0 286 0
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter 5393429 0 300 0
org.apache.spark.sql.catalyst.optimizer.SimplifyCreateArrayOps 5296187 0 1590 0
org.apache.spark.sql.catalyst.optimizer.SimplifyCreateStructOps 5261249 0 1590 0
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin 5152594 925260 300 1
org.apache.spark.sql.catalyst.optimizer.CombineConcats 4916416 0 1590 0
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters 4810314 0 285 0
org.apache.spark.sql.catalyst.optimizer.SimplifyCreateMapOps 4674195 0 1590 0
org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate 4406136 727433 300 15
org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate 4252456 0 285 0
org.apache.spark.sql.catalyst.optimizer.EliminateDistinct 1920392 0 285 0
org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder 1855658 0 285 0
{noformat}
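To see which rules spend the most time in runs that leave the plan unchanged, the dump above can be post-processed. The following is a minimal sketch in Python, not part of Spark. It assumes each data row has the shape "rule, total time, effective time, total runs, effective runs" and that times are in nanoseconds. The names parse_rows, ineffective_ranking and ineffective_ns are hypothetical, introduced only for illustration, and SAMPLE holds three rows copied from the table.

```python
# Sketch: rank rules by time spent in runs that did not change the plan.
# Assumption: each data row is
#   <rule name> <total ns> <effective ns> <total runs> <effective runs>
# The helper names below are hypothetical and are not part of Spark.

SAMPLE = """\
org.apache.spark.sql.catalyst.optimizer.ColumnPruning 2340563794 1338268224 1875 761
org.apache.spark.sql.catalyst.optimizer.PruneFilters 1177711364 21344174 1590 3
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 176935299 0 1982 0
"""


def parse_rows(text):
    """Parse metric rows; skip headers and any line that is not name + 4 integers."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5 or not all(p.isdigit() for p in parts[1:]):
            continue
        total_ns, effective_ns, total_runs, effective_runs = map(int, parts[1:])
        rows.append({
            "rule": parts[0],
            "total_ns": total_ns,
            "effective_ns": effective_ns,
            "total_runs": total_runs,
            "effective_runs": effective_runs,
            # Time spent in runs that left the plan unchanged.
            "ineffective_ns": total_ns - effective_ns,
        })
    return rows


def ineffective_ranking(rows):
    """Return rows sorted by ineffective time, largest first."""
    return sorted(rows, key=lambda r: r["ineffective_ns"], reverse=True)


if __name__ == "__main__":
    for r in ineffective_ranking(parse_rows(SAMPLE)):
        short_name = r["rule"].rsplit(".", 1)[-1]
        print(f'{short_name}: {r["ineffective_ns"] / 1e9:.3f} s ineffective, '
              f'{r["effective_runs"]}/{r["total_runs"]} runs effective')
```

On the three sample rows, PruneFilters ranks first: 1177711364 - 21344174 = 1156367190 ns, about 1.16 seconds, spent in runs that did not change the plan, with only 3 of its 1590 runs effective. Passing the whole dump to parse_rows instead of SAMPLE gives the same ranking for every rule.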