Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2018/01/21 14:25:00 UTC
[jira] [Comment Edited] (SPARK-23171) Reduce the time costs of the
rule runs that do not change the plans
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333540#comment-16333540 ]
Xiao Li edited comment on SPARK-23171 at 1/21/18 2:24 PM:
----------------------------------------------------------
cc [~maropu] This is an umbrella JIRA. Feel free to create sub-tasks.
was (Author: smilegator):
cc [~maropu]
> Reduce the time costs of the rule runs that do not change the plans
> --------------------------------------------------------------------
>
> Key: SPARK-23171
> URL: https://issues.apache.org/jira/browse/SPARK-23171
> Project: Spark
> Issue Type: Umbrella
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Priority: Major
>
> Below are the time stats of the Analyzer/Optimizer rules. Try to improve the rules and reduce their time costs, especially for the runs that do not change the plans.
> {noformat}
> === Metrics of Analyzer/Optimizer Rules ===
> Total number of runs = 175827
> Total time: 20.699042877 seconds
> Rule Total Time Effective Time Total Runs Effective Runs
> org.apache.spark.sql.catalyst.optimizer.ColumnPruning 2340563794 1338268224 1875 761
> org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution 1632672623 1625071881 788 37
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions 1395087131 347339931 1982 38
> org.apache.spark.sql.catalyst.optimizer.PruneFilters 1177711364 21344174 1590 3
> org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries 1145135465 1131417128 285 39
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences 1008347217 663112062 1982 616
> org.apache.spark.sql.catalyst.optimizer.ReorderJoin 767024424 693001699 1590 132
> org.apache.spark.sql.catalyst.analysis.Analyzer$FixNullability 598524650 40802876 742 12
> org.apache.spark.sql.catalyst.analysis.DecimalPrecision 595384169 436153128 1982 211
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery 548178270 459695885 1982 49
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts 423002864 139869503 1982 86
> org.apache.spark.sql.catalyst.optimizer.BooleanSimplification 405544962 17250184 1590 7
> org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin 383837603 284174662 1590 708
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases 372901885 3362332 1590 9
> org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints 364628214 343815519 285 192
> org.apache.spark.sql.execution.datasources.FindDataSourceTable 303293296 285344766 1982 233
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions 233195019 92648171 1982 294
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion 220568919 73932736 1982 38
> org.apache.spark.sql.catalyst.optimizer.NullPropagation 207976072 9072305 1590 26
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings 207027618 37834145 1982 40
> org.apache.spark.sql.catalyst.optimizer.PushDownPredicate 203382836 176482044 1590 783
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion 192152216 15738573 1982 1
> org.apache.spark.sql.catalyst.optimizer.ConstantFolding 191624610 58857553 1590 126
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion 183008262 78280172 1982 29
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 176935299 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveTimeZone 170161002 74354990 1982 417
> org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator 166173174 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.OptimizeIn 155410763 8197045 1590 16
> org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions 153726565 0 1590 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion 153013269 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCasts 146693495 13537077 1590 69
> org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison 144818581 0 1590 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations 143943308 6889302 1982 27
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division 142925142 12653147 1982 8
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality 142775965 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals 141509150 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.LikeSimplification 132387762 636851 1590 1
> org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions 127412361 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame 126772671 9317887 1982 21
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion 116484407 0 1982 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion 115402736 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct 115071447 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder 113115366 4563584 1982 14
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion 107747140 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin 105020607 13907906 1590 11
> org.apache.spark.sql.catalyst.analysis.TimeWindowing 101018029 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery 98043747 7044358 1590 7
> org.apache.spark.sql.catalyst.optimizer.ConstantPropagation 95173536 0 1590 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion 94134701 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics 84419135 33892351 1982 11
> org.apache.spark.sql.execution.datasources.DataSourceAnalysis 83297816 77023484 742 24
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer 77880196 36980636 1982 148
> org.apache.spark.sql.execution.datasources.PreprocessTableCreation 74091407 0 742 0
> org.apache.spark.sql.catalyst.analysis.CleanupAliases 73837147 37105855 1086 344
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantProject 73534618 31752937 1875 344
> org.apache.spark.sql.execution.datasources.v2.PushDownOperatorsToDataSource 70120541 0 285 0
> org.apache.spark.sql.catalyst.optimizer.FoldablePropagation 67941776 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions 62917712 22092402 1982 23
> org.apache.spark.sql.catalyst.optimizer.CombineFilters 61116313 41021442 1590 449
> org.apache.spark.sql.catalyst.optimizer.CollapseProject 60872313 30994661 1875 279
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases 58453489 12511798 1982 47
> org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions 58154315 0 750 0
> org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions 54678669 0 285 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences 53518211 7209138 1982 8
> org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates 45840637 29436271 285 23
> org.apache.spark.sql.catalyst.optimizer.CollapseRepartition 43321502 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic 42117785 0 742 0
> org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime 40843184 0 285 0
> org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion 39997563 5899863 1590 10
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations 39412748 22359409 1990 233
> org.apache.spark.sql.catalyst.optimizer.CombineUnions 38823264 1534424 1875 17
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes 38712372 7912192 1982 9
> org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF 38281659 0 742 0
> org.apache.spark.sql.catalyst.optimizer.DecimalAggregates 38277381 17245272 385 100
> org.apache.spark.sql.execution.datasources.ResolveSQLOnFile 37342019 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates 36958378 1207331 1982 46
> org.apache.spark.sql.catalyst.optimizer.CombineLimits 36794793 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.LimitPushDown 36378469 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance 34611065 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast 33734785 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.EliminateSorts 33731370 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy 33251765 1395920 1982 4
> org.apache.spark.sql.catalyst.optimizer.EliminateSerialization 30890996 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.CollapseWindow 29512740 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin 29396498 1492235 300 7
> org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery 29301037 21706110 285 148
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy 23819074 0 1982 0
> org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals 23136089 10062248 788 4
> org.apache.spark.sql.execution.datasources.PreprocessTableInsertion 20886216 0 742 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot 20639329 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions 20293829 0 1990 0
> org.apache.spark.sql.catalyst.analysis.ResolveInlineTables 20255898 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveBroadcastHints 20250460 0 750 0
> org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions 19990727 39271 8280 26
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate 19578333 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 19414993 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts 19291402 0 285 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions 18790135 0 285 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin 18535762 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects 17835919 0 285 0
> org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation 15200130 1525030 288 3
> org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase 14490778 0 285 0
> org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases 14021504 12790020 285 215
> org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates 13439887 0 285 0
> org.apache.spark.sql.catalyst.analysis.EliminateBarriers 12336513 0 1086 0
> org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery 12082986 0 285 0
> org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences 10792280 0 742 0
> org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate 8978897 0 285 0
> org.apache.spark.sql.catalyst.analysis.EliminateUnions 8886439 0 788 0
> org.apache.spark.sql.catalyst.analysis.AliasViewChild 8317231 0 742 0
> org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 7964788 184237 286 1
> org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution 7396593 0 788 0
> org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints 6986385 0 750 0
> org.apache.spark.sql.catalyst.analysis.EliminateView 6518436 0 285 0
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation 6452598 0 288 0
> org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions 5510866 0 286 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter 5393429 0 300 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateArrayOps 5296187 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateStructOps 5261249 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin 5152594 925260 300 1
> org.apache.spark.sql.catalyst.optimizer.CombineConcats 4916416 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters 4810314 0 285 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateMapOps 4674195 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate 4406136 727433 300 15
> org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate 4252456 0 285 0
> org.apache.spark.sql.catalyst.optimizer.EliminateDistinct 1920392 0 285 0
> org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder 1855658 0 285 0
>
> {noformat}
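One way to read the table above is to rank rules by the time spent in runs that did not change the plan, i.e. Total Time minus Effective Time. The following is a minimal sketch of that arithmetic, not part of Spark: the names `RuleStat`, `parse_row`, and `rank_by_wasted_time` are hypothetical, and it assumes the time columns are in nanoseconds (which appears consistent with the reported total of about 20.7 seconds). The sample rows are copied from the table.

```python
from typing import List, NamedTuple


class RuleStat(NamedTuple):
    rule: str
    total_ns: int        # assumed unit: nanoseconds
    effective_ns: int    # time spent in runs that changed the plan
    total_runs: int
    effective_runs: int

    @property
    def wasted_ns(self) -> int:
        # Time spent in runs that left the plan unchanged.
        return self.total_ns - self.effective_ns

    @property
    def ineffective_runs(self) -> int:
        return self.total_runs - self.effective_runs


def parse_row(line: str) -> RuleStat:
    # Drop the "> " quote marker, then split off the four trailing numbers.
    rule, total, effective, runs, eff_runs = line.lstrip("> ").rsplit(None, 4)
    return RuleStat(rule, int(total), int(effective), int(runs), int(eff_runs))


def rank_by_wasted_time(lines: List[str]) -> List[RuleStat]:
    # Largest "wasted" time first.
    return sorted((parse_row(l) for l in lines),
                  key=lambda s: s.wasted_ns, reverse=True)


SAMPLE = [
    "> org.apache.spark.sql.catalyst.optimizer.ColumnPruning 2340563794 1338268224 1875 761",
    "> org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution 1632672623 1625071881 788 37",
    "> org.apache.spark.sql.catalyst.optimizer.PruneFilters 1177711364 21344174 1590 3",
    "> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 176935299 0 1982 0",
]

if __name__ == "__main__":
    for stat in rank_by_wasted_time(SAMPLE):
        print(f"{stat.rule}: {stat.wasted_ns / 1e6:.1f} ms in "
              f"{stat.ineffective_runs} ineffective runs")
```

On these four rows, PruneFilters ranks first: almost all of its time goes to runs that leave the plan unchanged, whereas CTESubstitution has a high total but very little time outside effective runs.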