Posted to issues@spark.apache.org by "Yuming Wang (Jira)" <ji...@apache.org> on 2019/10/23 06:21:00 UTC
[jira] [Commented] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957591#comment-16957591 ]
Yuming Wang commented on SPARK-23171:
-------------------------------------
These are the rule metrics for a real SQL query from our production environment.
Spark 2.3.4:
{noformat}
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 1602
Total time: 25.87935196 seconds
Rule Effective Time / Total Time Effective Runs / Total Runs
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations 12560629829 / 12561649545 4 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions 0 / 10442916205 0 / 5
org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions 1655041748 / 1655084280 1 / 2
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences 217766453 / 256617622 8 / 21
org.apache.spark.sql.catalyst.analysis.DecimalPrecision 48636897 / 68610147 4 / 21
org.apache.spark.sql.catalyst.optimizer.ColumnPruning 16638517 / 53422588 1 / 15
org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion 26295695 / 50081268 2 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts 0 / 49518989 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings 24587790 / 49437868 2 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions 16488193 / 32838168 8 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 0 / 32290369 0 / 21
org.apache.spark.sql.catalyst.analysis.ResolveTimeZone 18041546 / 29396487 10 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$FixNullability 0 / 28650276 0 / 5
org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations 0 / 26619605 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion 0 / 26206521 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion 0 / 25036412 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality 0 / 24896919 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division 0 / 23821725 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame 0 / 22621115 0 / 21
org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct 0 / 22397612 0 / 21
org.apache.spark.sql.catalyst.analysis.EliminateView 22255584 / 22286242 1 / 2
org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion 0 / 21244351 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion 0 / 21032406 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion 0 / 20834511 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder 0 / 20644371 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion 0 / 20097683 0 / 21
org.apache.spark.sql.catalyst.analysis.TimeWindowing 0 / 19899978 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion 0 / 19819768 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery 0 / 18257140 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates 0 / 17304713 0 / 21
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints 11616056 / 11622509 1 / 2
org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin 5286165 / 8730109 8 / 13
org.apache.spark.sql.catalyst.analysis.CleanupAliases 4824121 / 8711007 4 / 9
org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases 0 / 8613203 0 / 13
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvedUuidExpressions 0 / 8607610 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases 0 / 8565199 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic 0 / 7195917 0 / 5
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions 0 / 5482631 0 / 21
org.apache.spark.sql.catalyst.optimizer.PruneFilters 0 / 5380936 0 / 13
org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin 0 / 5235551 0 / 13
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer 0 / 4861476 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast 0 / 4687780 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences 0 / 4312972 0 / 21
org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator 0 / 4091024 0 / 13
org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF 0 / 3874934 0 / 5
org.apache.spark.sql.catalyst.optimizer.NullPropagation 0 / 3643868 0 / 13
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance 0 / 3625700 0 / 21
org.apache.spark.sql.catalyst.optimizer.BooleanSimplification 0 / 3510012 0 / 13
org.apache.spark.sql.catalyst.optimizer.ConstantFolding 1056087 / 3490484 1 / 13
org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals 0 / 3377191 0 / 13
org.apache.spark.sql.catalyst.optimizer.OptimizeIn 0 / 3308039 0 / 13
org.apache.spark.sql.catalyst.optimizer.SimplifyCasts 873010 / 3101463 1 / 13
org.apache.spark.sql.execution.datasources.FindDataSourceTable 1637731 / 2953801 2 / 21
org.apache.spark.sql.catalyst.optimizer.PushDownPredicate 2620047 / 2750308 9 / 13
org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison 0 / 2597265 0 / 13
org.apache.spark.sql.catalyst.optimizer.LikeSimplification 0 / 2545636 0 / 13
org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions 0 / 2531310 0 / 13
org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions 0 / 2448239 0 / 13
org.apache.spark.sql.hive.RelationConversions 2236164 / 2419843 2 / 5
org.apache.spark.sql.catalyst.optimizer.CollapseProject 1978996 / 2332715 1 / 15
org.apache.spark.sql.execution.datasources.v2.PushDownOperatorsToDataSource 0 / 2246673 0 / 2
org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery 0 / 2224274 0 / 13
org.apache.spark.sql.hive.ResolveHiveSerdeTable 0 / 2032678 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy 242553 / 2015957 2 / 21
org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions 0 / 1919405 0 / 2
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics 0 / 1839578 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions 0 / 1779501 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate 0 / 1723020 0 / 21
org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes 0 / 1708473 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin 0 / 1699486 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 0 / 1663794 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy 0 / 1538037 0 / 21
org.apache.spark.sql.catalyst.analysis.ResolveInlineTables 0 / 1532276 0 / 21
org.apache.spark.sql.hive.DetermineTableStats 1330091 / 1506192 2 / 5
org.apache.spark.sql.execution.datasources.ResolveSQLOnFile 0 / 1425247 0 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot 0 / 1354366 0 / 21
org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions 0 / 1192548 0 / 21
org.apache.spark.sql.catalyst.optimizer.RemoveRedundantProject 0 / 1151354 0 / 15
org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates 0 / 1131560 0 / 2
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime 0 / 1063072 0 / 2
org.apache.spark.sql.catalyst.optimizer.FoldablePropagation 0 / 1056435 0 / 13
org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries 0 / 996221 0 / 2
org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase 0 / 944891 0 / 2
org.apache.spark.sql.catalyst.optimizer.SimplifyCreateStructOps 0 / 883333 0 / 13
org.apache.spark.sql.catalyst.optimizer.CombineConcats 0 / 768398 0 / 13
org.apache.spark.sql.catalyst.optimizer.SimplifyCreateMapOps 0 / 735980 0 / 13
org.apache.spark.sql.catalyst.optimizer.SimplifyCreateArrayOps 0 / 712176 0 / 13
org.apache.spark.sql.catalyst.optimizer.ReorderJoin 0 / 690185 0 / 13
org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts 0 / 612831 0 / 2
org.apache.spark.sql.catalyst.optimizer.CollapseRepartition 0 / 602290 0 / 13
org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveBroadcastHints 0 / 564259 0 / 5
org.apache.spark.sql.execution.datasources.PreprocessTableCreation 0 / 525547 0 / 5
org.apache.spark.sql.catalyst.optimizer.ConstantPropagation 0 / 496077 0 / 13
org.apache.spark.sql.catalyst.optimizer.LimitPushDown 0 / 476777 0 / 13
org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion 0 / 462829 0 / 13
org.apache.spark.sql.catalyst.optimizer.CombineUnions 0 / 438578 0 / 15
org.apache.spark.sql.catalyst.optimizer.EliminateSerialization 0 / 415754 0 / 13
org.apache.spark.sql.catalyst.optimizer.CombineLimits 0 / 407908 0 / 13
org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences 0 / 405535 0 / 5
org.apache.spark.sql.catalyst.optimizer.CollapseWindow 0 / 402997 0 / 13
org.apache.spark.sql.catalyst.optimizer.CombineFilters 0 / 398318 0 / 13
org.apache.spark.sql.execution.datasources.DataSourceAnalysis 0 / 386729 0 / 5
org.apache.spark.sql.catalyst.optimizer.EliminateSorts 0 / 376097 0 / 13
org.apache.spark.sql.execution.datasources.PreprocessTableInsertion 0 / 363548 0 / 5
org.apache.spark.sql.catalyst.optimizer.DecimalAggregates 0 / 361091 0 / 2
org.apache.spark.sql.hive.HiveAnalysis 0 / 346856 0 / 5
org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals 161014 / 320159 2 / 7
org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases 294073 / 301760 1 / 2
org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution 0 / 267902 0 / 7
org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects 0 / 266433 0 / 2
org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution 0 / 231803 0 / 7
org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints 0 / 217823 0 / 5
org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates 0 / 216419 0 / 2
org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 0 / 213748 0 / 2
org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder 0 / 212135 0 / 2
org.apache.spark.sql.catalyst.analysis.EliminateUnions 0 / 189606 0 / 7
org.apache.spark.sql.catalyst.optimizer.EliminateDistinct 0 / 158202 0 / 2
org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery 128177 / 141614 1 / 2
org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate 0 / 118593 0 / 2
org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions 0 / 114453 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter 0 / 105244 0 / 2
org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation 0 / 96550 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate 0 / 94772 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin 0 / 86855 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin 0 / 76847 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate 0 / 72060 0 / 2
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters 0 / 63774 0 / 2
org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation 0 / 54526 0 / 2
org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery 0 / 7935 0 / 2
{noformat}
Current Spark master:
{noformat}
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 809
Total time: 7.40435338 seconds
Rule Effective Time / Total Time Effective Runs / Total Runs
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations 3780875846 / 3781480020 2 / 10
org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions 1599320741 / 1599320741 1 / 1
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables 0 / 1225490761 0 / 10
org.apache.spark.sql.catalyst.analysis.DecimalPrecision 35972508 / 60328972 2 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences 37943083 / 47606219 4 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion 0 / 43309422 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion 17173986 / 34062144 1 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings 16720631 / 33556337 1 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 0 / 31747239 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts 0 / 30757304 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions 11944923 / 24268643 4 / 10
org.apache.spark.sql.catalyst.optimizer.ColumnPruning 9150218 / 22650020 1 / 6
org.apache.spark.sql.catalyst.analysis.ResolveTimeZone 16150346 / 22412466 5 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations 0 / 20843165 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion 0 / 20268419 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion 0 / 17699544 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality 0 / 17552654 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division 0 / 16521745 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion 0 / 16110248 0 / 10
org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct 0 / 13557604 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$MapZipWithCoercion 0 / 12606652 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion 0 / 12406786 0 / 10
org.apache.spark.sql.catalyst.analysis.ResolveHigherOrderFunctions 0 / 12277774 0 / 10
org.apache.spark.sql.catalyst.analysis.UpdateAttributeNullability 0 / 12139170 0 / 2
org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion 0 / 11727776 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame 0 / 11671694 0 / 10
org.apache.spark.sql.catalyst.analysis.TimeWindowing 0 / 11099451 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder 0 / 11092709 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery 0 / 10943998 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion 0 / 10812967 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates 0 / 9105516 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRandomSeed 0 / 7712591 0 / 10
org.apache.spark.sql.catalyst.optimizer.PushDownPredicates 4585267 / 6253772 2 / 5
org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic 0 / 6174724 0 / 2
org.apache.spark.sql.catalyst.analysis.EliminateView 6045809 / 6045809 1 / 1
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints 5772724 / 5772724 1 / 1
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases 0 / 5041788 0 / 10
org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases 0 / 4880449 0 / 4
org.apache.spark.sql.catalyst.optimizer.CollapseProject 3636936 / 4799040 1 / 5
org.apache.spark.sql.catalyst.optimizer.BooleanSimplification 0 / 4527543 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF 0 / 4396391 0 / 2
org.apache.spark.sql.catalyst.analysis.CleanupAliases 2170995 / 4155269 2 / 4
org.apache.spark.sql.catalyst.analysis.ResolveLambdaVariables 0 / 3844354 0 / 10
org.apache.spark.sql.catalyst.optimizer.NullPropagation 0 / 3779934 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions 0 / 3768403 0 / 10
org.apache.spark.sql.dynamicpruning.PartitionPruning 0 / 3637668 0 / 1
org.apache.spark.sql.catalyst.optimizer.ConstantFolding 1966816 / 3547813 1 / 4
org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator 0 / 3408174 0 / 4
org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions 0 / 3180418 0 / 4
org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators 0 / 3155734 0 / 6
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences 0 / 2963209 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer 0 / 2899378 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions 0 / 2744482 0 / 2
org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison 0 / 2743307 0 / 4
org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals 0 / 2716056 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance 0 / 2586002 0 / 10
org.apache.spark.sql.catalyst.optimizer.CollapseWindow 0 / 2425166 0 / 4
org.apache.spark.sql.catalyst.optimizer.SimplifyCasts 816626 / 2417491 1 / 4
org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion 0 / 2397208 0 / 4
org.apache.spark.sql.catalyst.optimizer.LimitPushDown 0 / 2328259 0 / 4
org.apache.spark.sql.catalyst.optimizer.SimplifyExtractValueOps 0 / 2302122 0 / 4
org.apache.spark.sql.execution.datasources.FindDataSourceTable 1551141 / 2274174 1 / 10
org.apache.spark.sql.catalyst.optimizer.PruneFilters 0 / 2231986 0 / 5
org.apache.spark.sql.catalyst.optimizer.OptimizeIn 0 / 2166996 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast 0 / 2122195 0 / 10
org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries 0 / 2106246 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceNullWithFalseInPredicate 0 / 2068681 0 / 4
org.apache.spark.sql.catalyst.optimizer.LikeSimplification 0 / 2022229 0 / 4
org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions 0 / 1971553 0 / 1
org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions 0 / 1901677 0 / 4
org.apache.spark.sql.catalyst.optimizer.EliminateResolvedHint 0 / 1803260 0 / 1
org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin 0 / 1760281 0 / 4
org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery 0 / 1717584 0 / 4
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime 0 / 1694752 0 / 1
org.apache.spark.sql.catalyst.optimizer.EliminateSerialization 0 / 1630254 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy 121702 / 1608980 1 / 10
org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase 0 / 1527025 0 / 1
org.apache.spark.sql.execution.python.ExtractPythonUDFs 0 / 1485685 0 / 1
org.apache.spark.sql.catalyst.optimizer.ReorderJoin 0 / 1355912 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAlterTable 0 / 1351841 0 / 10
org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates 0 / 1342298 0 / 1
org.apache.spark.sql.catalyst.optimizer.CollapseRepartition 0 / 1332767 0 / 4
org.apache.spark.sql.hive.RelationConversions 1247829 / 1321295 1 / 2
org.apache.spark.sql.hive.ResolveHiveSerdeTable 0 / 1316894 0 / 10
org.apache.spark.sql.catalyst.optimizer.CombineUnions 0 / 1311768 0 / 5
org.apache.spark.sql.catalyst.optimizer.PushDownLeftSemiAntiJoin 0 / 1266489 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics 0 / 1225179 0 / 10
org.apache.spark.sql.catalyst.optimizer.TransposeWindow 0 / 1197747 0 / 4
org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects 0 / 1192535 0 / 1
org.apache.spark.sql.execution.datasources.DataSourceResolution 0 / 1094121 0 / 10
org.apache.spark.sql.catalyst.optimizer.PushLeftSemiLeftAntiThroughJoin 0 / 1093378 0 / 4
org.apache.spark.sql.catalyst.optimizer.ConstantPropagation 0 / 1079753 0 / 4
org.apache.spark.sql.catalyst.optimizer.CombineLimits 0 / 1061371 0 / 4
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate 0 / 1027652 0 / 10
org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases 1021359 / 1021359 1 / 1
org.apache.spark.sql.catalyst.optimizer.EliminateSorts 0 / 1002209 0 / 4
org.apache.spark.sql.catalyst.optimizer.CombineFilters 0 / 1001260 0 / 4
org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation 0 / 961299 0 / 2
org.apache.spark.sql.dynamicpruning.CleanupDynamicPruningFilters 0 / 951293 0 / 1
org.apache.spark.sql.catalyst.analysis.ResolveInlineTables 0 / 928231 0 / 10
org.apache.spark.sql.execution.datasources.ResolveSQLOnFile 0 / 917800 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 0 / 907296 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy 0 / 902294 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot 0 / 895053 0 / 10
org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes 0 / 873187 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions 0 / 843732 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin 0 / 835185 0 / 10
org.apache.spark.sql.execution.datasources.FallBackFileSourceV2 0 / 827362 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOutputRelation 0 / 808076 0 / 10
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDescribeTable 0 / 727735 0 / 10
org.apache.spark.sql.catalyst.optimizer.ExtractPythonUDFFromJoinCondition 0 / 721076 0 / 1
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveInsertInto 0 / 696180 0 / 10
org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions 0 / 685500 0 / 10
org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation 0 / 668608 0 / 2
org.apache.spark.sql.catalyst.optimizer.NormalizeFloatingNumbers 0 / 640163 0 / 1
org.apache.spark.sql.hive.DetermineTableStats 539922 / 636902 1 / 2
org.apache.spark.sql.catalyst.optimizer.DecimalAggregates 0 / 617793 0 / 1
org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate 0 / 614443 0 / 1
org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery 551849 / 551849 1 / 1
org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 0 / 534189 0 / 1
org.apache.spark.sql.catalyst.optimizer.ReassignLambdaVariableID 0 / 513273 0 / 1
org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates 0 / 493635 0 / 1
org.apache.spark.sql.execution.python.ExtractGroupingPythonUDFFromAggregate 0 / 481898 0 / 1
org.apache.spark.sql.catalyst.optimizer.FoldablePropagation 0 / 478526 0 / 4
org.apache.spark.sql.execution.analysis.DetectAmbiguousSelfJoin 0 / 475748 0 / 2
org.apache.spark.sql.execution.datasources.PreprocessTableInsertion 0 / 387768 0 / 2
org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions 0 / 379434 0 / 1
org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughNonJoin 0 / 369548 0 / 1
org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate 0 / 369068 0 / 1
org.apache.spark.sql.catalyst.optimizer.RewriteIntersectAll 0 / 358247 0 / 1
org.apache.spark.sql.catalyst.optimizer.CombineConcats 0 / 355029 0 / 4
org.apache.spark.sql.catalyst.optimizer.RewriteExceptAll 0 / 348300 0 / 1
org.apache.spark.sql.catalyst.optimizer.OptimizeLimitZero 0 / 347372 0 / 1
org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin 0 / 332544 0 / 1
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter 0 / 330958 0 / 1
org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate 0 / 328091 0 / 1
org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences 0 / 321893 0 / 2
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin 0 / 317206 0 / 1
org.apache.spark.sql.catalyst.optimizer.ObjectSerializerPruning 0 / 295286 0 / 1
org.apache.spark.sql.catalyst.optimizer.RemoveRedundantSorts 0 / 292161 0 / 1
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters 0 / 288516 0 / 1
org.apache.spark.sql.execution.datasources.PreprocessTableCreation 0 / 285757 0 / 2
org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveJoinStrategyHints 0 / 275111 0 / 2
org.apache.spark.sql.hive.HiveAnalysis 0 / 232008 0 / 2
org.apache.spark.sql.execution.datasources.DataSourceAnalysis 0 / 216151 0 / 2
org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals 74290 / 168666 1 / 3
org.apache.spark.sql.catalyst.analysis.CTESubstitution 0 / 151684 0 / 3
org.apache.spark.sql.catalyst.optimizer.EliminateDistinct 0 / 131440 0 / 1
org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveCoalesceHints 0 / 116909 0 / 2
org.apache.spark.sql.catalyst.analysis.EliminateUnions 0 / 102132 0 / 3
org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution 0 / 97101 0 / 3
org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints 0 / 89386 0 / 2
org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder 0 / 85457 0 / 1
org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts 0 / 41539 0 / 2
org.apache.spark.sql.execution.datasources.SchemaPruning 0 / 19673 0 / 1
org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery 0 / 7795 0 / 1
{noformat}
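To compare dumps like the two above, it can help to rank rules by the time spent in runs that did not change the plan, that is, total time minus effective time. The following is a minimal sketch, not part of Spark: the names RuleMetric, parse_metric_line and top_wasted are hypothetical helpers, the sample lines are copied from the Spark 2.3.4 dump above, and the times are assumed to be in nanoseconds (which is consistent with the reported total of about 25.9 seconds).

```python
from typing import List, NamedTuple, Optional


class RuleMetric(NamedTuple):
    """One row of the 'Effective Time / Total Time Effective Runs / Total Runs' dump."""
    rule: str
    effective_ns: int
    total_ns: int
    effective_runs: int
    total_runs: int

    @property
    def wasted_ns(self) -> int:
        # Time spent in runs that left the plan unchanged (assumed nanoseconds).
        return self.total_ns - self.effective_ns


def parse_metric_line(line: str) -> Optional[RuleMetric]:
    """Parse 'rule eff / total effRuns / totalRuns'; return None for any other line."""
    parts = line.split()
    if len(parts) != 7 or parts[2] != "/" or parts[5] != "/":
        return None
    try:
        return RuleMetric(parts[0], int(parts[1]), int(parts[3]),
                          int(parts[4]), int(parts[6]))
    except ValueError:
        return None


def top_wasted(dump: str, n: int) -> List[RuleMetric]:
    """Return the n rules with the most time spent in non-effective runs."""
    metrics = [m for m in map(parse_metric_line, dump.splitlines()) if m is not None]
    return sorted(metrics, key=lambda m: m.wasted_ns, reverse=True)[:n]


# Sample rows copied from the Spark 2.3.4 dump; header lines are skipped by the parser.
SAMPLE = """\
=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 1602
Rule Effective Time / Total Time Effective Runs / Total Runs
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations 12560629829 / 12561649545 4 / 21
org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions 0 / 10442916205 0 / 5
org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions 1655041748 / 1655084280 1 / 2
"""

if __name__ == "__main__":
    for m in top_wasted(SAMPLE, 3):
        print(f"{m.rule}: {m.wasted_ns / 1e9:.3f} s in non-effective runs")
```

On these rows, ResolveRelations has the largest total time but almost all of it is effective, while LookupFunctions never changes the plan, so all of its time counts as non-effective.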
> Reduce the time costs of the rule runs that do not change the plans
> --------------------------------------------------------------------
>
> Key: SPARK-23171
> URL: https://issues.apache.org/jira/browse/SPARK-23171
> Project: Spark
> Issue Type: Umbrella
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Priority: Major
> Labels: bulk-closed
>
> Below are the time stats of the Analyzer/Optimizer rules. We should try to improve the rules and reduce their time costs, especially for the runs that do not change the plans.
> {noformat}
> === Metrics of Analyzer/Optimizer Rules ===
> Total number of runs = 175827
> Total time: 20.699042877 seconds
> Rule Total Time Effective Time Total Runs Effective Runs
> org.apache.spark.sql.catalyst.optimizer.ColumnPruning 2340563794 1338268224 1875 761
> org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution 1632672623 1625071881 788 37
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions 1395087131 347339931 1982 38
> org.apache.spark.sql.catalyst.optimizer.PruneFilters 1177711364 21344174 1590 3
> org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries 1145135465 1131417128 285 39
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences 1008347217 663112062 1982 616
> org.apache.spark.sql.catalyst.optimizer.ReorderJoin 767024424 693001699 1590 132
> org.apache.spark.sql.catalyst.analysis.Analyzer$FixNullability 598524650 40802876 742 12
> org.apache.spark.sql.catalyst.analysis.DecimalPrecision 595384169 436153128 1982 211
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery 548178270 459695885 1982 49
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts 423002864 139869503 1982 86
> org.apache.spark.sql.catalyst.optimizer.BooleanSimplification 405544962 17250184 1590 7
> org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin 383837603 284174662 1590 708
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases 372901885 3362332 1590 9
> org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints 364628214 343815519 285 192
> org.apache.spark.sql.execution.datasources.FindDataSourceTable 303293296 285344766 1982 233
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions 233195019 92648171 1982 294
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion 220568919 73932736 1982 38
> org.apache.spark.sql.catalyst.optimizer.NullPropagation 207976072 9072305 1590 26
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings 207027618 37834145 1982 40
> org.apache.spark.sql.catalyst.optimizer.PushDownPredicate 203382836 176482044 1590 783
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion 192152216 15738573 1982 1
> org.apache.spark.sql.catalyst.optimizer.ConstantFolding 191624610 58857553 1590 126
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion 183008262 78280172 1982 29
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator 176935299 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveTimeZone 170161002 74354990 1982 417
> org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator 166173174 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.OptimizeIn 155410763 8197045 1590 16
> org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions 153726565 0 1590 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion 153013269 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCasts 146693495 13537077 1590 69
> org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison 144818581 0 1590 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations 143943308 6889302 1982 27
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division 142925142 12653147 1982 8
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality 142775965 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals 141509150 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.LikeSimplification 132387762 636851 1590 1
> org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions 127412361 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame 126772671 9317887 1982 21
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion 116484407 0 1982 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion 115402736 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct 115071447 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder 113115366 4563584 1982 14
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion 107747140 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin 105020607 13907906 1590 11
> org.apache.spark.sql.catalyst.analysis.TimeWindowing 101018029 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery 98043747 7044358 1590 7
> org.apache.spark.sql.catalyst.optimizer.ConstantPropagation 95173536 0 1590 0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion 94134701 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics 84419135 33892351 1982 11
> org.apache.spark.sql.execution.datasources.DataSourceAnalysis 83297816 77023484 742 24
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer 77880196 36980636 1982 148
> org.apache.spark.sql.execution.datasources.PreprocessTableCreation 74091407 0 742 0
> org.apache.spark.sql.catalyst.analysis.CleanupAliases 73837147 37105855 1086 344
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantProject 73534618 31752937 1875 344
> org.apache.spark.sql.execution.datasources.v2.PushDownOperatorsToDataSource 70120541 0 285 0
> org.apache.spark.sql.catalyst.optimizer.FoldablePropagation 67941776 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions 62917712 22092402 1982 23
> org.apache.spark.sql.catalyst.optimizer.CombineFilters 61116313 41021442 1590 449
> org.apache.spark.sql.catalyst.optimizer.CollapseProject 60872313 30994661 1875 279
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases 58453489 12511798 1982 47
> org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions 58154315 0 750 0
> org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions 54678669 0 285 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences 53518211 7209138 1982 8
> org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates 45840637 29436271 285 23
> org.apache.spark.sql.catalyst.optimizer.CollapseRepartition 43321502 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic 42117785 0 742 0
> org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime 40843184 0 285 0
> org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion 39997563 5899863 1590 10
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations 39412748 22359409 1990 233
> org.apache.spark.sql.catalyst.optimizer.CombineUnions 38823264 1534424 1875 17
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes 38712372 7912192 1982 9
> org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF 38281659 0 742 0
> org.apache.spark.sql.catalyst.optimizer.DecimalAggregates 38277381 17245272 385 100
> org.apache.spark.sql.execution.datasources.ResolveSQLOnFile 37342019 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates 36958378 1207331 1982 46
> org.apache.spark.sql.catalyst.optimizer.CombineLimits 36794793 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.LimitPushDown 36378469 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance 34611065 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast 33734785 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.EliminateSorts 33731370 0 1590 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy 33251765 1395920 1982 4
> org.apache.spark.sql.catalyst.optimizer.EliminateSerialization 30890996 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.CollapseWindow 29512740 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin 29396498 1492235 300 7
> org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery 29301037 21706110 285 148
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy 23819074 0 1982 0
> org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals 23136089 10062248 788 4
> org.apache.spark.sql.execution.datasources.PreprocessTableInsertion 20886216 0 742 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot 20639329 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions 20293829 0 1990 0
> org.apache.spark.sql.catalyst.analysis.ResolveInlineTables 20255898 0 1982 0
> org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveBroadcastHints 20250460 0 750 0
> org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions 19990727 39271 8280 26
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate 19578333 0 1982 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 19414993 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts 19291402 0 285 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions 18790135 0 285 0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin 18535762 0 1982 0
> org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects 17835919 0 285 0
> org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation 15200130 1525030 288 3
> org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase 14490778 0 285 0
> org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases 14021504 12790020 285 215
> org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates 13439887 0 285 0
> org.apache.spark.sql.catalyst.analysis.EliminateBarriers 12336513 0 1086 0
> org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery 12082986 0 285 0
> org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences 10792280 0 742 0
> org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate 8978897 0 285 0
> org.apache.spark.sql.catalyst.analysis.EliminateUnions 8886439 0 788 0
> org.apache.spark.sql.catalyst.analysis.AliasViewChild 8317231 0 742 0
> org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 7964788 184237 286 1
> org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution 7396593 0 788 0
> org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints 6986385 0 750 0
> org.apache.spark.sql.catalyst.analysis.EliminateView 6518436 0 285 0
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation 6452598 0 288 0
> org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions 5510866 0 286 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter 5393429 0 300 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateArrayOps 5296187 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateStructOps 5261249 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin 5152594 925260 300 1
> org.apache.spark.sql.catalyst.optimizer.CombineConcats 4916416 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters 4810314 0 285 0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateMapOps 4674195 0 1590 0
> org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate 4406136 727433 300 15
> org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate 4252456 0 285 0
> org.apache.spark.sql.catalyst.optimizer.EliminateDistinct 1920392 0 285 0
> org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder 1855658 0 285 0
>
> {noformat}
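To see which rules spend the most time on runs that leave the plan unchanged, the rows quoted above can be parsed and ranked. The following is only a minimal sketch, not part of Spark: it assumes the four numeric columns are total time, effective time, total runs and effective runs (in that order, times in nanoseconds), and the names RuleMetric and parse_row are made up for illustration. The three sample rows are copied from the dump above.

```python
from typing import NamedTuple


class RuleMetric(NamedTuple):
    """One row of the quoted dump (column order is an assumption)."""
    rule: str
    total_ns: int
    effective_ns: int
    total_runs: int
    effective_runs: int

    @property
    def ineffective_ns(self) -> int:
        # Time spent in runs that did not change the plan.
        return self.total_ns - self.effective_ns


def parse_row(line: str) -> RuleMetric:
    # Drop the e-mail quote marker, then split from the right so the
    # rule name stays in one piece.
    text = line.lstrip("> ").strip()
    rule, total_ns, effective_ns, total_runs, effective_runs = text.rsplit(None, 4)
    return RuleMetric(rule, int(total_ns), int(effective_ns),
                      int(total_runs), int(effective_runs))


SAMPLE_ROWS = [
    "> org.apache.spark.sql.catalyst.optimizer.FoldablePropagation 67941776 0 1590 0",
    "> org.apache.spark.sql.catalyst.optimizer.CombineFilters 61116313 41021442 1590 449",
    "> org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases 14021504 12790020 285 215",
]

if __name__ == "__main__":
    metrics = [parse_row(row) for row in SAMPLE_ROWS]
    # Rank rules by the time spent on runs that had no effect.
    for m in sorted(metrics, key=lambda m: m.ineffective_ns, reverse=True):
        print(m.rule, m.ineffective_ns, m.total_runs - m.effective_runs)
```

Sorting by the difference rather than by total time puts rules such as FoldablePropagation, which never changed the plan in this dump, ahead of rules whose time is mostly spent on effective runs.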