You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/10/23 21:56:01 UTC
[jira] [Resolved] (IMPALA-4130) Severe lock contention introduced
by Log4j causing severe slow down in planning and other phases
[ https://issues.apache.org/jira/browse/IMPALA-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-4130.
-----------------------------------
Resolution: Cannot Reproduce
> Severe lock contention introduced by Log4j causing severe slow down in planning and other phases
> ------------------------------------------------------------------------------------------------
>
> Key: IMPALA-4130
> URL: https://issues.apache.org/jira/browse/IMPALA-4130
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.5.0
> Reporter: Mostafa Mokhtar
> Priority: Critical
> Labels: performance, scalability
> Attachments: IMPALA-4130-profile.txt, slow_planning_profile_1.jfr.zip
>
>
> When running high concurrency workload I noticed that query planning is taking unproportionally longer than for single user.
> Upon profiling the Planner I noticed that lots of threads are contending on a lock in Log4j.
> By default Log4j synchronizes writes which via a lock which causes this contention.
> With default Impala configuration the planner and other components write lots of info/warn messages to the log which causes heavy contention under concurrency.
> By changing "Verbose Log Level" to 0 and "Logging Threshold" to fatal concurrent queries used in the test got a 50% speedup.
> Planner timeline with default settings
> {code}
> Planner Timeline: 51s007ms
> - Analysis finished: 6s020ms (6s020ms)
> - Equivalence classes computed: 6s224ms (203.960ms)
> - Single node plan created: 39s137ms (32s912ms)
> - Runtime filters computed: 39s138ms (1.798ms)
> - Distributed plan created: 50s659ms (11s520ms)
> - Lineage info computed: 50s660ms (1.157ms)
> - Planning finished: 51s007ms (346.339ms)
> {code}
> Planner timeline with logging disable
> {code}
> Planner Timeline: 161.667ms
> - Analysis finished: 26.854ms (26.854ms)
> - Equivalence classes computed: 27.760ms (906.310us)
> - Single node plan created: 145.224ms (117.463ms)
> - Runtime filters computed: 146.017ms (792.772us)
> - Distributed plan created: 148.244ms (2.227ms)
> - Lineage info computed: 149.043ms (798.857us)
> - Planning finished: 161.667ms (12.623ms)
> {code}
>
> |Stack Trace||Count||Duration|
> |org.apache.log4j.Category.callAppenders(LoggingEvent)|91,612|32,628,691,909,737|
> | org.apache.log4j.Category.forcedLog(String, Priority, Object, Throwable)|91,612|32,628,691,909,737|
> | org.apache.log4j.Category.log(String, Priority, Object, Throwable)|91,612|32,628,691,909,737|
> | org.slf4j.impl.Log4jLoggerAdapter.debug(String)|66,573|23,520,171,003,106|
> | com.cloudera.impala.planner.HdfsScanNode.computeStats(Analyzer)|16,545|5,762,818,131,400|
> | com.cloudera.impala.planner.DistributedPlanner.createHashJoinFragment(HashJoinNode, PlanFragment, PlanFragment, long, ArrayList)|15,689|5,615,955,980,208|
> | com.cloudera.impala.planner.JoinNode.computeStats(Analyzer)|12,864|4,716,775,262,309|
> | com.cloudera.impala.planner.HdfsScanNode.computeNumNodes(Analyzer, long)|5,528|1,888,005,994,277|
> | com.cloudera.impala.planner.SingleNodePlanner.createJoinPlan(Analyzer, TableRef, List, List)|5,212|1,818,139,839,257|
> | com.cloudera.impala.analysis.SelectStmt.analyze(Analyzer)|2,754|953,068,475,018|
> | com.cloudera.impala.analysis.SelectStmt.analyzeAggregation(Analyzer)|2,755|950,232,280,428|
> | com.cloudera.impala.analysis.AggregateInfo.create(ArrayList, ArrayList, TupleDescriptor, Analyzer)|2,745|945,034,393,921|
> | com.cloudera.impala.service.Frontend.createExecRequest(TQueryCtx, StringBuilder)|651|224,074,644,208|
> | com.cloudera.impala.planner.Planner.createPlan()|449|155,274,962,910|
> | com.cloudera.impala.planner.Planner.computeResourceReqs(List, boolean, TQueryExecRequest)|419|154,737,940,541|
> | com.cloudera.impala.planner.DistributedPlanner.createPlanFragments(PlanNode)|408|146,583,876,650|
> | com.cloudera.impala.service.Frontend.analyzeStmt(TQueryCtx)|312|101,646,742,787|
> | com.cloudera.impala.service.JniFrontend.createExecRequest(byte[])|211|76,250,810,597|
> | com.cloudera.impala.planner.UnionNode.computeStats(Analyzer)|31|11,571,668,595|
> | org.slf4j.impl.Log4jLoggerAdapter.warn(String)|23,955|8,755,149,728,646|
> | com.cloudera.impala.planner.PlanNode.multiplyCardinalities(long, long)|23,955|8,755,149,728,646|
> | org.slf4j.impl.Log4jLoggerAdapter.info(String, Object[])|1,084|353,371,177,985|
> Possible workaround
> https://logging.apache.org/log4j/2.x/manual/async.html
> JMC profile attached.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org