You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/10/23 21:56:01 UTC
[jira] [Resolved] (IMPALA-4130) Severe lock contention introduced by Log4j causing severe slow down in planning and other phases

     [ https://issues.apache.org/jira/browse/IMPALA-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-4130.
-----------------------------------
    Resolution: Cannot Reproduce

> Severe lock contention introduced by Log4j causing severe slow down in planning and other phases
> ------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-4130
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4130
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.5.0
>            Reporter: Mostafa Mokhtar
>            Priority: Critical
>              Labels: performance, scalability
>         Attachments: IMPALA-4130-profile.txt, slow_planning_profile_1.jfr.zip
>
>
> When running high concurrency workload I noticed that query planning is taking unproportionally  longer than for single user.
> Upon profiling the Planner I noticed that lots of threads are contending on a lock in Log4j.
> By default Log4j synchronizes writes which via a lock which causes this contention.
> With default Impala configuration the planner and other components write lots of info/warn messages to the log which causes heavy contention under concurrency. 
> By changing "Verbose Log Level" to 0 and "Logging Threshold" to fatal concurrent queries used in the test got a 50% speedup.
> Planner timeline with default settings
> {code}
>     Planner Timeline: 51s007ms
>        - Analysis finished: 6s020ms (6s020ms)
>        - Equivalence classes computed: 6s224ms (203.960ms)
>        - Single node plan created: 39s137ms (32s912ms)
>        - Runtime filters computed: 39s138ms (1.798ms)
>        - Distributed plan created: 50s659ms (11s520ms)
>        - Lineage info computed: 50s660ms (1.157ms)
>        - Planning finished: 51s007ms (346.339ms)
> {code}
> Planner timeline with logging disable
> {code}
>     Planner Timeline: 161.667ms
>        - Analysis finished: 26.854ms (26.854ms)
>        - Equivalence classes computed: 27.760ms (906.310us)
>        - Single node plan created: 145.224ms (117.463ms)
>        - Runtime filters computed: 146.017ms (792.772us)
>        - Distributed plan created: 148.244ms (2.227ms)
>        - Lineage info computed: 149.043ms (798.857us)
>        - Planning finished: 161.667ms (12.623ms)
> {code}
>  
> |Stack Trace||Count||Duration|
> |org.apache.log4j.Category.callAppenders(LoggingEvent)|91,612|32,628,691,909,737|
> |   org.apache.log4j.Category.forcedLog(String, Priority, Object, Throwable)|91,612|32,628,691,909,737|
> |      org.apache.log4j.Category.log(String, Priority, Object, Throwable)|91,612|32,628,691,909,737|
> |         org.slf4j.impl.Log4jLoggerAdapter.debug(String)|66,573|23,520,171,003,106|
> |            com.cloudera.impala.planner.HdfsScanNode.computeStats(Analyzer)|16,545|5,762,818,131,400|
> |            com.cloudera.impala.planner.DistributedPlanner.createHashJoinFragment(HashJoinNode, PlanFragment, PlanFragment, long, ArrayList)|15,689|5,615,955,980,208|
> |            com.cloudera.impala.planner.JoinNode.computeStats(Analyzer)|12,864|4,716,775,262,309|
> |            com.cloudera.impala.planner.HdfsScanNode.computeNumNodes(Analyzer, long)|5,528|1,888,005,994,277|
> |            com.cloudera.impala.planner.SingleNodePlanner.createJoinPlan(Analyzer, TableRef, List, List)|5,212|1,818,139,839,257|
> |            com.cloudera.impala.analysis.SelectStmt.analyze(Analyzer)|2,754|953,068,475,018|
> |            com.cloudera.impala.analysis.SelectStmt.analyzeAggregation(Analyzer)|2,755|950,232,280,428|
> |            com.cloudera.impala.analysis.AggregateInfo.create(ArrayList, ArrayList, TupleDescriptor, Analyzer)|2,745|945,034,393,921|
> |            com.cloudera.impala.service.Frontend.createExecRequest(TQueryCtx, StringBuilder)|651|224,074,644,208|
> |            com.cloudera.impala.planner.Planner.createPlan()|449|155,274,962,910|
> |            com.cloudera.impala.planner.Planner.computeResourceReqs(List, boolean, TQueryExecRequest)|419|154,737,940,541|
> |            com.cloudera.impala.planner.DistributedPlanner.createPlanFragments(PlanNode)|408|146,583,876,650|
> |            com.cloudera.impala.service.Frontend.analyzeStmt(TQueryCtx)|312|101,646,742,787|
> |            com.cloudera.impala.service.JniFrontend.createExecRequest(byte[])|211|76,250,810,597|
> |            com.cloudera.impala.planner.UnionNode.computeStats(Analyzer)|31|11,571,668,595|
> |         org.slf4j.impl.Log4jLoggerAdapter.warn(String)|23,955|8,755,149,728,646|
> |            com.cloudera.impala.planner.PlanNode.multiplyCardinalities(long, long)|23,955|8,755,149,728,646|
> |         org.slf4j.impl.Log4jLoggerAdapter.info(String, Object[])|1,084|353,371,177,985|
> Possible workaround 
> https://logging.apache.org/log4j/2.x/manual/async.html
> JMC profile attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org