Posted to reviews@ambari.apache.org by Aravindan Vijayan <av...@hortonworks.com> on 2017/03/02 19:38:57 UTC
Review Request 57251: AMBARI-20276 : Perf - AMS scale test for 3000 node cluster
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57251/
-----------------------------------------------------------
Review request for Ambari, Dmytro Sen, Sumit Mohanty, and Sid Wagle.
Bugs: AMBARI-20276
https://issues.apache.org/jira/browse/AMBARI-20276
Repository: ambari
Description
-------
This Jira tracks the AMS load simulation testing effort. The end goal is to capture the tuning required to get AMS and Grafana working under a simulated metrics load from a 3K node cluster.
INFERENCES
AMS has performance bottlenecks due to the 5 min host aggregator and the 2 min cluster aggregator. As currently implemented, the aggregators are monolithic and cannot keep up with a huge amount of data. With 500 whitelisted metrics, AMS stays up for more than 1 day without async processes getting queued up.
PATCH CONTENTS
Fixed issues in AMS load simulator.
Added HIVE metrics to the simulated data and to the metrics list that is used for calculating split points.
Added app-based whitelisting for AMS.
FUTURE WORK PLANNED
Metric schema optimization.
Optimization of the host minute aggregator and the cluster second aggregator.
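For illustration only, the idea behind the app-based whitelisting listed under PATCH CONTENTS can be sketched as below. This is not the TimelineMetricsFilter code from the patch; the class name, the method name, the sample metric names, and the rule that a whitelisted app admits all of its metrics are assumptions made for the sketch.

```python
# Illustrative sketch only; not the TimelineMetricsFilter code from this patch.
# MetricsFilter, accept() and the sample metric names are hypothetical.


class MetricsFilter:
    """Accept a metric if its name or its app ID is whitelisted."""

    def __init__(self, metric_whitelist=(), app_whitelist=()):
        self.metric_whitelist = set(metric_whitelist)
        # App IDs are compared case-insensitively in this sketch.
        self.app_whitelist = {app.lower() for app in app_whitelist}

    def accept(self, metric_name, app_id):
        # With no whitelist configured at all, nothing is filtered out.
        if not self.metric_whitelist and not self.app_whitelist:
            return True
        # Assumption: a whitelisted app lets all of its metrics through.
        if app_id.lower() in self.app_whitelist:
            return True
        return metric_name in self.metric_whitelist


metrics_filter = MetricsFilter(
    metric_whitelist={"cpu_user", "mem_free"},
    app_whitelist={"hiveserver2"},
)

incoming = [
    ("cpu_user", "HOST"),
    ("disk_free", "HOST"),
    ("any.hive.metric", "hiveserver2"),
]
kept = [(name, app) for name, app in incoming if metrics_filter.accept(name, app)]
print(kept)
```

In this sketch the per-metric whitelist bounds the number of stored series, while the app whitelist lets a whole component through without listing each of its metrics.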
Diffs
-----
ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/loadsimulator/LoadRunner.java 203a88bc
ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/loadsimulator/MetricsLoadSimulator.java 09db9b5
ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/loadsimulator/data/AppID.java a130171
ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/TimelineMetricConfiguration.java ab1716a
ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/TimelineMetricsFilter.java 0fe979e
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/AMS-HBASE.dat 63ac9f3
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/DATANODE.dat e157630
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/FLUME_HANDLER.dat bd5852f
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/HIVEMETASTORE.dat PRE-CREATION
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/HIVESERVER2.dat PRE-CREATION
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/HOST.dat 9295692
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/NAMENODE.dat 6e98a9c
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/NODEMANAGER.dat 239b3d4
ambari-metrics/ambari-metrics-timelineservice/src/main/resources/metrics_def/RESOURCEMANAGER.dat ec698db
ambari-metrics/ambari-metrics-timelineservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/loadsimulator/jmetertest/jmetertest/AMSJMeterLoadTest.java c34ac20
ambari-metrics/ambari-metrics-timelineservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/TimelineMetricsFilterTest.java 73c66fe
ambari-metrics/ambari-metrics-timelineservice/src/test/resources/loadsimulator/README 39e5365
ambari-metrics/ambari-metrics-timelineservice/src/test/resources/loadsimulator/ams-jmeter.properties 2c44d89
ambari-metrics/ambari-metrics-timelineservice/src/test/resources/test_data/metric_whitelist.dat 9f5e25c
ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/files/service-metrics/FLUME.txt b3bfec3
ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/files/service-metrics/HDFS.txt 84576e9
ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/files/service-metrics/HIVE.txt PRE-CREATION
ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/files/service-metrics/HOST.txt 4b759c6
ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/files/service-metrics/KAFKA.txt 1e2017c
ambari-server/src/test/python/stacks/2.0.6/common/test_stack_advisor.py 157a582
ambari-server/src/test/python/stacks/2.2/common/test_stack_advisor.py fa97604
Diff: https://reviews.apache.org/r/57251/diff/1/
Testing
-------
Load simulation was run manually.
App whitelisting is covered by unit tests.
Thanks,
Aravindan Vijayan
Re: Review Request 57251: AMBARI-20276 : Perf - AMS scale test for 3000 node cluster
Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57251/#review167751
-----------------------------------------------------------
Ship it!
- Sid Wagle
Re: Review Request 57251: AMBARI-20276 : Perf - AMS scale test for 3000 node cluster
Posted by Aravindan Vijayan <av...@hortonworks.com>.
> On March 2, 2017, 8:18 p.m., Sid Wagle wrote:
> > Can we consolidate the simulator files and the ones used for split point calculation? The stack should be the only source of truth.
Created AMBARI-20283 to track the effort for the consolidation.
- Aravindan
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57251/#review167726
-----------------------------------------------------------
Re: Review Request 57251: AMBARI-20276 : Perf - AMS scale test for 3000 node cluster
Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57251/#review167726
-----------------------------------------------------------
Can we consolidate the simulator files and the ones used for split point calculation? The stack should be the only source of truth.
- Sid Wagle