Posted to issues@ambari.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2016/11/15 20:19:59 UTC

[jira] [Commented] (AMBARI-16828) Support round-robin scheduling with failover for Sinks with distributed collector

    [ https://issues.apache.org/jira/browse/AMBARI-16828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668176#comment-15668176 ] 

Hudson commented on AMBARI-16828:
---------------------------------

FAILURE: Integrated in Jenkins build Ambari-branch-2.5 #328 (See [https://builds.apache.org/job/Ambari-branch-2.5/328/])
AMBARI-16828. Support round-robin scheduling with failover for Sinks (avijayan: [http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=954e61ee07416bc20b52bfda84dde5ea57876130])
* (edit) ambari-metrics/ambari-metrics-storm-sink/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsSink.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/HBaseTimelineMetricStore.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TimelineWebServices.java
* (edit) ambari-server/src/main/resources/stacks/HDP/2.0.6/hooks/before-START/templates/hadoop-metrics2.properties.j2
* (edit) ambari-logsearch/ambari-logsearch-portal/src/main/java/org/apache/ambari/logsearch/solr/metrics/SolrAmsClient.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/v2/TimelineMetricHostAggregator.java
* (delete) ambari-metrics/ambari-metrics-timelineservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/TimelineMetricHAControllerTest.java
* (edit) ambari-metrics/ambari-metrics-common/pom.xml
* (edit) ambari-metrics/ambari-metrics-storm-sink/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsReporter.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/AggregationTaskRunner.java
* (delete) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/TimelineMetricHAController.java
* (edit) ambari-server/src/main/resources/stacks/HDP/2.0.6/hooks/before-START/scripts/params.py
* (edit) ambari-metrics/ambari-metrics-kafka-sink/src/test/java/org/apache/hadoop/metrics2/sink/kafka/KafkaTimelineMetricsReporterTest.java
* (add) ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/availability/MetricSinkWriteShardStrategy.java
* (add) ambari-metrics/ambari-metrics-common/src/test/java/org/apache/hadoop/metrics2/sink/timeline/availability/MetricCollectorHATest.java
* (add) ambari-metrics/ambari-metrics-timelineservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/MetricCollectorHAControllerTest.java
* (add) ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/MetricsSinkInitializationException.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/TimelineMetricHostAggregator.java
* (edit) ambari-server/src/main/resources/common-services/FLUME/1.4.0.2.0/package/scripts/params.py
* (edit) ambari-metrics/ambari-metrics-storm-sink-legacy/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsReporter.java
* (add) ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/availability/MetricSinkWriteShardHostnameHashingStrategy.java
* (edit) ambari-metrics/ambari-metrics-hadoop-sink/src/main/java/org/apache/hadoop/metrics2/sink/timeline/HadoopTimelineMetricsSink.java
* (edit) ambari-server/src/main/resources/common-services/STORM/0.9.1/package/scripts/params_linux.py
* (add) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/availability/MetricCollectorHAController.java
* (edit) ambari-metrics/ambari-metrics-common/src/test/java/org/apache/hadoop/metrics2/sink/timeline/cache/HandleConnectExceptionTest.java
* (edit) ambari-server/src/main/resources/common-services/STORM/0.9.1/package/templates/storm-metrics2.properties.j2
* (edit) ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/package/templates/hadoop-metrics2-hbase.properties-GANGLIA-MASTER.j2
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/TimelineMetricClusterAggregator.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/AbstractTimelineAggregator.java
* (add) ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/availability/MetricCollectorUnavailableException.java
* (edit) ambari-metrics/ambari-metrics-storm-sink-legacy/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsSink.java
* (edit) ambari-metrics/ambari-metrics-storm-sink/src/test/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsSinkTest.java
* (edit) ambari-server/src/main/resources/common-services/FLUME/1.4.0.2.0/package/templates/flume-metrics2.properties.j2
* (edit) ambari-metrics/ambari-metrics-flume-sink/src/main/java/org/apache/hadoop/metrics2/sink/flume/FlumeTimelineMetricsSink.java
* (edit) ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/AbstractTimelineMetricsSink.java
* (add) ambari-metrics/ambari-metrics-common/src/main/java/org/apache/hadoop/metrics2/sink/timeline/availability/MetricCollectorHAHelper.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/TimelineMetricClusterAggregatorSecond.java
* (edit) ambari-metrics/ambari-metrics-hadoop-sink/src/test/java/org/apache/hadoop/metrics2/sink/timeline/HadoopTimelineMetricsSinkTest.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/v2/TimelineMetricClusterAggregator.java
* (edit) ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/TimelineMetricAggregatorFactory.java
* (add) ambari-metrics/ambari-metrics-common/src/test/java/org/apache/hadoop/metrics2/sink/timeline/availability/ShardingStrategyTest.java
* (edit) ambari-metrics/ambari-metrics-kafka-sink/src/main/java/org/apache/hadoop/metrics2/sink/kafka/KafkaTimelineMetricsReporter.java


> Support round-robin scheduling with failover for Sinks with distributed collector
> ---------------------------------------------------------------------------------
>
>                 Key: AMBARI-16828
>                 URL: https://issues.apache.org/jira/browse/AMBARI-16828
>             Project: Ambari
>          Issue Type: Task
>          Components: ambari-metrics
>    Affects Versions: 2.4.1
>            Reporter: Siddharth Wagle
>            Assignee: Siddharth Wagle
>             Fix For: trunk
>
>         Attachments: AMBARI-16828.patch
>
>
> - The initial set of collectors is specified in the configuration files.
> - Thereafter, find available collectors by connecting to ZooKeeper.
> - Remember the available collectors and refresh this list only when a collector cannot be reached. Otherwise, check whether a new collector is available at a very low frequency, for example at a random interval of 10-12 minutes. Set a low client-side ZooKeeper timeout.
> - Round-robin the writes across the collectors, choosing the first one at random.
> - If a write times out, choose the next available collector and remember the number of failed attempts against the first one.
> - Set a configurable attempt count for a failed collector (default = 3), after which the failed collector is removed from the list of available collectors.
> - The next retry is triggered after a refresh from ZooKeeper succeeds.
> - If no collectors are available (all have failed), the ZooKeeper refresh interval should be chosen randomly between 1-2 minutes.
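
For illustration only, the behaviour described above could be sketched roughly as follows. This is not the code from the attached patch: the class name CollectorPool, its methods, and the send callback are hypothetical, the ZooKeeper lookup is left to the caller, and keeping failure counts until the next refresh is an assumption.

```python
import random


class CollectorPool:
    """Hypothetical sketch of round-robin writes with failover."""

    def __init__(self, collectors, max_attempts=3, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self.max_attempts = max_attempts  # default of 3 comes from the issue text
        self.available = list(collectors)
        self.failures = {}  # host -> timed-out writes since the last refresh
        self._next = self._random_start()

    def _random_start(self):
        # The first collector is chosen at random.
        return self._rng.randrange(len(self.available)) if self.available else 0

    def write(self, payload, send):
        """Try each available collector at most once, in round-robin order.

        send(host, payload) is a caller-supplied function (an assumption)
        that raises TimeoutError when the write times out.
        Returns the host that accepted the write, or None if none did.
        """
        for _ in range(len(self.available)):
            index = self._next % len(self.available)
            host = self.available[index]
            try:
                send(host, payload)
            except TimeoutError:
                self.failures[host] = self.failures.get(host, 0) + 1
                if self.failures[host] >= self.max_attempts:
                    # Drop the failed collector; its successor now sits at index.
                    self.available.pop(index)
                    self._next = index
                else:
                    self._next = index + 1
                continue
            self._next = index + 1
            return host
        return None

    def refresh(self, live_collectors):
        """Replace the list with the collectors reported by a successful lookup."""
        self.available = list(live_collectors)
        self.failures.clear()
        self._next = self._random_start()

    def refresh_interval_seconds(self):
        # 10-12 minutes normally, 1-2 minutes when no collector is usable.
        low, high = (10, 12) if self.available else (1, 2)
        return self._rng.uniform(low * 60, high * 60)
```

A sink would call write() for each batch and, when it returns None or when the refresh interval elapses, query ZooKeeper and pass the result to refresh().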



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)