Posted to commits@bigtop.apache.org by kw...@apache.org on 2016/12/02 22:47:05 UTC
bigtop git commit: BIGTOP-2561: add juju bundle for hadoop-spark (closes #166)
Repository: bigtop
Updated Branches:
refs/heads/master 56deaa7c0 -> f4d023b4c
BIGTOP-2561: add juju bundle for hadoop-spark (closes #166)
Signed-off-by: Kevin W Monroe <ke...@canonical.com>
Project: http://git-wip-us.apache.org/repos/asf/bigtop/repo
Commit: http://git-wip-us.apache.org/repos/asf/bigtop/commit/f4d023b4
Tree: http://git-wip-us.apache.org/repos/asf/bigtop/tree/f4d023b4
Diff: http://git-wip-us.apache.org/repos/asf/bigtop/diff/f4d023b4
Branch: refs/heads/master
Commit: f4d023b4c505efbb3c5b52cb0aa7ceb9dc20cc60
Parents: 56deaa7
Author: Kevin W Monroe <ke...@canonical.com>
Authored: Wed Sep 21 20:46:24 2016 -0500
Committer: Kevin W Monroe <ke...@canonical.com>
Committed: Fri Dec 2 16:46:24 2016 -0600
----------------------------------------------------------------------
bigtop-deploy/juju/hadoop-spark/.gitignore | 2 +
bigtop-deploy/juju/hadoop-spark/README.md | 356 +++++++++++++++++++
bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml | 138 +++++++
.../juju/hadoop-spark/bundle-local.yaml | 138 +++++++
bigtop-deploy/juju/hadoop-spark/bundle.yaml | 138 +++++++
bigtop-deploy/juju/hadoop-spark/copyright | 16 +
.../juju/hadoop-spark/tests/01-bundle.py | 137 +++++++
.../juju/hadoop-spark/tests/tests.yaml | 7 +
8 files changed, 932 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/.gitignore
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/.gitignore b/bigtop-deploy/juju/hadoop-spark/.gitignore
new file mode 100644
index 0000000..a295864
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/.gitignore
@@ -0,0 +1,2 @@
+*.pyc
+__pycache__
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/README.md
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/README.md b/bigtop-deploy/juju/hadoop-spark/README.md
new file mode 100644
index 0000000..b2b936b
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/README.md
@@ -0,0 +1,356 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+# Overview
+
+The Apache Hadoop software library is a framework that allows for the
+distributed processing of large data sets across clusters of computers
+using a simple programming model.
+
+Hadoop is designed to scale from a few servers to thousands of machines,
+each offering local computation and storage. Rather than rely on hardware
+to deliver high availability, Hadoop can detect and handle failures at the
+application layer. This provides a highly available service on top of a cluster
+of machines, each of which may be prone to failure.
+
+Spark is a fast and general engine for large-scale data processing.
+
+This bundle provides a complete deployment of Hadoop and Spark components from
+[Apache Bigtop][] that performs distributed data processing at scale. Ganglia
+and rsyslog applications are also provided to monitor cluster health and syslog
+activity.
+
+[Apache Bigtop]: http://bigtop.apache.org/
+
+## Bundle Composition
+
+The applications that comprise this bundle are spread across 9 units as
+follows:
+
+ * NameNode (HDFS)
+ * ResourceManager (YARN)
+ * Colocated on the NameNode unit
+ * Slave (DataNode and NodeManager)
+ * 3 separate units
+ * Spark
+ * Plugin (Facilitates communication with the Hadoop cluster)
+ * Colocated on the Spark unit
+ * Client (Hadoop endpoint)
+ * Colocated on the Spark unit
+ * Zookeeper
+ * 3 separate units
+ * Ganglia (Web interface for monitoring cluster metrics)
+ * Rsyslog (Aggregate cluster syslog events in a single location)
+ * Colocated on the Ganglia unit
+
+Deploying this bundle results in a fully configured Apache Bigtop
+cluster on any supported cloud, which can be scaled to meet workload
+demands.
+
+
+# Deploying
+
+A working Juju installation is assumed to be present. If Juju is not yet set
+up, please follow the [getting-started][] instructions prior to deploying this
+bundle.
+
+> **Note**: This bundle requires hardware resources that may exceed the limits
+of free-tier or trial accounts on some clouds. To deploy to these
+environments, modify a local copy of [bundle.yaml][] to set
+`services: 'X': num_units: 1` and `machines: 'X': constraints: mem=3G` as
+needed to satisfy account limits.
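
As a sketch, a trimmed-down local copy of `bundle.yaml` might reduce the slave
unit count and machine memory like this (the application and machine names
match this bundle; the exact values shown are only examples and depend on the
account limits in question):

```yaml
# Hypothetical overrides for a resource-limited account: fewer slave
# units and a smaller machine than the bundle defaults.
services:
  slave:
    num_units: 1           # bundle default is 3
machines:
  "1":
    constraints: "mem=3G root-disk=32G"   # bundle default is mem=7G
```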
+
+Deploy this bundle from the Juju charm store with the `juju deploy` command:
+
+ juju deploy hadoop-spark
+
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, use [juju-quickstart][] with the following syntax: `juju quickstart
+hadoop-spark`.
+
+Alternatively, deploy a locally modified `bundle.yaml` with:
+
+ juju deploy /path/to/bundle.yaml
+
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, use [juju-quickstart][] with the following syntax: `juju quickstart
+/path/to/bundle.yaml`.
+
+The charms in this bundle can also be built from their source layers in the
+[Bigtop charm repository][]. See the [Bigtop charm README][] for instructions
+on building and deploying these charms locally.
+
+## Network-Restricted Environments
+Charms can be deployed in environments with limited network access. To deploy
+in this environment, configure a Juju model with appropriate proxy and/or
+mirror options. See [Configuring Models][] for more information.
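
For example, proxy and mirror settings can be applied to the current model
before deploying (the hosts below are placeholders for illustration; the keys
are standard Juju model settings):

```shell
# Point the model at an internal proxy and apt mirror before deploying.
# Replace the example hosts with the addresses used in your environment.
juju model-config http-proxy=http://squid.internal:3128 \
                  https-proxy=http://squid.internal:3128 \
                  apt-mirror=http://archive.internal/ubuntu
```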
+
+[getting-started]: https://jujucharms.com/docs/stable/getting-started
+[bundle.yaml]: https://github.com/apache/bigtop/blob/master/bigtop-deploy/juju/hadoop-spark/bundle.yaml
+[juju-quickstart]: https://launchpad.net/juju-quickstart
+[Bigtop charm repository]: https://github.com/apache/bigtop/tree/master/bigtop-packages/src/charm
+[Bigtop charm README]: https://github.com/apache/bigtop/blob/master/bigtop-packages/src/charm/README.md
+[Configuring Models]: https://jujucharms.com/docs/stable/models-config
+
+
+# Verifying
+
+## Status
+The applications that make up this bundle provide status messages to indicate
+when they are ready:
+
+ juju status
+
+This is particularly useful when combined with `watch` to track the ongoing
+progress of the deployment:
+
+ watch -n 2 juju status
+
+The message for each unit will provide information about that unit's state.
+Once they all indicate that they are ready, perform application smoke tests
+to verify that the bundle is working as expected.
+
+## Smoke Test
+The charms for each core component (namenode, resourcemanager, slave, spark,
+and zookeeper) provide a `smoke-test` action that can be used to verify the
+application is functioning as expected. Note that the 'slave' component runs
+extensive tests provided by Apache Bigtop and may take up to 30 minutes to
+complete. Run the smoke-test actions as follows:
+
+ juju run-action namenode/0 smoke-test
+ juju run-action resourcemanager/0 smoke-test
+ juju run-action slave/0 smoke-test
+ juju run-action spark/0 smoke-test
+ juju run-action zookeeper/0 smoke-test
+
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, the syntax is `juju action do <application>/0 smoke-test`.
+
+Watch the progress of the smoke test actions with:
+
+ watch -n 2 juju show-action-status
+
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, the syntax is `juju action status`.
+
+Eventually, all of the actions should settle to `status: completed`. If
+any report `status: failed`, that application is not working as expected. Get
+more information about a specific smoke test with:
+
+ juju show-action-output <action-id>
+
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, the syntax is `juju action fetch <action-id>`.
+
+## Utilities
+Applications in this bundle include command line and web utilities that
+can be used to verify information about the cluster.
+
+From the command line, show the HDFS dfsadmin report and view the current list
+of YARN NodeManager units with the following:
+
+ juju run --application namenode "su hdfs -c 'hdfs dfsadmin -report'"
+ juju run --application resourcemanager "su yarn -c 'yarn node -list'"
+
+Show the list of Zookeeper nodes with the following:
+
+ juju run --unit zookeeper/0 'echo "ls /" | /usr/lib/zookeeper/bin/zkCli.sh'
+
+To access the HDFS web console, find the `PUBLIC-ADDRESS` of the namenode
+application and expose it:
+
+ juju status namenode
+ juju expose namenode
+
+The web interface will be available at the following URL:
+
+ http://NAMENODE_PUBLIC_IP:50070
+
+Similarly, to access the Resource Manager web consoles, find the
+`PUBLIC-ADDRESS` of the resourcemanager application and expose it:
+
+ juju status resourcemanager
+ juju expose resourcemanager
+
+The YARN and Job History web interfaces will be available at the following URLs:
+
+ http://RESOURCEMANAGER_PUBLIC_IP:8088
+ http://RESOURCEMANAGER_PUBLIC_IP:19888
+
+Finally, to access the Spark web console, find the `PUBLIC-ADDRESS` of the
+spark application and expose it:
+
+ juju status spark
+ juju expose spark
+
+The web interface will be available at the following URL:
+
+ http://SPARK_PUBLIC_IP:8080
+
+
+# Monitoring
+
+This bundle includes Ganglia for system-level monitoring of the namenode,
+resourcemanager, slave, spark, and zookeeper units. Metrics are sent to a
+centralized ganglia unit for easy viewing in a browser. To view the ganglia web
+interface, find the `PUBLIC-ADDRESS` of the Ganglia application and expose it:
+
+ juju status ganglia
+ juju expose ganglia
+
+The web interface will be available at:
+
+ http://GANGLIA_PUBLIC_IP/ganglia
+
+
+# Logging
+
+This bundle includes rsyslog to collect syslog data from the namenode,
+resourcemanager, slave, spark, and zookeeper units. These logs are sent to a
+centralized rsyslog unit for easy syslog analysis. One way to view this log
+data is to cat the syslog file on the rsyslog unit:
+
+ juju run --unit rsyslog/0 'sudo cat /var/log/syslog'
+
+Logs may also be forwarded to an external rsyslog processing service. See
+the *Forwarding logs to a system outside of the Juju environment* section of
+the [rsyslog README](https://jujucharms.com/rsyslog/) for more information.
+
+
+# Benchmarking
+
+The `resourcemanager` charm in this bundle provides several benchmarks to gauge
+the performance of the Hadoop cluster. Each benchmark is an action that can be
+run with `juju run-action`:
+
+ $ juju actions resourcemanager
+ ACTION DESCRIPTION
+ mrbench Mapreduce benchmark for small jobs
+ nnbench Load test the NameNode hardware and configuration
+ smoke-test Run an Apache Bigtop smoke test.
+ teragen Generate data with teragen
+ terasort Runs teragen to generate sample data, and then runs terasort to sort that data
+ testdfsio DFS IO Testing
+
+ $ juju run-action resourcemanager/0 nnbench
+ Action queued with id: 55887b40-116c-4020-8b35-1e28a54cc622
+
+ $ juju show-action-output 55887b40-116c-4020-8b35-1e28a54cc622
+ results:
+ meta:
+ composite:
+ direction: asc
+ units: secs
+ value: "128"
+ start: 2016-02-04T14:55:39Z
+ stop: 2016-02-04T14:57:47Z
+ results:
+ raw: '{"BAD_ID": "0", "FILE: Number of read operations": "0", "Reduce input groups":
+ "8", "Reduce input records": "95", "Map output bytes": "1823", "Map input records":
+ "12", "Combine input records": "0", "HDFS: Number of bytes read": "18635", "FILE:
+ Number of bytes written": "32999982", "HDFS: Number of write operations": "330",
+ "Combine output records": "0", "Total committed heap usage (bytes)": "3144749056",
+ "Bytes Written": "164", "WRONG_LENGTH": "0", "Failed Shuffles": "0", "FILE:
+ Number of bytes read": "27879457", "WRONG_MAP": "0", "Spilled Records": "190",
+ "Merged Map outputs": "72", "HDFS: Number of large read operations": "0", "Reduce
+ shuffle bytes": "2445", "FILE: Number of large read operations": "0", "Map output
+ materialized bytes": "2445", "IO_ERROR": "0", "CONNECTION": "0", "HDFS: Number
+ of read operations": "567", "Map output records": "95", "Reduce output records":
+ "8", "WRONG_REDUCE": "0", "HDFS: Number of bytes written": "27412", "GC time
+ elapsed (ms)": "603", "Input split bytes": "1610", "Shuffled Maps ": "72", "FILE:
+ Number of write operations": "0", "Bytes Read": "1490"}'
+ status: completed
+ timing:
+ completed: 2016-02-04 14:57:48 +0000 UTC
+ enqueued: 2016-02-04 14:55:14 +0000 UTC
+ started: 2016-02-04 14:55:27 +0000 UTC
+
+The `spark` charm in this bundle also provides several benchmarks to gauge
+the performance of the Spark cluster. Each benchmark is an action that can be
+run with `juju run-action`:
+
+ $ juju actions spark | grep Bench
+ connectedcomponent Run the Spark Bench ConnectedComponent benchmark.
+ decisiontree Run the Spark Bench DecisionTree benchmark.
+ kmeans Run the Spark Bench KMeans benchmark.
+ linearregression Run the Spark Bench LinearRegression benchmark.
+ logisticregression Run the Spark Bench LogisticRegression benchmark.
+ matrixfactorization Run the Spark Bench MatrixFactorization benchmark.
+ pagerank Run the Spark Bench PageRank benchmark.
+ pca Run the Spark Bench PCA benchmark.
+ pregeloperation Run the Spark Bench PregelOperation benchmark.
+ shortestpaths Run the Spark Bench ShortestPaths benchmark.
+ sql Run the Spark Bench SQL benchmark.
+ stronglyconnectedcomponent Run the Spark Bench StronglyConnectedComponent benchmark.
+ svdplusplus Run the Spark Bench SVDPlusPlus benchmark.
+ svm Run the Spark Bench SVM benchmark.
+
+ $ juju run-action spark/0 svdplusplus
+ Action queued with id: 339cec1f-e903-4ee7-85ca-876fb0c3d28e
+
+ $ juju show-action-output 339cec1f-e903-4ee7-85ca-876fb0c3d28e
+ results:
+ meta:
+ composite:
+ direction: asc
+ units: secs
+ value: "200.754000"
+ raw: |
+ SVDPlusPlus,2016-11-02-03:08:26,200.754000,85.974071,.428255,0,SVDPlusPlus-MLlibConfig,,,,,10,,,50000,4.0,1.3,
+ start: 2016-11-02T03:08:26Z
+ stop: 2016-11-02T03:11:47Z
+ results:
+ duration:
+ direction: asc
+ units: secs
+ value: "200.754000"
+ throughput:
+ direction: desc
+ units: MB/sec
+ value: ".428255"
+ status: completed
+ timing:
+ completed: 2016-11-02 03:11:48 +0000 UTC
+ enqueued: 2016-11-02 03:08:21 +0000 UTC
+ started: 2016-11-02 03:08:26 +0000 UTC
+
+
+# Scaling
+
+By default, three Hadoop slave and three zookeeper units are deployed. Scaling
+these applications is as simple as adding more units. To add one unit:
+
+ juju add-unit slave
+ juju add-unit zookeeper
+
+Multiple units may be added at once. For example, add four more slave units:
+
+ juju add-unit -n4 slave
+
+
+# Contact Information
+
+- <bi...@lists.ubuntu.com>
+
+
+# Resources
+
+- [Apache Bigtop](http://bigtop.apache.org/) home page
+- [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html)
+- [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html)
+- [Juju Bigtop charms](https://jujucharms.com/q/apache/bigtop)
+- [Juju mailing list](https://lists.ubuntu.com/mailman/listinfo/juju)
+- [Juju community](https://jujucharms.com/community)
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml b/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml
new file mode 100644
index 0000000..35623fd
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/bundle-dev.yaml
@@ -0,0 +1,138 @@
+services:
+ namenode:
+ charm: "cs:~bigdata-dev/xenial/hadoop-namenode"
+ num_units: 1
+ annotations:
+ gui-x: "500"
+ gui-y: "800"
+ to:
+ - "0"
+ resourcemanager:
+ charm: "cs:~bigdata-dev/xenial/hadoop-resourcemanager"
+ num_units: 1
+ annotations:
+ gui-x: "500"
+ gui-y: "0"
+ to:
+ - "0"
+ slave:
+ charm: "cs:~bigdata-dev/xenial/hadoop-slave"
+ num_units: 3
+ annotations:
+ gui-x: "0"
+ gui-y: "400"
+ to:
+ - "1"
+ - "2"
+ - "3"
+ plugin:
+ charm: "cs:~bigdata-dev/xenial/hadoop-plugin"
+ annotations:
+ gui-x: "1000"
+ gui-y: "400"
+ client:
+ charm: "cs:xenial/hadoop-client-2"
+ num_units: 1
+ annotations:
+ gui-x: "1250"
+ gui-y: "400"
+ to:
+ - "4"
+ spark:
+ charm: "cs:~bigdata-dev/xenial/spark"
+ num_units: 1
+ options:
+ spark_execution_mode: "yarn-client"
+ annotations:
+ gui-x: "1000"
+ gui-y: "0"
+ to:
+ - "4"
+ zookeeper:
+ charm: "cs:xenial/zookeeper-10"
+ num_units: 3
+ annotations:
+ gui-x: "500"
+ gui-y: "400"
+ to:
+ - "5"
+ - "6"
+ - "7"
+ ganglia:
+ charm: "cs:~bigdata-dev/xenial/ganglia-5"
+ num_units: 1
+ annotations:
+ gui-x: "0"
+ gui-y: "800"
+ to:
+ - "8"
+ ganglia-node:
+ charm: "cs:~bigdata-dev/xenial/ganglia-node-6"
+ annotations:
+ gui-x: "250"
+ gui-y: "400"
+ rsyslog:
+ charm: "cs:~bigdata-dev/xenial/rsyslog-6"
+ num_units: 1
+ annotations:
+ gui-x: "1000"
+ gui-y: "800"
+ to:
+ - "8"
+ rsyslog-forwarder-ha:
+ charm: "cs:~bigdata-dev/xenial/rsyslog-forwarder-ha-7"
+ annotations:
+ gui-x: "750"
+ gui-y: "400"
+series: xenial
+relations:
+ - [resourcemanager, namenode]
+ - [namenode, slave]
+ - [resourcemanager, slave]
+ - [plugin, namenode]
+ - [plugin, resourcemanager]
+ - [client, plugin]
+ - [spark, plugin]
+ - [spark, zookeeper]
+ - ["ganglia-node:juju-info", "client:juju-info"]
+ - ["ganglia-node:juju-info", "namenode:juju-info"]
+ - ["ganglia-node:juju-info", "resourcemanager:juju-info"]
+ - ["ganglia-node:juju-info", "slave:juju-info"]
+ - ["ganglia-node:juju-info", "spark:juju-info"]
+ - ["ganglia-node:juju-info", "zookeeper:juju-info"]
+ - ["ganglia:node", "ganglia-node:node"]
+ - ["rsyslog-forwarder-ha:juju-info", "client:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "namenode:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "resourcemanager:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "slave:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "spark:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "zookeeper:juju-info"]
+ - ["rsyslog:aggregator", "rsyslog-forwarder-ha:syslog"]
+machines:
+ "0":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "1":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "2":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "3":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "4":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "5":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "6":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "7":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "8":
+ constraints: "mem=3G"
+ series: "xenial"
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml b/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml
new file mode 100644
index 0000000..160683a
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/bundle-local.yaml
@@ -0,0 +1,138 @@
+services:
+ namenode:
+ charm: "/home/ubuntu/charms/xenial/hadoop-namenode"
+ num_units: 1
+ annotations:
+ gui-x: "500"
+ gui-y: "800"
+ to:
+ - "0"
+ resourcemanager:
+ charm: "/home/ubuntu/charms/xenial/hadoop-resourcemanager"
+ num_units: 1
+ annotations:
+ gui-x: "500"
+ gui-y: "0"
+ to:
+ - "0"
+ slave:
+ charm: "/home/ubuntu/charms/xenial/hadoop-slave"
+ num_units: 3
+ annotations:
+ gui-x: "0"
+ gui-y: "400"
+ to:
+ - "1"
+ - "2"
+ - "3"
+ plugin:
+ charm: "/home/ubuntu/charms/xenial/hadoop-plugin"
+ annotations:
+ gui-x: "1000"
+ gui-y: "400"
+ client:
+ charm: "cs:xenial/hadoop-client-2"
+ num_units: 1
+ annotations:
+ gui-x: "1250"
+ gui-y: "400"
+ to:
+ - "4"
+ spark:
+ charm: "/home/ubuntu/charms/xenial/spark"
+ num_units: 1
+ options:
+ spark_execution_mode: "yarn-client"
+ annotations:
+ gui-x: "1000"
+ gui-y: "0"
+ to:
+ - "4"
+ zookeeper:
+ charm: "cs:xenial/zookeeper-10"
+ num_units: 3
+ annotations:
+ gui-x: "500"
+ gui-y: "400"
+ to:
+ - "5"
+ - "6"
+ - "7"
+ ganglia:
+ charm: "cs:~bigdata-dev/xenial/ganglia-5"
+ num_units: 1
+ annotations:
+ gui-x: "0"
+ gui-y: "800"
+ to:
+ - "8"
+ ganglia-node:
+ charm: "cs:~bigdata-dev/xenial/ganglia-node-6"
+ annotations:
+ gui-x: "250"
+ gui-y: "400"
+ rsyslog:
+ charm: "cs:~bigdata-dev/xenial/rsyslog-6"
+ num_units: 1
+ annotations:
+ gui-x: "1000"
+ gui-y: "800"
+ to:
+ - "8"
+ rsyslog-forwarder-ha:
+ charm: "cs:~bigdata-dev/xenial/rsyslog-forwarder-ha-7"
+ annotations:
+ gui-x: "750"
+ gui-y: "400"
+series: xenial
+relations:
+ - [resourcemanager, namenode]
+ - [namenode, slave]
+ - [resourcemanager, slave]
+ - [plugin, namenode]
+ - [plugin, resourcemanager]
+ - [client, plugin]
+ - [spark, plugin]
+ - [spark, zookeeper]
+ - ["ganglia-node:juju-info", "client:juju-info"]
+ - ["ganglia-node:juju-info", "namenode:juju-info"]
+ - ["ganglia-node:juju-info", "resourcemanager:juju-info"]
+ - ["ganglia-node:juju-info", "slave:juju-info"]
+ - ["ganglia-node:juju-info", "spark:juju-info"]
+ - ["ganglia-node:juju-info", "zookeeper:juju-info"]
+ - ["ganglia:node", "ganglia-node:node"]
+ - ["rsyslog-forwarder-ha:juju-info", "client:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "namenode:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "resourcemanager:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "slave:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "spark:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "zookeeper:juju-info"]
+ - ["rsyslog:aggregator", "rsyslog-forwarder-ha:syslog"]
+machines:
+ "0":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "1":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "2":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "3":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "4":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "5":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "6":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "7":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "8":
+ constraints: "mem=3G"
+ series: "xenial"
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/bundle.yaml
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/bundle.yaml b/bigtop-deploy/juju/hadoop-spark/bundle.yaml
new file mode 100644
index 0000000..67b9bb7
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/bundle.yaml
@@ -0,0 +1,138 @@
+services:
+ namenode:
+ charm: "cs:xenial/hadoop-namenode-6"
+ num_units: 1
+ annotations:
+ gui-x: "500"
+ gui-y: "800"
+ to:
+ - "0"
+ resourcemanager:
+ charm: "cs:xenial/hadoop-resourcemanager-6"
+ num_units: 1
+ annotations:
+ gui-x: "500"
+ gui-y: "0"
+ to:
+ - "0"
+ slave:
+ charm: "cs:xenial/hadoop-slave-6"
+ num_units: 3
+ annotations:
+ gui-x: "0"
+ gui-y: "400"
+ to:
+ - "1"
+ - "2"
+ - "3"
+ plugin:
+ charm: "cs:xenial/hadoop-plugin-6"
+ annotations:
+ gui-x: "1000"
+ gui-y: "400"
+ client:
+ charm: "cs:xenial/hadoop-client-2"
+ num_units: 1
+ annotations:
+ gui-x: "1250"
+ gui-y: "400"
+ to:
+ - "4"
+ spark:
+ charm: "cs:xenial/spark-15"
+ num_units: 1
+ options:
+ spark_execution_mode: "yarn-client"
+ annotations:
+ gui-x: "1000"
+ gui-y: "0"
+ to:
+ - "4"
+ zookeeper:
+ charm: "cs:xenial/zookeeper-10"
+ num_units: 3
+ annotations:
+ gui-x: "500"
+ gui-y: "400"
+ to:
+ - "5"
+ - "6"
+ - "7"
+ ganglia:
+ charm: "cs:~bigdata-dev/xenial/ganglia-5"
+ num_units: 1
+ annotations:
+ gui-x: "0"
+ gui-y: "800"
+ to:
+ - "8"
+ ganglia-node:
+ charm: "cs:~bigdata-dev/xenial/ganglia-node-6"
+ annotations:
+ gui-x: "250"
+ gui-y: "400"
+ rsyslog:
+ charm: "cs:~bigdata-dev/xenial/rsyslog-6"
+ num_units: 1
+ annotations:
+ gui-x: "1000"
+ gui-y: "800"
+ to:
+ - "8"
+ rsyslog-forwarder-ha:
+ charm: "cs:~bigdata-dev/xenial/rsyslog-forwarder-ha-7"
+ annotations:
+ gui-x: "750"
+ gui-y: "400"
+series: xenial
+relations:
+ - [resourcemanager, namenode]
+ - [namenode, slave]
+ - [resourcemanager, slave]
+ - [plugin, namenode]
+ - [plugin, resourcemanager]
+ - [client, plugin]
+ - [spark, plugin]
+ - [spark, zookeeper]
+ - ["ganglia-node:juju-info", "client:juju-info"]
+ - ["ganglia-node:juju-info", "namenode:juju-info"]
+ - ["ganglia-node:juju-info", "resourcemanager:juju-info"]
+ - ["ganglia-node:juju-info", "slave:juju-info"]
+ - ["ganglia-node:juju-info", "spark:juju-info"]
+ - ["ganglia-node:juju-info", "zookeeper:juju-info"]
+ - ["ganglia:node", "ganglia-node:node"]
+ - ["rsyslog-forwarder-ha:juju-info", "client:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "namenode:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "resourcemanager:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "slave:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "spark:juju-info"]
+ - ["rsyslog-forwarder-ha:juju-info", "zookeeper:juju-info"]
+ - ["rsyslog:aggregator", "rsyslog-forwarder-ha:syslog"]
+machines:
+ "0":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "1":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "2":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "3":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "4":
+ constraints: "mem=7G root-disk=32G"
+ series: "xenial"
+ "5":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "6":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "7":
+ constraints: "mem=3G root-disk=32G"
+ series: "xenial"
+ "8":
+ constraints: "mem=3G"
+ series: "xenial"
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/copyright
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/copyright b/bigtop-deploy/juju/hadoop-spark/copyright
new file mode 100644
index 0000000..e900b97
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/copyright
@@ -0,0 +1,16 @@
+Format: http://dep.debian.net/deps/dep5/
+
+Files: *
+Copyright: Copyright 2015, Canonical Ltd., All Rights Reserved.
+License: Apache License 2.0
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ .
+ http://www.apache.org/licenses/LICENSE-2.0
+ .
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py b/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py
new file mode 100755
index 0000000..ba292bc
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/tests/01-bundle.py
@@ -0,0 +1,137 @@
+#!/usr/bin/env python3
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import amulet
+import os
+import re
+import unittest
+import yaml
+
+
+class TestBundle(unittest.TestCase):
+ bundle_file = os.path.join(os.path.dirname(__file__), '..', 'bundle.yaml')
+
+ @classmethod
+ def setUpClass(cls):
+ # classmethod inheritance doesn't work quite right with
+ # setUpClass / tearDownClass, so subclasses have to manually call this
+ cls.d = amulet.Deployment(series='xenial')
+ with open(cls.bundle_file) as f:
+ bun = f.read()
+ bundle = yaml.safe_load(bun)
+
+ # NB: strip machine ('to') placement out. amulet loses our machine spec
+ # somewhere between yaml and json; without that spec, charms specifying
+ # machine placement will not deploy. This is ok for now because all
+ # charms in this bundle are using 'reset: false' so we'll already
+ # have our deployment just the way we want it by the time this test
+ # runs. However, it's bad. Remove once this is fixed:
+ # https://github.com/juju/amulet/issues/148
+ for service, service_config in bundle['services'].items():
+ if 'to' in service_config:
+ del service_config['to']
+
+ cls.d.load(bundle)
+ cls.d.setup(timeout=3600)
+
+ # we need units reporting ready before we attempt our smoke tests
+ cls.d.sentry.wait_for_messages({'client': re.compile('ready'),
+ 'namenode': re.compile('ready'),
+ 'resourcemanager': re.compile('ready'),
+ 'slave': re.compile('ready'),
+ 'spark': re.compile('ready'),
+ }, timeout=3600)
+ cls.hdfs = cls.d.sentry['namenode'][0]
+ cls.yarn = cls.d.sentry['resourcemanager'][0]
+ cls.slave = cls.d.sentry['slave'][0]
+ cls.spark = cls.d.sentry['spark'][0]
+
+ def test_components(self):
+ """
+ Confirm that all of the required components are up and running.
+ """
+ hdfs, retcode = self.hdfs.run("pgrep -a java")
+ yarn, retcode = self.yarn.run("pgrep -a java")
+ slave, retcode = self.slave.run("pgrep -a java")
+ spark, retcode = self.spark.run("pgrep -a java")
+
+ assert 'NameNode' in hdfs, "NameNode not started"
+ assert 'NameNode' not in slave, "NameNode should not be running on slave"
+
+ assert 'ResourceManager' in yarn, "ResourceManager not started"
+ assert 'ResourceManager' not in slave, "ResourceManager should not be running on slave"
+
+ assert 'JobHistoryServer' in yarn, "JobHistoryServer not started"
+ assert 'JobHistoryServer' not in slave, "JobHistoryServer should not be running on slave"
+
+ assert 'NodeManager' in slave, "NodeManager not started"
+ assert 'NodeManager' not in yarn, "NodeManager should not be running on resourcemanager"
+ assert 'NodeManager' not in hdfs, "NodeManager should not be running on namenode"
+
+ assert 'DataNode' in slave, "DataNode not started"
+ assert 'DataNode' not in yarn, "DataNode should not be running on resourcemanager"
+ assert 'DataNode' not in hdfs, "DataNode should not be running on namenode"
+
+ assert 'Master' in spark, "Spark Master not started"
+
+ def test_hdfs(self):
+ """
+ Validates mkdir, ls, chmod, and rm HDFS operations.
+ """
+ uuid = self.hdfs.run_action('smoke-test')
+ result = self.d.action_fetch(uuid, timeout=600, full_output=True)
+ # action status=completed on success
+ if result['status'] != "completed":
+ self.fail('HDFS smoke-test did not complete: %s' % result)
+
+ def test_yarn(self):
+ """
+ Validates YARN using the Bigtop 'yarn' smoke test.
+ """
+ uuid = self.yarn.run_action('smoke-test')
+ # 'yarn' smoke takes a while (bigtop tests download lots of stuff)
+ result = self.d.action_fetch(uuid, timeout=1800, full_output=True)
+ # action status=completed on success
+ if result['status'] != "completed":
+ self.fail('YARN smoke-test did not complete: %s' % result)
+
+ def test_spark(self):
+ """
+ Validates Spark with a simple sparkpi test.
+ """
+ uuid = self.spark.run_action('smoke-test')
+ result = self.d.action_fetch(uuid, timeout=600, full_output=True)
+ # action status=completed on success
+ if result['status'] != "completed":
+ self.fail('Spark smoke-test did not complete: %s' % result)
+
+ @unittest.skip(
+ 'Skipping slave smoke tests; they are too inconsistent and long running for CWR.')
+ def test_slave(self):
+ """
+ Validates slave using the Bigtop 'hdfs' and 'mapred' smoke test.
+ """
+ uuid = self.slave.run_action('smoke-test')
+ # 'hdfs+mapred' smoke takes a long while (bigtop tests are slow)
+ result = self.d.action_fetch(uuid, timeout=3600, full_output=True)
+ # action status=completed on success
+ if result['status'] != "completed":
+ self.fail('Slave smoke-test did not complete: %s' % result)
+
+
+if __name__ == '__main__':
+ unittest.main()
http://git-wip-us.apache.org/repos/asf/bigtop/blob/f4d023b4/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml
----------------------------------------------------------------------
diff --git a/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml b/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml
new file mode 100644
index 0000000..c9325b0
--- /dev/null
+++ b/bigtop-deploy/juju/hadoop-spark/tests/tests.yaml
@@ -0,0 +1,7 @@
+reset: false
+deployment_timeout: 7200
+sources:
+ - 'ppa:juju/stable'
+packages:
+ - amulet
+ - python3-yaml