Posted to commits@sdap.apache.org by fg...@apache.org on 2019/03/06 18:57:21 UTC

[incubator-sdap-nexus] branch 1.1.0-SNAPSHOT updated: Merge master into 1.1.0 (#62)

This is an automated email from the ASF dual-hosted git repository.

fgreg pushed a commit to branch 1.1.0-SNAPSHOT
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git


The following commit(s) were added to refs/heads/1.1.0-SNAPSHOT by this push:
     new 4f6e8df  Merge master into 1.1.0 (#62)
4f6e8df is described below

commit 4f6e8dfb46df96335e0d55624a36acd075c50a49
Author: fgreg <fg...@gmail.com>
AuthorDate: Wed Mar 6 10:57:14 2019 -0800

    Merge master into 1.1.0 (#62)
    
    * SDAP-151 Determine parallelism automatically for Spark analytics (#50)
    
    * Removed Spark configuration, added nparts configuration, and added automatic computation of parallelism for Spark-based time series.
    
    * SDAP-151 Determine parallelism automatically for Spark analytics
    
    * SDAP-105 DOMS matchup netcdf and csv generation (#61)
    
    * adding netcdf generation, modified csv generation to line up with netcdf
    
    * adding all recorded variables, calc max and min, refactored
    
    * things work! testing locally with AVHRR using /domsresults endpoint
    
    * removing json files from git tracking
    
    * refactor
    
    * removing all files used for testing purposes
    
    * Consolidate depth to a single field, fix valid_min bug
    
    * minor fixes, typos
    
    * fix the way depth is handled, add comments, get rid of unnecessary string comparison
    
    * update metadata links to be pulled from config.py
    
    * add keywords
    
    * SDAP-173 Fix Hovmoller code reporting missing get_spark_cfg attribute (#65)
    
    * SDAP-173 Fix Hovmoller code reporting missing get_spark_cfg attribute
    
    * SDAP-166 SolrCloud Docker Image (#63)
    
    * Now using INIT_SOLR_HOME="yes" option for creating the nexustiles core in singlenode.
    
    * New Solr Cloud docker image
    
    * Overhauled Solr images. Includes new Solr Cloud image and init container.
---
 README.md                                          |   4 +-
 .../webservice/algorithms/doms/BaseDomsHandler.py  | 729 +++++++++------------
 analysis/webservice/algorithms/doms/config.py      |  11 +
 .../webservice/algorithms_spark/HofMoellerSpark.py |   9 +-
 docker/README.md                                   |   1 -
 docker/Readme.rst                                  |   1 +
 docker/solr-single-node/README.md                  |  10 -
 docker/solr/Dockerfile                             |   9 +-
 docker/solr/README.md                              |   0
 docker/solr/Readme.rst                             |  48 ++
 .../cloud-init}/Dockerfile                         |  20 +-
 docker/solr/cloud-init/Readme.rst                  |  73 +++
 docker/solr/cloud-init/create-collection.py        | 111 ++++
 docker/{solr-single-node => solr/cloud}/Dockerfile |  18 +-
 docker/solr/cloud/Readme.rst                       |  93 +++
 .../docker-entrypoint-initdb.d/0-init-home.sh}     |  13 +-
 .../docker-entrypoint-initdb.d/1-bootstrap-zk.sh}  |  10 +-
 docker/solr/cloud/tmp/solr.xml                     |  53 ++
 docker/solr/cloud/tmp/zoo.cfg                      |  31 +
 .../singlenode}/Dockerfile                         |   9 +-
 docker/solr/singlenode/Readme.rst                  |  42 ++
 .../singlenode}/create-core.sh                     |   2 +-
 docs/conf.py                                       |   7 +-
 docs/dockerimages.rst                              |  10 +
 docs/index.rst                                     |   5 +
 docs/intro.rst                                     |   3 +
 26 files changed, 852 insertions(+), 470 deletions(-)

diff --git a/README.md b/README.md
index 88d1aae..e5499d2 100644
--- a/README.md
+++ b/README.md
@@ -5,9 +5,9 @@ The next generation cloud-based science data service platform. More information
 
 ## Building the Docs
 
-Ensure sphinx and sphinx-autobuild are installed
+Ensure sphinx, sphinx-autobuild, and recommonmark are installed. We use the recommonmark module for parsing Markdown files.
 
-    pip install sphinx sphinx-autobuild
+    pip install sphinx sphinx-autobuild recommonmark
 
 Run sphinx-autobuild to view the docs locally.
 
diff --git a/analysis/webservice/algorithms/doms/BaseDomsHandler.py b/analysis/webservice/algorithms/doms/BaseDomsHandler.py
index 586624d..c9b8acf 100644
--- a/analysis/webservice/algorithms/doms/BaseDomsHandler.py
+++ b/analysis/webservice/algorithms/doms/BaseDomsHandler.py
@@ -14,9 +14,11 @@
 # limitations under the License.
 
 import StringIO
+import os
 import csv
 import json
 from datetime import datetime
+import time
 from decimal import Decimal
 
 import numpy as np
@@ -38,6 +40,7 @@ except ImportError:
     from gdalnumeric import *
 
 from netCDF4 import Dataset
+import netCDF4
 import tempfile
 
 
@@ -96,7 +99,7 @@ class DomsQueryResults(NexusResults):
         return DomsCSVFormatter.create(self.__executionId, self.results(), self.__args, self.__details)
 
     def toNetCDF(self):
-        return DomsNetCDFFormatterAlt.create(self.__executionId, self.results(), self.__args, self.__details)
+        return DomsNetCDFFormatter.create(self.__executionId, self.results(), self.__args, self.__details)
 
 
 class DomsCSVFormatter:
@@ -109,7 +112,7 @@ class DomsCSVFormatter:
             DomsCSVFormatter.__addDynamicAttrs(csv_mem_file, executionId, results, params, details)
             csv.writer(csv_mem_file).writerow([])
 
-            DomsCSVFormatter.__packValues(csv_mem_file, results)
+            DomsCSVFormatter.__packValues(csv_mem_file, results, params)
 
             csv_out = csv_mem_file.getvalue()
         finally:
@@ -118,47 +121,60 @@ class DomsCSVFormatter:
         return csv_out
 
     @staticmethod
-    def __packValues(csv_mem_file, results):
+    def __packValues(csv_mem_file, results, params):
 
         writer = csv.writer(csv_mem_file)
 
         headers = [
             # Primary
-            "id", "source", "lon", "lat", "time", "platform", "sea_water_salinity_depth", "sea_water_salinity",
-            "sea_water_temperature_depth", "sea_water_temperature", "wind_speed", "wind_direction", "wind_u", "wind_v",
+            "id", "source", "lon (degrees_east)", "lat (degrees_north)", "time", "platform",
+            "sea_surface_salinity (1e-3)", "sea_surface_temperature (degree_C)", "wind_speed (m s-1)", "wind_direction",
+            "wind_u (m s-1)", "wind_v (m s-1)",
             # Match
-            "id", "source", "lon", "lat", "time", "platform", "sea_water_salinity_depth", "sea_water_salinity",
-            "sea_water_temperature_depth", "sea_water_temperature", "wind_speed", "wind_direction", "wind_u", "wind_v"
+            "id", "source", "lon (degrees_east)", "lat (degrees_north)", "time", "platform",
+            "depth (m)", "sea_water_salinity (1e-3)",
+            "sea_water_temperature (degree_C)", "wind_speed (m s-1)",
+            "wind_direction", "wind_u (m s-1)", "wind_v (m s-1)"
         ]
 
         writer.writerow(headers)
 
+        #
+        # Only include the depth variable related to the match-up parameter. If the match-up parameter
+        # is not sss or sst then do not include any depth data, just fill values.
+        #
+        if params["parameter"] == "sss":
+            depth = "sea_water_salinity_depth"
+        elif params["parameter"] == "sst":
+            depth = "sea_water_temperature_depth"
+        else:
+            depth = "NO_DEPTH"
+
         for primaryValue in results:
             for matchup in primaryValue["matches"]:
                 row = [
                     # Primary
                     primaryValue["id"], primaryValue["source"], str(primaryValue["x"]), str(primaryValue["y"]),
                     primaryValue["time"].strftime(ISO_8601), primaryValue["platform"],
-                    primaryValue.get("sea_water_salinity_depth", ""), primaryValue.get("sea_water_salinity", ""),
-                    primaryValue.get("sea_water_temperature_depth", ""), primaryValue.get("sea_water_temperature", ""),
+                    primaryValue.get("sea_water_salinity", ""), primaryValue.get("sea_water_temperature", ""),
                     primaryValue.get("wind_speed", ""), primaryValue.get("wind_direction", ""),
                     primaryValue.get("wind_u", ""), primaryValue.get("wind_v", ""),
 
                     # Matchup
                     matchup["id"], matchup["source"], matchup["x"], matchup["y"],
                     matchup["time"].strftime(ISO_8601), matchup["platform"],
-                    matchup.get("sea_water_salinity_depth", ""), matchup.get("sea_water_salinity", ""),
-                    matchup.get("sea_water_temperature_depth", ""), matchup.get("sea_water_temperature", ""),
+                    matchup.get(depth, ""), matchup.get("sea_water_salinity", ""),
+                    matchup.get("sea_water_temperature", ""),
                     matchup.get("wind_speed", ""), matchup.get("wind_direction", ""),
                     matchup.get("wind_u", ""), matchup.get("wind_v", ""),
                 ]
-
                 writer.writerow(row)
 
     @staticmethod
     def __addConstants(csvfile):
 
         global_attrs = [
+            {"Global Attribute": "product_version", "Value": "1.0"},
             {"Global Attribute": "Conventions", "Value": "CF-1.6, ACDD-1.3"},
             {"Global Attribute": "title", "Value": "DOMS satellite-insitu machup output file"},
             {"Global Attribute": "history",
@@ -173,7 +189,9 @@ class DomsCSVFormatter:
             {"Global Attribute": "keywords_vocabulary",
              "Value": "NASA Global Change Master Directory (GCMD) Science Keywords"},
             # TODO What should the keywords be?
-            {"Global Attribute": "keywords", "Value": ""},
+            {"Global Attribute": "keywords", "Value": "SATELLITES, OCEAN PLATFORMS, SHIPS, BUOYS, MOORINGS, AUVS, ROV, "
+                                                      "NASA/JPL/PODAAC, FSU/COAPS, UCAR/NCAR, SALINITY, "
+                                                      "SEA SURFACE TEMPERATURE, SURFACE WINDS"},
             {"Global Attribute": "creator_name", "Value": "NASA PO.DAAC"},
             {"Global Attribute": "creator_email", "Value": "podaac@podaac.jpl.nasa.gov"},
             {"Global Attribute": "creator_url", "Value": "https://podaac.jpl.nasa.gov/"},
@@ -196,14 +214,20 @@ class DomsCSVFormatter:
             for match in primaryValue['matches']:
                 platforms.add(match['platform'])
 
+        # insituDatasets = params["matchup"].split(",")
+        insituDatasets = params["matchup"]
+        insituLinks = set()
+        for insitu in insituDatasets:
+            insituLinks.add(config.METADATA_LINKS[insitu])
+
+
         global_attrs = [
             {"Global Attribute": "Platform", "Value": ', '.join(platforms)},
             {"Global Attribute": "time_coverage_start",
              "Value": params["startTime"].strftime(ISO_8601)},
             {"Global Attribute": "time_coverage_end",
              "Value": params["endTime"].strftime(ISO_8601)},
-            # TODO I don't think this applies
-            # {"Global Attribute": "time_coverage_resolution", "Value": "point"},
+            {"Global Attribute": "time_coverage_resolution", "Value": "point"},
 
             {"Global Attribute": "geospatial_lon_min", "Value": params["bbox"].split(',')[0]},
             {"Global Attribute": "geospatial_lat_min", "Value": params["bbox"].split(',')[1]},
@@ -223,31 +247,25 @@ class DomsCSVFormatter:
             {"Global Attribute": "DOMS_matchID", "Value": executionId},
             {"Global Attribute": "DOMS_TimeWindow", "Value": params["timeTolerance"] / 60 / 60},
             {"Global Attribute": "DOMS_TimeWindow_Units", "Value": "hours"},
-            {"Global Attribute": "DOMS_depth_min", "Value": params["depthMin"]},
-            {"Global Attribute": "DOMS_depth_min_units", "Value": "m"},
-            {"Global Attribute": "DOMS_depth_max", "Value": params["depthMax"]},
-            {"Global Attribute": "DOMS_depth_max_units", "Value": "m"},
 
             {"Global Attribute": "DOMS_platforms", "Value": params["platforms"]},
             {"Global Attribute": "DOMS_SearchRadius", "Value": params["radiusTolerance"]},
             {"Global Attribute": "DOMS_SearchRadius_Units", "Value": "m"},
-            {"Global Attribute": "DOMS_bounding_box", "Value": params["bbox"]},
 
+            {"Global Attribute": "DOMS_DatasetMetadata", "Value": ', '.join(insituLinks)},
             {"Global Attribute": "DOMS_primary", "Value": params["primary"]},
-            {"Global Attribute": "DOMS_match-up", "Value": ",".join(params["matchup"])},
+            {"Global Attribute": "DOMS_match_up", "Value": params["matchup"]},
             {"Global Attribute": "DOMS_ParameterPrimary", "Value": params.get("parameter", "")},
 
             {"Global Attribute": "DOMS_time_to_complete", "Value": details["timeToComplete"]},
             {"Global Attribute": "DOMS_time_to_complete_units", "Value": "seconds"},
             {"Global Attribute": "DOMS_num_matchup_matched", "Value": details["numInSituMatched"]},
             {"Global Attribute": "DOMS_num_primary_matched", "Value": details["numGriddedMatched"]},
-            {"Global Attribute": "DOMS_num_matchup_checked",
-             "Value": details["numInSituChecked"] if details["numInSituChecked"] != 0 else "N/A"},
-            {"Global Attribute": "DOMS_num_primary_checked",
-             "Value": details["numGriddedChecked"] if details["numGriddedChecked"] != 0 else "N/A"},
 
             {"Global Attribute": "date_modified", "Value": datetime.utcnow().replace(tzinfo=UTC).strftime(ISO_8601)},
             {"Global Attribute": "date_created", "Value": datetime.utcnow().replace(tzinfo=UTC).strftime(ISO_8601)},
+
+            {"Global Attribute": "URI_Matchup", "Value": "http://{webservice}/domsresults?id=" + executionId + "&output=CSV"},
         ]
 
         writer = csv.DictWriter(csvfile, sorted(next(iter(global_attrs)).keys()))
@@ -258,31 +276,22 @@ class DomsCSVFormatter:
 class DomsNetCDFFormatter:
     @staticmethod
     def create(executionId, results, params, details):
+
         t = tempfile.mkstemp(prefix="doms_", suffix=".nc")
         tempFileName = t[1]
 
         dataset = Dataset(tempFileName, "w", format="NETCDF4")
+        dataset.DOMS_matchID = executionId
+        DomsNetCDFFormatter.__addNetCDFConstants(dataset)
 
-        dataset.matchID = executionId
-        dataset.Matchup_TimeWindow = params["timeTolerance"]
-        dataset.Matchup_TimeWindow_Units = "hours"
-
-        dataset.time_coverage_start = datetime.fromtimestamp(params["startTime"] / 1000).strftime('%Y%m%d %H:%M:%S')
-        dataset.time_coverage_end = datetime.fromtimestamp(params["endTime"] / 1000).strftime('%Y%m%d %H:%M:%S')
-        dataset.depth_min = params["depthMin"]
-        dataset.depth_max = params["depthMax"]
-        dataset.platforms = params["platforms"]
-
-        dataset.Matchup_SearchRadius = params["radiusTolerance"]
-        dataset.Matchup_SearchRadius_Units = "m"
-
-        dataset.bounding_box = params["bbox"]
-        dataset.primary = params["primary"]
-        dataset.secondary = ",".join(params["matchup"])
-
-        dataset.Matchup_ParameterPrimary = params["parameter"] if "parameter" in params else ""
-
+        dataset.date_modified = datetime.utcnow().replace(tzinfo=UTC).strftime(ISO_8601)
+        dataset.date_created = datetime.utcnow().replace(tzinfo=UTC).strftime(ISO_8601)
+        dataset.time_coverage_start = params["startTime"].strftime(ISO_8601)
+        dataset.time_coverage_end = params["endTime"].strftime(ISO_8601)
         dataset.time_coverage_resolution = "point"
+        dataset.DOMS_match_up = params["matchup"]
+        dataset.DOMS_num_matchup_matched = details["numInSituMatched"]
+        dataset.DOMS_num_primary_matched = details["numGriddedMatched"]
 
         bbox = geo.BoundingBox(asString=params["bbox"])
         dataset.geospatial_lat_max = bbox.north
@@ -293,54 +302,56 @@ class DomsNetCDFFormatter:
         dataset.geospatial_lon_resolution = "point"
         dataset.geospatial_lat_units = "degrees_north"
         dataset.geospatial_lon_units = "degrees_east"
-        dataset.geospatial_vertical_min = 0.0
-        dataset.geospatial_vertical_max = params["radiusTolerance"]
+        dataset.geospatial_vertical_min = float(params["depthMin"])
+        dataset.geospatial_vertical_max = float(params["depthMax"])
         dataset.geospatial_vertical_units = "m"
         dataset.geospatial_vertical_resolution = "point"
         dataset.geospatial_vertical_positive = "down"
 
-        dataset.time_to_complete = details["timeToComplete"]
-        dataset.num_insitu_matched = details["numInSituMatched"]
-        dataset.num_gridded_checked = details["numGriddedChecked"]
-        dataset.num_gridded_matched = details["numGriddedMatched"]
-        dataset.num_insitu_checked = details["numInSituChecked"]
+        dataset.DOMS_TimeWindow = params["timeTolerance"] / 60 / 60
+        dataset.DOMS_TimeWindow_Units = "hours"
+        dataset.DOMS_SearchRadius = float(params["radiusTolerance"])
+        dataset.DOMS_SearchRadius_Units = "m"
+        # dataset.URI_Subset = "http://webservice subsetting query request"
+        dataset.URI_Matchup = "http://{webservice}/domsresults?id=" + executionId + "&output=NETCDF"
+        dataset.DOMS_ParameterPrimary = params["parameter"] if "parameter" in params else ""
+        dataset.DOMS_platforms = params["platforms"]
+        dataset.DOMS_primary = params["primary"]
+        dataset.DOMS_time_to_complete = details["timeToComplete"]
+        dataset.DOMS_time_to_complete_units = "seconds"
+
+        insituDatasets = params["matchup"]
+        insituLinks = set()
+        for insitu in insituDatasets:
+            insituLinks.add(config.METADATA_LINKS[insitu])
+        dataset.DOMS_DatasetMetadata = ', '.join(insituLinks)
 
-        dataset.date_modified = datetime.now().strftime('%Y%m%d %H:%M:%S')
-        dataset.date_created = datetime.now().strftime('%Y%m%d %H:%M:%S')
-
-        DomsNetCDFFormatter.__addNetCDFConstants(dataset)
-
-        idList = []
-        primaryIdList = []
-        DomsNetCDFFormatter.__packDataIntoDimensions(idList, primaryIdList, results)
-
-        idDim = dataset.createDimension("id", size=None)
-        primaryIdDim = dataset.createDimension("primary_id", size=None)
-
-        idVar = dataset.createVariable("id", "i4", ("id",), chunksizes=(2048,))
-        primaryIdVar = dataset.createVariable("primary_id", "i4", ("primary_id",), chunksizes=(2048,))
+        platforms = set()
+        for primaryValue in results:
+            platforms.add(primaryValue['platform'])
+            for match in primaryValue['matches']:
+                platforms.add(match['platform'])
+        dataset.platform = ', '.join(platforms)
 
-        idVar[:] = idList
-        primaryIdVar[:] = primaryIdList
+        satellite_group_name = "SatelliteData"
+        insitu_group_name = "InsituData"
 
-        DomsNetCDFFormatter.__createDimension(dataset, results, "lat", "f4", "y")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "lon", "f4", "x")
+        #Create Satellite group, variables, and attributes
+        satelliteGroup = dataset.createGroup(satellite_group_name)
+        satelliteWriter = DomsNetCDFValueWriter(satelliteGroup, params["parameter"])
 
-        DomsNetCDFFormatter.__createDimension(dataset, results, "sea_water_temperature_depth", "f4",
-                                              "sea_water_temperature_depth")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "sea_water_temperature", "f4", "sea_water_temperature")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "sea_water_salinity_depth", "f4",
-                                              "sea_water_salinity_depth")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "sea_water_salinity", "f4", "sea_water_salinity")
+        # Create InSitu group, variables, and attributes
+        insituGroup = dataset.createGroup(insitu_group_name)
+        insituWriter = DomsNetCDFValueWriter(insituGroup, params["parameter"])
 
-        DomsNetCDFFormatter.__createDimension(dataset, results, "wind_speed", "f4", "wind_speed")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "wind_direction", "f4", "wind_direction")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "wind_u", "f4", "wind_u")
-        DomsNetCDFFormatter.__createDimension(dataset, results, "wind_v", "f4", "wind_v")
+        # Add data to Insitu and Satellite groups, generate array of match ID pairs
+        matches = DomsNetCDFFormatter.__writeResults(results, satelliteWriter, insituWriter)
+        dataset.createDimension("MatchedRecords", size=None)
+        dataset.createDimension("MatchedGroups", size=2)
+        matchArray = dataset.createVariable("matchIDs", "f4", ("MatchedRecords", "MatchedGroups"))
+        matchArray[:] = matches
 
-        DomsNetCDFFormatter.__createDimension(dataset, results, "time", "f4", "time")
         dataset.close()
-
         f = open(tempFileName, "rb")
         data = f.read()
         f.close()
@@ -348,199 +359,8 @@ class DomsNetCDFFormatter:
         return data
 
     @staticmethod
-    def __packDataIntoDimensions(idVar, primaryIdVar, values, primaryValueId=None):
-
-        for value in values:
-            id = hash(value["id"])
-            idVar.append(id)
-            primaryIdVar.append(primaryValueId if primaryValueId is not None else -1)
-
-            if "matches" in value and len(value["matches"]) > 0:
-                DomsNetCDFFormatter.__packDataIntoDimensions(idVar, primaryIdVar, value["matches"], id)
-
-    @staticmethod
-    def __packDimensionList(values, field, varList):
-        for value in values:
-            if field in value:
-                varList.append(value[field])
-            else:
-                varList.append(np.nan)
-            if "matches" in value and len(value["matches"]) > 0:
-                DomsNetCDFFormatter.__packDimensionList(value["matches"], field, varList)
-
-    @staticmethod
-    def __createDimension(dataset, values, name, type, arrayField):
-        dim = dataset.createDimension(name, size=None)
-        var = dataset.createVariable(name, type, (name,), chunksizes=(2048,), fill_value=-32767.0)
-
-        varList = []
-        DomsNetCDFFormatter.__packDimensionList(values, arrayField, varList)
-
-        var[:] = varList
-
-        if name == "lon":
-            DomsNetCDFFormatter.__enrichLonVariable(var)
-        elif name == "lat":
-            DomsNetCDFFormatter.__enrichLatVariable(var)
-        elif name == "time":
-            DomsNetCDFFormatter.__enrichTimeVariable(var)
-        elif name == "sea_water_salinity":
-            DomsNetCDFFormatter.__enrichSSSVariable(var)
-        elif name == "sea_water_salinity_depth":
-            DomsNetCDFFormatter.__enrichSSSDepthVariable(var)
-        elif name == "sea_water_temperature":
-            DomsNetCDFFormatter.__enrichSSTVariable(var)
-        elif name == "sea_water_temperature_depth":
-            DomsNetCDFFormatter.__enrichSSTDepthVariable(var)
-        elif name == "wind_direction":
-            DomsNetCDFFormatter.__enrichWindDirectionVariable(var)
-        elif name == "wind_speed":
-            DomsNetCDFFormatter.__enrichWindSpeedVariable(var)
-        elif name == "wind_u":
-            DomsNetCDFFormatter.__enrichWindUVariable(var)
-        elif name == "wind_v":
-            DomsNetCDFFormatter.__enrichWindVVariable(var)
-
-    @staticmethod
-    def __enrichSSSVariable(var):
-        var.long_name = "sea surface salinity"
-        var.standard_name = "sea_surface_salinity"
-        var.units = "1e-3"
-        var.valid_min = 30
-        var.valid_max = 40
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichSSSDepthVariable(var):
-        var.long_name = "sea surface salinity_depth"
-        var.standard_name = "sea_surface_salinity_depth"
-        var.units = "m"
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichSSTVariable(var):
-        var.long_name = "sea surface temperature"
-        var.standard_name = "sea_surface_temperature"
-        var.units = "c"
-        var.valid_min = -3
-        var.valid_max = 50
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichSSTDepthVariable(var):
-        var.long_name = "sea surface temperature_depth"
-        var.standard_name = "sea_surface_temperature_depth"
-        var.units = "m"
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichWindDirectionVariable(var):
-        var.long_name = "wind direction"
-        var.standard_name = "wind_direction"
-        var.units = "degrees"
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichWindSpeedVariable(var):
-        var.long_name = "wind speed"
-        var.standard_name = "wind_speed"
-        var.units = "km/h"
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichWindUVariable(var):
-        var.long_name = "wind u"
-        var.standard_name = "wind_u"
-        var.units = ""
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichWindVVariable(var):
-        var.long_name = "wind v"
-        var.standard_name = "wind_v"
-        var.units = ""
-        var.scale_factor = 1.0
-        var.add_offset = 0.0
-        var.coordinates = "lon lat time"
-        var.grid_mapping = "crs"
-        var.comment = ""
-        var.cell_methods = ""
-        var.metadata_link = ""
-
-    @staticmethod
-    def __enrichTimeVariable(var):
-        var.long_name = "Time"
-        var.standard_name = "time"
-        var.axis = "T"
-        var.units = "seconds since 1970-01-01 00:00:00 0:00"
-        var.calendar = "standard"
-        var.comment = "Nominal time of satellite corresponding to the start of the product time interval"
-
-    @staticmethod
-    def __enrichLonVariable(var):
-        var.long_name = "Longitude"
-        var.standard_name = "longitude"
-        var.axis = "X"
-        var.units = "degrees_east"
-        var.valid_min = -180.0
-        var.valid_max = 180.0
-        var.comment = "Data longitude for in-situ, midpoint beam for satellite measurements."
-
-    @staticmethod
-    def __enrichLatVariable(var):
-        var.long_name = "Latitude"
-        var.standard_name = "latitude"
-        var.axis = "Y"
-        var.units = "degrees_north"
-        var.valid_min = -90.0
-        var.valid_max = 90.0
-        var.comment = "Data latitude for in-situ, midpoint beam for satellite measurements."
-
-    @staticmethod
     def __addNetCDFConstants(dataset):
-        dataset.bnds = 2
+        dataset.product_version = "1.0"
         dataset.Conventions = "CF-1.6, ACDD-1.3"
         dataset.title = "DOMS satellite-insitu machup output file"
         dataset.history = "Processing_Version = V1.0, Software_Name = DOMS, Software_Version = 1.03"
@@ -549,176 +369,267 @@ class DomsNetCDFFormatter:
         dataset.standard_name_vocabulary = "CF Standard Name Table v27", "BODC controlled vocabulary"
         dataset.cdm_data_type = "Point/Profile, Swath/Grid"
         dataset.processing_level = "4"
-        dataset.platform = "Endeavor"
-        dataset.instrument = "Endeavor on-board sea-bird SBE 9/11 CTD"
         dataset.project = "Distributed Oceanographic Matchup System (DOMS)"
         dataset.keywords_vocabulary = "NASA Global Change Master Directory (GCMD) Science Keywords"
-        dataset.keywords = "Salinity, Upper Ocean, SPURS, CTD, Endeavor, Atlantic Ocean"
+        dataset.keywords = "SATELLITES, OCEAN PLATFORMS, SHIPS, BUOYS, MOORINGS, AUVS, ROV, NASA/JPL/PODAAC, " \
+                           "FSU/COAPS, UCAR/NCAR, SALINITY, SEA SURFACE TEMPERATURE, SURFACE WINDS"
         dataset.creator_name = "NASA PO.DAAC"
         dataset.creator_email = "podaac@podaac.jpl.nasa.gov"
         dataset.creator_url = "https://podaac.jpl.nasa.gov/"
         dataset.publisher_name = "NASA PO.DAAC"
         dataset.publisher_email = "podaac@podaac.jpl.nasa.gov"
         dataset.publisher_url = "https://podaac.jpl.nasa.gov"
-        dataset.acknowledgment = "DOMS is a NASA/AIST-funded project.  Grant number ####."
-
+        dataset.acknowledgment = "DOMS is a NASA/AIST-funded project. NRA NNH14ZDA001N."
 
-class DomsNetCDFFormatterAlt:
     @staticmethod
-    def create(executionId, results, params, details):
-        t = tempfile.mkstemp(prefix="doms_", suffix=".nc")
-        tempFileName = t[1]
-
-        dataset = Dataset(tempFileName, "w", format="NETCDF4")
-
-        dataset.matchID = executionId
-        dataset.Matchup_TimeWindow = params["timeTolerance"]
-        dataset.Matchup_TimeWindow_Units = "hours"
-
-        dataset.time_coverage_start = datetime.fromtimestamp(params["startTime"] / 1000).strftime('%Y%m%d %H:%M:%S')
-        dataset.time_coverage_end = datetime.fromtimestamp(params["endTime"] / 1000).strftime('%Y%m%d %H:%M:%S')
-        dataset.depth_min = params["depthMin"]
-        dataset.depth_max = params["depthMax"]
-        dataset.platforms = params["platforms"]
-
-        dataset.Matchup_SearchRadius = params["radiusTolerance"]
-        dataset.Matchup_SearchRadius_Units = "m"
-
-        dataset.bounding_box = params["bbox"]
-        dataset.primary = params["primary"]
-        dataset.secondary = ",".join(params["matchup"])
-
-        dataset.Matchup_ParameterPrimary = params["parameter"] if "parameter" in params else ""
-
-        dataset.time_coverage_resolution = "point"
-
-        bbox = geo.BoundingBox(asString=params["bbox"])
-        dataset.geospatial_lat_max = bbox.north
-        dataset.geospatial_lat_min = bbox.south
-        dataset.geospatial_lon_max = bbox.east
-        dataset.geospatial_lon_min = bbox.west
-        dataset.geospatial_lat_resolution = "point"
-        dataset.geospatial_lon_resolution = "point"
-        dataset.geospatial_lat_units = "degrees_north"
-        dataset.geospatial_lon_units = "degrees_east"
-        dataset.geospatial_vertical_min = 0.0
-        dataset.geospatial_vertical_max = params["radiusTolerance"]
-        dataset.geospatial_vertical_units = "m"
-        dataset.geospatial_vertical_resolution = "point"
-        dataset.geospatial_vertical_positive = "down"
+    def __writeResults(results, satelliteWriter, insituWriter):
+        ids = {}
+        matches = []
+        insituIndex = 0
 
-        dataset.time_to_complete = details["timeToComplete"]
-        dataset.num_insitu_matched = details["numInSituMatched"]
-        dataset.num_gridded_checked = details["numGriddedChecked"]
-        dataset.num_gridded_matched = details["numGriddedMatched"]
-        dataset.num_insitu_checked = details["numInSituChecked"]
+        #
+        # Loop through all of the results, add each satellite data point to the array
+        #
+        for r in range(0, len(results)):
+            result = results[r]
+            satelliteWriter.addData(result)
 
-        dataset.date_modified = datetime.now().strftime('%Y%m%d %H:%M:%S')
-        dataset.date_created = datetime.now().strftime('%Y%m%d %H:%M:%S')
+            # Add each match only if it is not already in the array of in situ points
+            for match in result["matches"]:
+                if match["id"] not in ids:
+                    ids[match["id"]] = insituIndex
+                    insituIndex += 1
+                    insituWriter.addData(match)
 
-        DomsNetCDFFormatterAlt.__addNetCDFConstants(dataset)
+                # Append an index pair of (satellite, in situ) to the array of matches
+                matches.append((r, ids[match["id"]]))
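+                # For example (illustrative values only): if results[0] and results[1]
+                # both match the same single in situ record, matches ends up as
+                # [(0, 0), (1, 0)], i.e. indices into the satellite and in situ groups.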
 
-        satelliteGroup = dataset.createGroup("SatelliteData")
-        satelliteWriter = DomsNetCDFValueWriter(satelliteGroup)
+        # Add data/write to the netCDF file
+        satelliteWriter.writeGroup()
+        insituWriter.writeGroup()
 
-        insituGroup = dataset.createGroup("InsituData")
-        insituWriter = DomsNetCDFValueWriter(insituGroup)
+        return matches
 
-        matches = DomsNetCDFFormatterAlt.__writeResults(results, satelliteWriter, insituWriter)
 
-        satelliteWriter.commit()
-        insituWriter.commit()
+class DomsNetCDFValueWriter:
+    def __init__(self, group, matchup_parameter):
+        group.createDimension("dim", size=None)
+        self.group = group
 
-        satDim = dataset.createDimension("satellite_ids", size=None)
-        satVar = dataset.createVariable("satellite_ids", "i4", ("satellite_ids",), chunksizes=(2048,),
-                                        fill_value=-32767)
+        self.lat = []
+        self.lon = []
+        self.time = []
+        self.sea_water_salinity = []
+        self.wind_speed = []
+        self.wind_u = []
+        self.wind_v = []
+        self.wind_direction = []
+        self.sea_water_temperature = []
+        self.depth = []
+
+        self.satellite_group_name = "SatelliteData"
+        self.insitu_group_name = "InsituData"
+
+        #
+        # Only include the depth variable related to the match-up parameter. If the match-up parameter is
+        # not sss or sst then do not include any depth data, just fill values.
+        #
+        if matchup_parameter == "sss":
+            self.matchup_depth = "sea_water_salinity_depth"
+        elif matchup_parameter == "sst":
+            self.matchup_depth = "sea_water_temperature_depth"
+        else:
+            self.matchup_depth = "NO_DEPTH"
+
+    def addData(self, value):
+        self.lat.append(value.get("y", None))
+        self.lon.append(value.get("x", None))
+        self.time.append(time.mktime(value.get("time").timetuple()))
+        self.sea_water_salinity.append(value.get("sea_water_salinity", None))
+        self.wind_speed.append(value.get("wind_speed", None))
+        self.wind_u.append(value.get("wind_u", None))
+        self.wind_v.append(value.get("wind_v", None))
+        self.wind_direction.append(value.get("wind_direction", None))
+        self.sea_water_temperature.append(value.get("sea_water_temperature", None))
+        self.depth.append(value.get(self.matchup_depth, None))
+
+    def writeGroup(self):
+        #
+        # Create variables, enrich with attributes, and add data
+        #
+        lonVar = self.group.createVariable("lon", "f4", ("dim",), fill_value=-32767.0)
+        latVar = self.group.createVariable("lat", "f4", ("dim",), fill_value=-32767.0)
+        timeVar = self.group.createVariable("time", "f4", ("dim",), fill_value=-32767.0)
+
+        self.__enrichLon(lonVar, min(self.lon), max(self.lon))
+        self.__enrichLat(latVar, min(self.lat), max(self.lat))
+        self.__enrichTime(timeVar)
+
+        latVar[:] = self.lat
+        lonVar[:] = self.lon
+        timeVar[:] = self.time
+
+        if self.sea_water_salinity.count(None) != len(self.sea_water_salinity):
+            if self.group.name == self.satellite_group_name:
+                sssVar = self.group.createVariable("SeaSurfaceSalinity", "f4", ("dim",), fill_value=-32767.0)
+                self.__enrichSSSMeasurements(sssVar, min(self.sea_water_salinity), max(self.sea_water_salinity))
+            else:  # group.name == self.insitu_group_name
+                sssVar = self.group.createVariable("SeaWaterSalinity", "f4", ("dim",), fill_value=-32767.0)
+                self.__enrichSWSMeasurements(sssVar, min(self.sea_water_salinity), max(self.sea_water_salinity))
+            sssVar[:] = self.sea_water_salinity
+
+        if self.wind_speed.count(None) != len(self.wind_speed):
+            windSpeedVar = self.group.createVariable("WindSpeed", "f4", ("dim",), fill_value=-32767.0)
+            self.__enrichWindSpeed(windSpeedVar, self.__calcMin(self.wind_speed), max(self.wind_speed))
+            windSpeedVar[:] = self.wind_speed
+
+        if self.wind_u.count(None) != len(self.wind_u):
+            windUVar = self.group.createVariable("WindU", "f4", ("dim",), fill_value=-32767.0)
+            windUVar[:] = self.wind_u
+            self.__enrichWindU(windUVar, self.__calcMin(self.wind_u), max(self.wind_u))
+
+        if self.wind_v.count(None) != len(self.wind_v):
+            windVVar = self.group.createVariable("WindV", "f4", ("dim",), fill_value=-32767.0)
+            windVVar[:] = self.wind_v
+            self.__enrichWindV(windVVar, self.__calcMin(self.wind_v), max(self.wind_v))
+
+        if self.wind_direction.count(None) != len(self.wind_direction):
+            windDirVar = self.group.createVariable("WindDirection", "f4", ("dim",), fill_value=-32767.0)
+            windDirVar[:] = self.wind_direction
+            self.__enrichWindDir(windDirVar)
+
+        if self.sea_water_temperature.count(None) != len(self.sea_water_temperature):
+            if self.group.name == self.satellite_group_name:
+                tempVar = self.group.createVariable("SeaSurfaceTemp", "f4", ("dim",), fill_value=-32767.0)
+                self.__enrichSurfaceTemp(tempVar, self.__calcMin(self.sea_water_temperature), max(self.sea_water_temperature))
+            else:
+                tempVar = self.group.createVariable("SeaWaterTemp", "f4", ("dim",), fill_value=-32767.0)
+                self.__enrichWaterTemp(tempVar, self.__calcMin(self.sea_water_temperature), max(self.sea_water_temperature))
+            tempVar[:] = self.sea_water_temperature
 
-        satVar[:] = [f[0] for f in matches]
+        if self.group.name == self.insitu_group_name:
+            depthVar = self.group.createVariable("Depth", "f4", ("dim",), fill_value=-32767.0)
 
-        insituDim = dataset.createDimension("insitu_ids", size=None)
-        insituVar = dataset.createVariable("insitu_ids", "i4", ("insitu_ids",), chunksizes=(2048,),
-                                           fill_value=-32767)
-        insituVar[:] = [f[1] for f in matches]
+            if self.depth.count(None) != len(self.depth):
+                self.__enrichDepth(depthVar, self.__calcMin(self.depth), max(self.depth))
+                depthVar[:] = self.depth
+            else:
+                # If depth has no data, set all values to 0
+                tempDepth = [0 for x in range(len(self.depth))]
+                depthVar[:] = tempDepth
 
-        dataset.close()
+    #
+    # Lists may include 'None' values; to calculate the min these must be filtered out
+    #
+    @staticmethod
+    def __calcMin(var):
+        return min(x for x in var if x is not None)
 
-        f = open(tempFileName, "rb")
-        data = f.read()
-        f.close()
-        os.unlink(tempFileName)
-        return data
 
+    #
+    # Add attributes to each variable
+    #
     @staticmethod
-    def __writeResults(results, satelliteWriter, insituWriter):
-        ids = {}
-        matches = []
+    def __enrichLon(var, var_min, var_max):
+        var.long_name = "Longitude"
+        var.standard_name = "longitude"
+        var.axis = "X"
+        var.units = "degrees_east"
+        var.valid_min = var_min
+        var.valid_max = var_max
 
-        insituIndex = 0
+    @staticmethod
+    def __enrichLat(var, var_min, var_max):
+        var.long_name = "Latitude"
+        var.standard_name = "latitude"
+        var.axis = "Y"
+        var.units = "degrees_north"
+        var.valid_min = var_min
+        var.valid_max = var_max
 
-        for r in range(0, len(results)):
-            result = results[r]
-            satelliteWriter.write(result)
-            for match in result["matches"]:
-                if match["id"] not in ids:
-                    ids[match["id"]] = insituIndex
-                    insituIndex += 1
-                    insituWriter.write(match)
+    @staticmethod
+    def __enrichTime(var):
+        var.long_name = "Time"
+        var.standard_name = "time"
+        var.axis = "T"
+        var.units = "seconds since 1970-01-01 00:00:00 0:00"
 
-                matches.append((r, ids[match["id"]]))
+    @staticmethod
+    def __enrichSSSMeasurements(var, var_min, var_max):
+        var.long_name = "Sea surface salinity"
+        var.standard_name = "sea_surface_salinity"
+        var.units = "1e-3"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat time"
 
-        return matches
+    @staticmethod
+    def __enrichSWSMeasurements(var, var_min, var_max):
+        var.long_name = "Sea water salinity"
+        var.standard_name = "sea_water_salinity"
+        var.units = "1e-3"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat depth time"
 
     @staticmethod
-    def __addNetCDFConstants(dataset):
-        dataset.bnds = 2
-        dataset.Conventions = "CF-1.6, ACDD-1.3"
-        dataset.title = "DOMS satellite-insitu machup output file"
-        dataset.history = "Processing_Version = V1.0, Software_Name = DOMS, Software_Version = 1.03"
-        dataset.institution = "JPL, FSU, NCAR"
-        dataset.source = "doms.jpl.nasa.gov"
-        dataset.standard_name_vocabulary = "CF Standard Name Table v27", "BODC controlled vocabulary"
-        dataset.cdm_data_type = "Point/Profile, Swath/Grid"
-        dataset.processing_level = "4"
-        dataset.platform = "Endeavor"
-        dataset.instrument = "Endeavor on-board sea-bird SBE 9/11 CTD"
-        dataset.project = "Distributed Oceanographic Matchup System (DOMS)"
-        dataset.keywords_vocabulary = "NASA Global Change Master Directory (GCMD) Science Keywords"
-        dataset.keywords = "Salinity, Upper Ocean, SPURS, CTD, Endeavor, Atlantic Ocean"
-        dataset.creator_name = "NASA PO.DAAC"
-        dataset.creator_email = "podaac@podaac.jpl.nasa.gov"
-        dataset.creator_url = "https://podaac.jpl.nasa.gov/"
-        dataset.publisher_name = "NASA PO.DAAC"
-        dataset.publisher_email = "podaac@podaac.jpl.nasa.gov"
-        dataset.publisher_url = "https://podaac.jpl.nasa.gov"
-        dataset.acknowledgment = "DOMS is a NASA/AIST-funded project.  Grant number ####."
+    def __enrichDepth(var, var_min, var_max):
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.units = "m"
+        var.long_name = "Depth"
+        var.standard_name = "depth"
+        var.axis = "Z"
+        var.positive = "Down"
 
+    @staticmethod
+    def __enrichWindSpeed(var, var_min, var_max):
+        var.long_name = "Wind speed"
+        var.standard_name = "wind_speed"
+        var.units = "m s-1"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat depth time"
 
-class DomsNetCDFValueWriter:
-    def __init__(self, group):
-        self.latVar = DomsNetCDFValueWriter.__createDimension(group, "lat", "f4")
-        self.lonVar = DomsNetCDFValueWriter.__createDimension(group, "lon", "f4")
-        self.sstVar = DomsNetCDFValueWriter.__createDimension(group, "sea_water_temperature", "f4")
-        self.timeVar = DomsNetCDFValueWriter.__createDimension(group, "time", "f4")
+    @staticmethod
+    def __enrichWindU(var, var_min, var_max):
+        var.long_name = "Eastward wind"
+        var.standard_name = "eastward_wind"
+        var.units = "m s-1"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat depth time"
 
-        self.lat = []
-        self.lon = []
-        self.sst = []
-        self.time = []
+    @staticmethod
+    def __enrichWindV(var, var_min, var_max):
+        var.long_name = "Northward wind"
+        var.standard_name = "northward_wind"
+        var.units = "m s-1"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat depth time"
 
-    def write(self, value):
-        self.lat.append(value["y"])
-        self.lon.append(value["x"])
-        self.time.append(value["time"])
-        self.sst.append(value["sea_water_temperature"])
+    @staticmethod
+    def __enrichWaterTemp(var, var_min, var_max):
+        var.long_name = "Sea water temperature"
+        var.standard_name = "sea_water_temperature"
+        var.units = "degree_C"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat depth time"
 
-    def commit(self):
-        self.latVar[:] = self.lat
-        self.lonVar[:] = self.lon
-        self.sstVar[:] = self.sst
-        self.timeVar[:] = self.time
+    @staticmethod
+    def __enrichSurfaceTemp(var, var_min, var_max):
+        var.long_name = "Sea surface temperature"
+        var.standard_name = "sea_surface_temperature"
+        var.units = "degree_C"
+        var.valid_min = var_min
+        var.valid_max = var_max
+        var.coordinates = "lon lat time"
 
     @staticmethod
-    def __createDimension(group, name, type):
-        dim = group.createDimension(name, size=None)
-        var = group.createVariable(name, type, (name,), chunksizes=(2048,), fill_value=-32767.0)
-        return var
+    def __enrichWindDir(var):
+        var.long_name = "Wind from direction"
+        var.standard_name = "wind_from_direction"
+        var.units = "degree"
+        var.coordinates = "lon lat depth time"
diff --git a/analysis/webservice/algorithms/doms/config.py b/analysis/webservice/algorithms/doms/config.py
index 0863a55..ff492e8 100644
--- a/analysis/webservice/algorithms/doms/config.py
+++ b/analysis/webservice/algorithms/doms/config.py
@@ -48,6 +48,12 @@ ENDPOINTS = [
     }
 ]
 
+METADATA_LINKS = {
+    "samos": "http://samos.coaps.fsu.edu/html/nav.php?s=2",
+    "icoads": "https://rda.ucar.edu/datasets/ds548.1/",
+    "spurs": "https://podaac.jpl.nasa.gov/spurs"
+}
+
 import os
 
 try:
@@ -87,6 +93,11 @@ try:
                 "metadataUrl": "http://doms.jpl.nasa.gov/ws/metadata/dataset?shortName=SPURS-2&format=umm-json"
             }
         ]
+        METADATA_LINKS = {
+            "samos": "http://samos.coaps.fsu.edu/html/nav.php?s=2",
+            "icoads": "https://rda.ucar.edu/datasets/ds548.1/",
+            "spurs": "https://podaac.jpl.nasa.gov/spurs"
+        }
 except KeyError:
     pass
 
diff --git a/analysis/webservice/algorithms_spark/HofMoellerSpark.py b/analysis/webservice/algorithms_spark/HofMoellerSpark.py
index 96e9f6a..1696732 100644
--- a/analysis/webservice/algorithms_spark/HofMoellerSpark.py
+++ b/analysis/webservice/algorithms_spark/HofMoellerSpark.py
@@ -190,13 +190,10 @@ class BaseHoffMoellerHandlerImpl(SparkHandler):
                     request.get_start_datetime().strftime(ISO_8601), request.get_end_datetime().strftime(ISO_8601)),
                 code=400)
 
-        spark_master, spark_nexecs, spark_nparts = request.get_spark_cfg()
-
         start_seconds_from_epoch = long((start_time - EPOCH).total_seconds())
         end_seconds_from_epoch = long((end_time - EPOCH).total_seconds())
 
-        return ds, bounding_polygon, start_seconds_from_epoch, end_seconds_from_epoch, \
-               spark_master, spark_nexecs, spark_nparts
+        return ds, bounding_polygon, start_seconds_from_epoch, end_seconds_from_epoch
 
     def applyDeseasonToHofMoellerByField(self, results, pivot="lats", field="mean", append=True):
         shape = (len(results), len(results[0][pivot]))
@@ -336,7 +333,7 @@ class LatitudeTimeHoffMoellerSparkHandlerImpl(BaseHoffMoellerHandlerImpl):
         BaseHoffMoellerHandlerImpl.__init__(self)
 
     def calc(self, compute_options, **args):
-        ds, bbox, start_time, end_time, spark_master, spark_nexecs, spark_nparts = self.parse_arguments(compute_options)
+        ds, bbox, start_time, end_time = self.parse_arguments(compute_options)
 
         min_lon, min_lat, max_lon, max_lat = bbox.bounds
 
@@ -378,7 +375,7 @@ class LongitudeTimeHoffMoellerSparkHandlerImpl(BaseHoffMoellerHandlerImpl):
         BaseHoffMoellerHandlerImpl.__init__(self)
 
     def calc(self, compute_options, **args):
-        ds, bbox, start_time, end_time, spark_master, spark_nexecs, spark_nparts = self.parse_arguments(compute_options)
+        ds, bbox, start_time, end_time = self.parse_arguments(compute_options)
 
         min_lon, min_lat, max_lon, max_lat = bbox.bounds
 
diff --git a/docker/README.md b/docker/README.md
deleted file mode 100644
index b80ece9..0000000
--- a/docker/README.md
+++ /dev/null
@@ -1 +0,0 @@
-# NEXUS Docker
\ No newline at end of file
diff --git a/docker/Readme.rst b/docker/Readme.rst
new file mode 100644
index 0000000..a305620
--- /dev/null
+++ b/docker/Readme.rst
@@ -0,0 +1 @@
+# NEXUS Docker
diff --git a/docker/solr-single-node/README.md b/docker/solr-single-node/README.md
deleted file mode 100644
index 97043a4..0000000
--- a/docker/solr-single-node/README.md
+++ /dev/null
@@ -1,10 +0,0 @@
-
-
-This Docker container runs Apache Solr v7.4 as a single node with nexustiles collection.
-
-The easiest way to run it is:
-
-    export SOLR_HOME=/opt/solr/server/solr/
-    docker run -it --name solr -e SOLR_HOME=${SOLR_HOME}-v /home/nexus/solr/data:${SOLR_HOME}/nexustiles sdap/solr-singlenode:${VERSION}
-
-/home/nexus/solr/data is directory on host machine where index files will be written to.
diff --git a/docker/solr/Dockerfile b/docker/solr/Dockerfile
index caad8bb..e7cd99d 100644
--- a/docker/solr/Dockerfile
+++ b/docker/solr/Dockerfile
@@ -26,9 +26,10 @@ RUN cd / && \
     git clone https://github.com/apache/incubator-sdap-nexus.git && \
     cp -r /incubator-sdap-nexus/data-access/config/schemas/solr/nexustiles /tmp/nexustiles && \
     rm -rf /incubator-sdap-nexus && \
-    wget http://central.maven.org/maven2/org/locationtech/jts/jts-core/1.15.0/jts-core-1.15.0.jar && \
-    cp jts-core-1.15.0.jar /opt/solr/server/solr-webapp/webapp/WEB-INF/lib/jts-core-1.15.0.jar && \
-    chown ${SOLR_USER}:${SOLR_GROUP} /opt/solr/server/solr-webapp/webapp/WEB-INF/lib/jts-core-1.15.0.jar && \
-    rm jts-core-1.15.0.jar
+    wget http://central.maven.org/maven2/org/locationtech/jts/jts-core/1.15.1/jts-core-1.15.1.jar && \
+    cp jts-core-1.15.1.jar /opt/solr/server/solr-webapp/webapp/WEB-INF/lib/jts-core-1.15.1.jar && \
+    chown ${SOLR_USER}:${SOLR_GROUP} /opt/solr/server/solr-webapp/webapp/WEB-INF/lib/jts-core-1.15.1.jar && \
+    rm jts-core-1.15.1.jar
+
 
 USER ${SOLR_USER}
diff --git a/docker/solr/README.md b/docker/solr/README.md
deleted file mode 100644
index e69de29..0000000
diff --git a/docker/solr/Readme.rst b/docker/solr/Readme.rst
new file mode 100644
index 0000000..6ecbe5b
--- /dev/null
+++ b/docker/solr/Readme.rst
@@ -0,0 +1,48 @@
+.. _solr_images:
+
+Solr Images
+=====================
+
+All docker builds for the Solr images should happen from this directory. For copy/paste ability, first export the environment variable ``BUILD_VERSION`` to the version number you would like to tag images as.
+
+Common Environment Variables
+------------------------------
+
+Any environment variable that can be passed to `solr.in.sh <https://github.com/apache/lucene-solr/blob/95d01c6583b825b6b87591e4f27002c285ea25fb/solr/bin/solr.in.sh>`_ can also be passed as an environment variable to the docker container, and it will be utilized. A few options are called out here:
+
+``SOLR_HEAP``
+    *default: 512m*
+
+    Increase Java Heap as needed to support your indexing / query needs
+
+``SOLR_HOME``
+    *default: /opt/solr/server/solr*
+
+    Path to a directory for Solr to store cores and their data. This directory is exposed as a ``VOLUME`` that can be mounted.
+
+If you want to mount the ``SOLR_HOME`` directory to a directory on the host machine, you need to provide the container path to the docker run ``-v`` option. Doing this allows you to retain the index between start/stop of this container.
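+
+For example, a minimal sketch of running the singlenode image described below with a larger heap and a host directory mounted as ``SOLR_HOME`` (the host path ``/home/nexus/solr`` and the ``sdap/solr-singlenode`` tag are only illustrations):
+
+.. code-block:: bash
+
+    docker run -it --name solr \
+        -e SOLR_HEAP=2g \
+        -v /home/nexus/solr:/opt/solr/server/solr \
+        sdap/solr-singlenode:${BUILD_VERSION}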
+
+sdap/solr
+---------
+
+This is the base image used by both singlenode and cloud versions of the Solr image.
+
+How To Build
+^^^^^^^^^^^^
+
+This image can be built by:
+
+.. code-block:: bash
+
+    docker build -t sdap/solr:${BUILD_VERSION} .
+
+How to Run
+^^^^^^^^^^
+
+This image is not intended to be run directly.
+
+.. include:: ../docker/solr/singlenode/Readme.rst
+
+.. include:: ../docker/solr/cloud/Readme.rst
+
+.. include:: ../docker/solr/cloud-init/Readme.rst
diff --git a/docker/solr-single-node/Dockerfile b/docker/solr/cloud-init/Dockerfile
similarity index 64%
copy from docker/solr-single-node/Dockerfile
copy to docker/solr/cloud-init/Dockerfile
index 87d4c9a..1bb7644 100644
--- a/docker/solr-single-node/Dockerfile
+++ b/docker/solr/cloud-init/Dockerfile
@@ -13,17 +13,19 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-ARG tag_version=latest
-FROM sdap/solr:${tag_version}
+
+FROM python:3
 MAINTAINER Apache SDAP "dev@sdap.apache.org"
 
-USER root
+ENV MINIMUM_NODES="1" \
+    SDAP_ZK_SOLR="localhost:2181/solr" \
+    SDAP_SOLR_URL="http://localhost:8983/solr/" \
+    ZK_LOCK_GUID="c4d193b1-7e47-4b32-a169-a596463da0f5" \
+    MAX_RETRIES="30" \
+    CREATE_COLLECTION_PARAMS="name=nexustiles&collection.configName=nexustiles&numShards=1"
 
-COPY create-core.sh /docker-entrypoint-initdb.d/create-core.sh
-RUN echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/cp -r /tmp/nexustiles/* ${SOLR_HOME}/nexustiles/" >> /etc/sudoers && \
-  echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/chown -R ${SOLR_USER}\:${SOLR_GROUP} ${SOLR_HOME}/nexustiles" >> /etc/sudoers
 
-USER ${SOLR_USER}
-VOLUME ${SOLR_HOME}/nexustiles
+RUN pip install kazoo==2.6.0 requests==2.21.0
+COPY ./cloud-init/create-collection.py /tmp/create-collection.py
 
-ENTRYPOINT ["solr-foreground"]
+ENTRYPOINT ["/tmp/create-collection.py"]
diff --git a/docker/solr/cloud-init/Readme.rst b/docker/solr/cloud-init/Readme.rst
new file mode 100644
index 0000000..e8a9548
--- /dev/null
+++ b/docker/solr/cloud-init/Readme.rst
@@ -0,0 +1,73 @@
+.. _solr_cloud_init:
+
+sdap/solr-cloud-init
+--------------------
+
+This image can be used to automatically create the ``nexustiles`` collection in SolrCloud.
+
+How To Build
+^^^^^^^^^^^^
+
+This image can be built from the incubator/sdap/solr directory:
+
+.. code-block:: bash
+
+    docker build -t sdap/solr-cloud-init:${BUILD_VERSION} -f cloud-init/Dockerfile .
+
+How to Run
+^^^^^^^^^^
+
+This image is designed to run in a container alongside the :ref:`solr_cloud` container. The purpose is to detect if there are at least ``MINIMUM_NODES`` live nodes in the cluster. If there are, then detect if the ``nexustiles`` collection exists or not. If it does not, this script will create it using the parameters defined by the ``CREATE_COLLECTION_PARAMS`` environment variable. See the reference documents for the `create <http://lucene.apache.org/solr/guide/7_4/collections-api.html#create>`_ function for the Solr collections API for valid parameters.
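+
+With the default ``CREATE_COLLECTION_PARAMS``, the request this script issues is roughly equivalent to the following Collections API call (the host and port are only illustrative):
+
+.. code-block:: bash
+
+    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=nexustiles&collection.configName=nexustiles&numShards=1"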
+
+.. note::
+
+	The ``action=CREATE`` parameter is already passed for you and should not be part of ``CREATE_COLLECTION_PARAMS``
+
+.. note::
+
+  This image was designed to be long running. It will only exit if there was an error detecting or creating the ``nexustiles`` collection.
+
+
+Environment Variables
+""""""""""""""""""""""""""""""""""""
+
+``MINIMUM_NODES``
+    *default: 1*
+
+    The minimum number of nodes that must be 'live' before the collection is created.
+
+``SDAP_ZK_SOLR``
+    *default: localhost:2181/solr*
+
+    The host:port/chroot of the zookeeper being used by SolrCloud.
+
+``SDAP_SOLR_URL``
+    *default: http://localhost:8983/solr/*
+
+    The URL that should be polled to check whether a SolrCloud node is running. This should be the URL of the :ref:`solr_cloud` container that is started alongside this container.
+
+``ZK_LOCK_GUID``
+    *default: c4d193b1-7e47-4b32-a169-a596463da0f5*
+
+    A GUID used to create a lock in Zookeeper so that if more than one of these init containers is started at the same time, only one will attempt to create the collection. This GUID should be the same across all containers that are trying to create the same collection.
+
+``MAX_RETRIES``
+    *default: 30*
+
+    The number of times we will try to connect to SolrCloud at ``SDAP_SOLR_URL``. This is roughly equivalent to how many seconds we will wait for the node at ``SDAP_SOLR_URL`` to become available. If ``MAX_RETRIES`` is exceeded, the container will exit with an error.
+
+``CREATE_COLLECTION_PARAMS``
+    *default: name=nexustiles&collection.configName=nexustiles&numShards=1*
+
+    The parameters sent to the collection create function. See the reference documents for the `create <http://lucene.apache.org/solr/guide/7_4/collections-api.html#create>`_ function of the Solr collections API for valid parameters.
+
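+    For example, a hypothetical override that creates the collection with two shards and one replica per shard might look like the following (any parameter accepted by the collections ``CREATE`` action can be included):
+
+    .. code-block:: bash
+
+        # Hypothetical value; adjust shard and replica counts to match your cluster
+        CREATE_COLLECTION_PARAMS="name=nexustiles&collection.configName=nexustiles&numShards=2&replicationFactor=1"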
+
+Example Run
+"""""""""""""""
+
+Assuming Zookeeper is running on the host machine on port 2181, and a :ref:`solr_cloud` container is also running with port 8983 mapped to the host machine, the easiest way to run this image is:
+
+.. code-block:: bash
+
+    docker run -it --rm --name init -e SDAP_ZK_SOLR="host.docker.internal:2181/solr" -e SDAP_SOLR_URL="http://host.docker.internal:8983/solr/" sdap/solr-cloud-init:${BUILD_VERSION}
+
+After running this image, the ``nexustiles`` collection should be available on the SolrCloud installation. Check the logs for the container to see details.
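+
+If you want to confirm the result yourself, one way (assuming port 8983 is mapped to the host machine as in the example above) is to ask Solr for its cluster status:
+
+.. code-block:: bash
+
+    # The nexustiles collection should appear under "collections" in the JSON response
+    curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS"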
diff --git a/docker/solr/cloud-init/create-collection.py b/docker/solr/cloud-init/create-collection.py
new file mode 100755
index 0000000..f8f98bc
--- /dev/null
+++ b/docker/solr/cloud-init/create-collection.py
@@ -0,0 +1,111 @@
+#!/usr/local/bin/python -u
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import requests
+import requests.exceptions
+import json
+import json.decoder
+import time
+import sys
+import logging
+from kazoo.client import KazooClient
+
+logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', datefmt="%Y-%m-%dT%H:%M:%S", stream=sys.stdout)
+
+MAX_RETRIES = int(os.environ["MAX_RETRIES"])
+SDAP_ZK_SOLR = os.environ["SDAP_ZK_SOLR"]
+SDAP_SOLR_URL = os.environ["SDAP_SOLR_URL"]
+ZK_LOCK_GUID = os.environ["ZK_LOCK_GUID"]
+MINIMUM_NODES = int(os.environ["MINIMUM_NODES"])
+CREATE_COLLECTION_PARAMS = os.environ["CREATE_COLLECTION_PARAMS"]
+
+def get_cluster_status():
+    try:
+        return requests.get("{}admin/collections?action=CLUSTERSTATUS".format(SDAP_SOLR_URL)).json()
+    except (requests.exceptions.ConnectionError, json.decoder.JSONDecodeError):
+        return False
+
+logging.info("Attempting to acquire lock from {}".format(SDAP_ZK_SOLR))
+zk_host, zk_chroot = SDAP_ZK_SOLR.split('/')
+zk = KazooClient(hosts=zk_host)
+zk.start()
+zk.ensure_path(zk_chroot)
+zk.chroot = zk_chroot
+lock = zk.Lock("/collection-creator", ZK_LOCK_GUID)
+try:
+    with lock:  # blocks waiting for lock acquisition
+        logging.info("Lock acquired. Checking for SolrCloud at {}".format(SDAP_SOLR_URL))
+        # Wait for MAX_RETRIES for the entire Solr cluster to be available.
+        attempts = 0
+        status = None
+        collection_exists = False
+        while attempts <= MAX_RETRIES:
+            status = get_cluster_status()
+            if not status:
+                # If we can't get the cluster status, my Solr node is not running
+                attempts += 1
+                logging.info("Waiting for Solr at {}".format(SDAP_SOLR_URL))
+                time.sleep(1)
+                continue
+            else:
+                # If we can get the cluster status, at least my Solr node is running
+                # We can check if the collection exists already now
+                if 'collections' in status['cluster'] and 'nexustiles' in status['cluster']['collections']:
+                    # Collection already exists. Break out of the while loop
+                    collection_exists = True
+                    logging.info("nexustiles collection already exists.")
+                    break
+                else:
+                    # Collection does not exist, but need to make sure number of expected nodes are running
+                    live_nodes = status['cluster']['live_nodes']
+                    if len(live_nodes) < MINIMUM_NODES:
+                        # Not enough live nodes
+                        logging.info("Found {} live node(s). Expected at least {}. Live nodes: {}".format(len(live_nodes), MINIMUM_NODES, live_nodes))
+                        attempts += 1
+                        time.sleep(1)
+                        continue
+                    else:
+                        # We now have a full cluster, ready to create collection.
+                        logging.info("Detected full cluster of at least {} nodes. Checking for nexustiles collection".format(MINIMUM_NODES))
+                        break
+
+        # Make sure we didn't exhaust our retries
+        if attempts > MAX_RETRIES:
+            raise RuntimeError("Exceeded {} retries while waiting for at least {} nodes to become live for {}".format(MAX_RETRIES, MINIMUM_NODES, SDAP_SOLR_URL))
+
+        # Full cluster, did not exceed retries. Check if collection already exists
+        if not collection_exists:
+            # Collection does not exist, create it.
+            create_command = "{}admin/collections?action=CREATE&{}".format(SDAP_SOLR_URL, CREATE_COLLECTION_PARAMS)
+            logging.info("Creating collection with command {}".format(create_command))
+            create_response = requests.get(create_command).json()
+            if 'failure' not in create_response:
+                # Collection created, we're done.
+                logging.info("Collection created. {}".format(create_response))
+                pass
+            else:
+                # Some error occurred while creating the collection
+                raise RuntimeError("Could not create collection. Received response: {}".format(create_response))
+finally:
+    zk.stop()
+    zk.close()
+
+# We're done, do nothing forever.
+logging.info("Done.")
+while True:
+    time.sleep(987654321)
diff --git a/docker/solr-single-node/Dockerfile b/docker/solr/cloud/Dockerfile
similarity index 57%
copy from docker/solr-single-node/Dockerfile
copy to docker/solr/cloud/Dockerfile
index 87d4c9a..79dfdd1 100644
--- a/docker/solr-single-node/Dockerfile
+++ b/docker/solr/cloud/Dockerfile
@@ -17,13 +17,15 @@ ARG tag_version=latest
 FROM sdap/solr:${tag_version}
 MAINTAINER Apache SDAP "dev@sdap.apache.org"
 
-USER root
+ENV SDAP_ZK_SERVICE_HOST="localhost" \
+    SDAP_ZK_SERVICE_PORT="2181" \
+    SDAP_ZK_SOLR_CHROOT="solr" \
+    SOLR_HOST="localhost"
 
-COPY create-core.sh /docker-entrypoint-initdb.d/create-core.sh
-RUN echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/cp -r /tmp/nexustiles/* ${SOLR_HOME}/nexustiles/" >> /etc/sudoers && \
-  echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/chown -R ${SOLR_USER}\:${SOLR_GROUP} ${SOLR_HOME}/nexustiles" >> /etc/sudoers
+COPY ./cloud/docker-entrypoint-initdb.d/* /docker-entrypoint-initdb.d/
+COPY ./cloud/tmp/* /tmp/
 
-USER ${SOLR_USER}
-VOLUME ${SOLR_HOME}/nexustiles
-
-ENTRYPOINT ["solr-foreground"]
+# This will run docker-entrypoint.sh with the value of CMD as default arguments. However, if any arguments are supplied
+# to the docker run command when launching this image, the command line arguments will override these CMD arguments
+ENTRYPOINT ["/bin/bash", "-c", "docker-entrypoint.sh $(eval echo $@)", "$@"]
+CMD ["solr-foreground", "-c", "-z ${SDAP_ZK_SERVICE_HOST}:${SDAP_ZK_SERVICE_PORT}/${SDAP_ZK_SOLR_CHROOT}"]
diff --git a/docker/solr/cloud/Readme.rst b/docker/solr/cloud/Readme.rst
new file mode 100644
index 0000000..ae71a2e
--- /dev/null
+++ b/docker/solr/cloud/Readme.rst
@@ -0,0 +1,93 @@
+.. _solr_cloud:
+
+sdap/solr-cloud
+--------------------
+
+This image runs SolrCloud.
+
+How To Build
+^^^^^^^^^^^^
+
+This image can be built from the ``docker/solr`` directory of incubator-sdap-nexus:
+
+.. code-block:: bash
+
+    docker build -t sdap/solr-cloud:${BUILD_VERSION} -f cloud/Dockerfile --build-arg tag_version=${BUILD_VERSION} .
+
+How to Run
+^^^^^^^^^^
+
+This Docker container runs Apache Solr v7.4 in cloud mode with the nexustiles collection. It requires a running Zookeeper service. It will automatically bootstrap Zookeeper by uploading configuration and core properties to Zookeeper when it starts.
+
+You need to decide whether you want data to persist when the container is stopped or whether the data should be discarded.
+
+.. note::
+
+  The example ``docker run`` commands below use ``host.docker.internal`` several times. This is a special DNS name that is known to work on Docker for Mac for `connecting from a container to a service on the host <https://docs.docker.com/docker-for-mac/networking/#i-want-to-connect-from-a-container-to-a-service-on-the-host>`_. If you are not launching the container with Docker for Mac, there is no guarantee that this DNS name will be resolvable inside the container; substitute a hostname or IP address that your containers can use to reach the relevant service.
+
+Cloud Specific Environment Variables
+""""""""""""""""""""""""""""""""""""
+
+``SDAP_ZK_SERVICE_HOST``
+    *default: localhost*
+
+    This is the hostname of the Zookeeper service that Solr should use to connect.
+
+``SDAP_ZK_SERVICE_PORT``
+    *default: 2181*
+
+    The port Solr should use when connecting to Zookeeper.
+
+``SDAP_ZK_SOLR_CHROOT``
+    *default: solr*
+
+    The Zookeeper chroot under which Solr configuration will be accessed.
+
+``SOLR_HOST``
+    *default: localhost*
+
+    The hostname of the Solr instance that will be recorded in Zookeeper.
+
+Zookeeper
+""""""""""""
+
+Zookeeper can be running on the host machine or anywhere that Docker can access (e.g. a bridge network). Take note of the host where Zookeeper is running and use that value for the ``SDAP_ZK_SERVICE_HOST`` environment variable.
+
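+If you do not already have a Zookeeper service available, one way to start one for local testing is to run the official Zookeeper image and publish its client port to the host (the image name and tag below are only an example, not a requirement of this project):
+
+.. code-block:: bash
+
+    docker run --name zookeeper -p 2181:2181 -d zookeeper:3.4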
+
+Persist Data
+""""""""""""""""
+
+To persist the data, we need to provide a volume mount from the host machine to the container path where the collection data is stored. By default, collection data is stored in the location indicated by the ``$SOLR_HOME`` environment variable. If you do not provide a custom ``SOLR_HOME`` location, the default is ``/opt/solr/server/solr``.
+
+Assuming Zookeeper is running on the host machine on port 2181, the easiest way to run this image and persist data to a location on the host machine is:
+
+.. code-block:: bash
+
+    docker run --name solr -v ${PWD}/solrhome:/opt/solr/server/solr -p 8983:8983 -d -e SDAP_ZK_SERVICE_HOST="host.docker.internal" -e SOLR_HOST="host.docker.internal" sdap/solr-cloud:${VERSION}
+
+``${PWD}/solrhome`` is the directory on the host machine where ``SOLR_HOME`` will be created if it does not already exist.
+
+Don't Persist Data
+""""""""""""""""""
+
+If you do not need to persist data between runs of this image, simply run the image without a volume mount.
+
+Assuming Zookeeper is running on the host machine on port 2181, the easiest way to run this image without persisting data is:
+
+.. code-block:: bash
+
+    docker run --name solr -p 8983:8983 -d -e SDAP_ZK_SERVICE_HOST="host.docker.internal" -e SOLR_HOST="host.docker.internal" sdap/solr-cloud:${VERSION}
+
+When the container is removed, the data will be lost.
+
+Collection Initialization
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Solr Collections must be created after at least one SolrCloud node is live. When a collection is created, by default Solr will attempt to spread the shards across all of the live nodes at the time of creation. This poses two problems:
+
+1) The nexustiles collection cannot be created during a "bootstrapping" process in this image.
+2) The nexustiles collection should not be created until an appropriate number of nodes are live.
+
+A helper container has been created to deal with these issues. See :ref:`solr_cloud_init` for more details.
+
+The other option is to create the collection manually after starting as many SolrCloud nodes as desired. This can be done through the Solr Admin UI or by using the `admin collections API <http://lucene.apache.org/solr/guide/7_4/collections-api.html#collections-api>`_.
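+
+As a sketch of the manual approach, assuming a SolrCloud node is reachable on the host machine at port 8983 and the ``nexustiles`` configuration has already been uploaded to Zookeeper (the bootstrap scripts in this image do that on startup), the collection could be created with a request such as:
+
+.. code-block:: bash
+
+    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=nexustiles&collection.configName=nexustiles&numShards=1"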
diff --git a/docker/solr-single-node/create-core.sh b/docker/solr/cloud/docker-entrypoint-initdb.d/0-init-home.sh
similarity index 78%
copy from docker/solr-single-node/create-core.sh
copy to docker/solr/cloud/docker-entrypoint-initdb.d/0-init-home.sh
index 1520b6a..149c660 100755
--- a/docker/solr-single-node/create-core.sh
+++ b/docker/solr/cloud/docker-entrypoint-initdb.d/0-init-home.sh
@@ -1,4 +1,4 @@
-#!/bin/bash -ex
+#!/bin/bash
 
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
@@ -17,9 +17,10 @@
 
 set -ex
 
-SOLR_HOME=${SOLR_HOME:=/opt/solr/server/solr/}
-mkdir -p ${SOLR_HOME}/nexustiles
-sudo cp -r /tmp/nexustiles/* ${SOLR_HOME}/nexustiles/
-sudo chown -R ${SOLR_USER}:${SOLR_GROUP} ${SOLR_HOME}/nexustiles
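+# Seed SOLR_HOME with the cloud-mode solr.xml and zoo.cfg baked into the image,
+# but only if they are not already present (e.g. on a persistent volume)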
+if [ ! -f ${SOLR_HOME}/solr.xml ]; then
+    cp /tmp/solr.xml ${SOLR_HOME}
+fi
 
-set +x
+if [ ! -f ${SOLR_HOME}/zoo.cfg ]; then
+    cp /tmp/zoo.cfg ${SOLR_HOME}
+fi
diff --git a/docker/solr-single-node/create-core.sh b/docker/solr/cloud/docker-entrypoint-initdb.d/1-bootstrap-zk.sh
similarity index 77%
copy from docker/solr-single-node/create-core.sh
copy to docker/solr/cloud/docker-entrypoint-initdb.d/1-bootstrap-zk.sh
index 1520b6a..cbabbda 100755
--- a/docker/solr-single-node/create-core.sh
+++ b/docker/solr/cloud/docker-entrypoint-initdb.d/1-bootstrap-zk.sh
@@ -1,4 +1,4 @@
-#!/bin/bash -ex
+#!/bin/bash
 
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
@@ -17,9 +17,7 @@
 
 set -ex
 
-SOLR_HOME=${SOLR_HOME:=/opt/solr/server/solr/}
-mkdir -p ${SOLR_HOME}/nexustiles
-sudo cp -r /tmp/nexustiles/* ${SOLR_HOME}/nexustiles/
-sudo chown -R ${SOLR_USER}:${SOLR_GROUP} ${SOLR_HOME}/nexustiles
+ZK_HOST="${SDAP_ZK_SERVICE_HOST}:${SDAP_ZK_SERVICE_PORT}/${SDAP_ZK_SOLR_CHROOT}"
 
-set +x
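+# Upload the nexustiles configset and the shared solr.xml to Zookeeper so that
+# SolrCloud nodes and collection creation can find them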
+./bin/solr zk upconfig -z ${ZK_HOST} -n nexustiles -d /tmp/nexustiles
+./bin/solr zk cp -z ${ZK_HOST} ${SOLR_HOME}/solr.xml zk:/solr.xml
diff --git a/docker/solr/cloud/tmp/solr.xml b/docker/solr/cloud/tmp/solr.xml
new file mode 100644
index 0000000..4a79fe2
--- /dev/null
+++ b/docker/solr/cloud/tmp/solr.xml
@@ -0,0 +1,53 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!--
+   This is an example of a simple "solr.xml" file for configuring one or
+   more Solr Cores, as well as allowing Cores to be added, removed, and
+   reloaded via HTTP requests.
+
+   More information about options available in this configuration file,
+   and Solr Core administration can be found online:
+   http://wiki.apache.org/solr/CoreAdmin
+-->
+
+<solr>
+
+  <solrcloud>
+
+    <str name="host">${host:}</str>
+    <int name="hostPort">${jetty.port:8983}</int>
+    <str name="hostContext">${hostContext:solr}</str>
+
+    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
+
+    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
+    <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
+    <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
+    <str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
+    <str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>
+
+  </solrcloud>
+
+  <shardHandlerFactory name="shardHandlerFactory"
+    class="HttpShardHandlerFactory">
+    <int name="socketTimeout">${socketTimeout:600000}</int>
+    <int name="connTimeout">${connTimeout:60000}</int>
+  </shardHandlerFactory>
+
+</solr>
diff --git a/docker/solr/cloud/tmp/zoo.cfg b/docker/solr/cloud/tmp/zoo.cfg
new file mode 100644
index 0000000..7e42d8c
--- /dev/null
+++ b/docker/solr/cloud/tmp/zoo.cfg
@@ -0,0 +1,31 @@
+# The number of milliseconds of each tick
+tickTime=2000
+# The number of ticks that the initial
+# synchronization phase can take
+initLimit=10
+# The number of ticks that can pass between
+# sending a request and getting an acknowledgement
+syncLimit=5
+
+# the directory where the snapshot is stored.
+# dataDir=/opt/zookeeper/data
+# NOTE: Solr defaults the dataDir to <solrHome>/zoo_data
+
+# the port at which the clients will connect
+# clientPort=2181
+# NOTE: Solr sets this based on zkRun / zkHost params
+
+# the maximum number of client connections.
+# increase this if you need to handle more clients
+#maxClientCnxns=60
+#
+# Be sure to read the maintenance section of the
+# administrator guide before turning on autopurge.
+#
+# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
+#
+# The number of snapshots to retain in dataDir
+#autopurge.snapRetainCount=3
+# Purge task interval in hours
+# Set to "0" to disable auto purge feature
+#autopurge.purgeInterval=1
diff --git a/docker/solr-single-node/Dockerfile b/docker/solr/singlenode/Dockerfile
similarity index 79%
rename from docker/solr-single-node/Dockerfile
rename to docker/solr/singlenode/Dockerfile
index 87d4c9a..10021e0 100644
--- a/docker/solr-single-node/Dockerfile
+++ b/docker/solr/singlenode/Dockerfile
@@ -1,4 +1,3 @@
-
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -19,11 +18,13 @@ MAINTAINER Apache SDAP "dev@sdap.apache.org"
 
 USER root
 
-COPY create-core.sh /docker-entrypoint-initdb.d/create-core.sh
 RUN echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/cp -r /tmp/nexustiles/* ${SOLR_HOME}/nexustiles/" >> /etc/sudoers && \
-  echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/chown -R ${SOLR_USER}\:${SOLR_GROUP} ${SOLR_HOME}/nexustiles" >> /etc/sudoers
+    echo "${SOLR_USER} ALL=(ALL) NOPASSWD: /usr/bin/chown -R ${SOLR_USER}\:${SOLR_GROUP} ${SOLR_HOME}/nexustiles" >> /etc/sudoers
+
+COPY ./singlenode/create-core.sh /docker-entrypoint-initdb.d/0-create-core.sh
 
 USER ${SOLR_USER}
 VOLUME ${SOLR_HOME}/nexustiles
 
-ENTRYPOINT ["solr-foreground"]
+ENTRYPOINT ["docker-entrypoint.sh"]
+CMD ["solr-foreground"]
diff --git a/docker/solr/singlenode/Readme.rst b/docker/solr/singlenode/Readme.rst
new file mode 100644
index 0000000..0f814f2
--- /dev/null
+++ b/docker/solr/singlenode/Readme.rst
@@ -0,0 +1,42 @@
+.. _solr_singlenode:
+
+sdap/solr-singlenode
+--------------------
+
+This is the singlenode version of Solr.
+
+How To Build
+^^^^^^^^^^^^
+
+This image can be built from the ``docker/solr`` directory of incubator-sdap-nexus:
+
+.. code-block:: bash
+
+    docker build -t sdap/solr-singlenode:${BUILD_VERSION} -f singlenode/Dockerfile --build-arg tag_version=${BUILD_VERSION} .
+
+How to Run
+^^^^^^^^^^
+
+This Docker container runs Apache Solr v7.4 as a single node with the nexustiles collection. The main decision when running this image is whether you want data to persist when the container is stopped or whether the data should be discarded.
+
+Persist Data
+""""""""""""
+
+To persist the data in the ``nexustiles`` collection, we need to provide a volume mount from the host machine to the container path where the collection data is stored. By default, collection data is stored in the location indicated by the ``$SOLR_HOME`` environment variable. If you do not provide a custom ``SOLR_HOME`` location, the default is ``/opt/solr/server/solr``. Therefore, the easiest way to run this image and persist data to a location on the host machine is:
+
+.. code-block:: bash
+
+    docker run --name solr -v ${PWD}/solrhome/nexustiles:/opt/solr/server/solr/nexustiles -p 8983:8983 -d sdap/solr-singlenode:${BUILD_VERSION}
+
+``${PWD}/solrhome/nexustiles`` is the directory on the host machine where the ``nexustiles`` collection will be created if it does not already exist. If you have run this container before and ``${PWD}/solrhome/nexustiles`` already contains files, those files will *not* be overwritten. In this way, it is possible to retain data on the host machine between runs of this docker image.
+
+Don't Persist Data
+""""""""""""""""""
+
+If you do not need to persist data between runs of this image, simply run the image without a volume mount.
+
+.. code-block:: bash
+
+    docker run --name solr -p 8983:8983 -d sdap/solr-singlenode:${BUILD_VERSION}
+
+When the container is removed, the data will be lost.
diff --git a/docker/solr-single-node/create-core.sh b/docker/solr/singlenode/create-core.sh
similarity index 98%
rename from docker/solr-single-node/create-core.sh
rename to docker/solr/singlenode/create-core.sh
index 1520b6a..a2f6e38 100755
--- a/docker/solr-single-node/create-core.sh
+++ b/docker/solr/singlenode/create-core.sh
@@ -1,4 +1,4 @@
-#!/bin/bash -ex
+#!/bin/bash
 
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
diff --git a/docs/conf.py b/docs/conf.py
index 8cca529..d6ac58c 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -16,7 +16,6 @@
 # import sys
 # sys.path.insert(0, os.path.abspath('.'))
 
-
 # -- Project information -----------------------------------------------------
 
 project = u'incubator-sdap-nexus'
@@ -24,9 +23,9 @@ copyright = u'2018, Apache SDAP'
 author = u'Apache SDAP'
 
 # The short X.Y version
-version = u''
+version = u'1.0'
 # The full version, including alpha/beta/rc tags
-release = u''
+release = u'1.0.0-SNAPSHOT'
 
 
 # -- General configuration ---------------------------------------------------
@@ -52,7 +51,7 @@ templates_path = ['_templates']
 # You can specify multiple suffix as a list of string:
 #
 # source_suffix = ['.rst', '.md']
-source_suffix = '.rst'
+source_suffix = ['.rst']
 
 # The master toctree document.
 master_doc = 'index'
diff --git a/docs/dockerimages.rst b/docs/dockerimages.rst
new file mode 100644
index 0000000..558f558
--- /dev/null
+++ b/docs/dockerimages.rst
@@ -0,0 +1,10 @@
+.. _dockerimages:
+
+*************
+Docker Images
+*************
+
+incubator-sdap-nexus provides a number of Docker images for download. All images are available from the `SDAP organization <https://hub.docker.com/u/sdap>`_ on DockerHub.
+
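+For example, a specific image and version can be pulled directly (the image names match the build commands in the sections below; the tag shown is a placeholder for the release you want to use):
+
+.. code-block:: bash
+
+    docker pull sdap/solr-singlenode:${VERSION}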
+
+.. include:: ../docker/solr/Readme.rst
diff --git a/docs/index.rst b/docs/index.rst
index 5666094..0db88d4 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,6 +1,10 @@
 Welcome to incubator-sdap-nexus's documentation!
 ================================================
 
+.. warning::
+
+  Apache incubator-sdap-nexus is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
@@ -8,6 +12,7 @@ Welcome to incubator-sdap-nexus's documentation!
    intro
    quickstart
    ningester
+   dockerimages
 
 
 Check out the :ref:`quickstart`.
diff --git a/docs/intro.rst b/docs/intro.rst
index 6914726..5ebe701 100644
--- a/docs/intro.rst
+++ b/docs/intro.rst
@@ -1,5 +1,8 @@
 .. _intro:
 
+.. warning::
+
+  Apache incubator-sdap-nexus is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
 
 *******************
 About NEXUS