Posted to commits@sdap.apache.org by ea...@apache.org on 2020/09/15 16:46:28 UTC

[incubator-sdap-nexus] branch bug_fixes updated (24ff296 -> 48da694)

This is an automated email from the ASF dual-hosted git repository.

eamonford pushed a change to branch bug_fixes
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git.


 discard 24ff296  use new nexusjpl/solr image, update solr-create-collection image tag
 discard 2feeebd  upgrade images
 discard dbe2bf3  revert doms
 discard ead83ba  updated helm chart for zookeeper
     add b880c63  SDAP-285: Upgrade custom Solr image to include JTS, and update solr-create-collection image to create geo field  (#108)
     new bb31585  updated helm chart for zookeeper
     new d562fa8  revert doms
     new 2679e64  upgrade images
     new 48da694  use new nexusjpl/solr image, update solr-create-collection image tag

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (24ff296)
            \
             N -- N -- N   refs/heads/bug_fixes (48da694)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
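
The four labels used above (discard, omit, add, new) can be read as set differences between the old and new branch histories. The following is only a toy sketch of that reading, not the mailer's actual implementation: the function name `classify_revisions`, the placeholder `"B"` for the common base, and the assumption that b880c63 was already reachable from another reference are all illustrative. It also simplifies by using a single `elsewhere` set for both "already present before the push" and "still referenced after the push".

```python
def classify_revisions(old_history, new_history, elsewhere):
    """Toy model: label revisions after a force push.

    old_history -- commits reachable from the old branch tip
    new_history -- commits reachable from the new branch tip
    elsewhere   -- commits reachable from other references (assumed input)
    """
    removed = old_history - new_history   # dropped from this branch
    gained = new_history - old_history    # newly reachable from this branch
    return {
        "discard": removed - elsewhere,   # no other reference keeps them
        "omit": removed & elsewhere,      # still referenced elsewhere
        "new": gained - elsewhere,        # never seen in the repository
        "add": gained & elsewhere,        # already present, now on this branch
    }


# Abbreviated hashes from the listing above; "B" is a hypothetical label
# standing in for the common base and everything before it.
old_history = {"B", "ead83ba", "dbe2bf3", "2feeebd", "24ff296"}
new_history = {"B", "b880c63", "bb31585", "d562fa8", "2679e64", "48da694"}
elsewhere = {"B", "b880c63"}  # assumption: b880c63 is on another reference

report = classify_revisions(old_history, new_history, elsewhere)
for label in ("discard", "omit", "add", "new"):
    print(label, sorted(report[label]))
```

Under these assumptions the sketch reproduces the listing: four discarded revisions, none omitted, one added, and four new.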


Summary of changes:
 docker/.gitignore                                  |  1 -
 docker/cassandra/Dockerfile                        | 31 --------
 docker/cassandra/README.md                         |  0
 docker/cassandra/docker-entrypoint.sh              | 85 --------------------
 docker/solr/Dockerfile                             | 23 ++----
 docker/solr/cloud-init/create-collection.py        | 35 ++++++++
 docker/solr/cloud/Dockerfile                       | 31 --------
 docker/solr/cloud/Readme.rst                       | 93 ----------------------
 .../docker-entrypoint-initdb.d/0-init-home.sh      | 26 ------
 .../docker-entrypoint-initdb.d/1-bootstrap-zk.sh   | 23 ------
 docker/solr/cloud/tmp/solr.xml                     | 53 ------------
 docker/solr/cloud/tmp/zoo.cfg                      | 31 --------
 docker/solr/singlenode/Dockerfile                  | 30 -------
 docker/solr/singlenode/Readme.rst                  | 42 ----------
 docker/solr/singlenode/create-core.sh              | 25 ------
 15 files changed, 40 insertions(+), 489 deletions(-)
 delete mode 100644 docker/.gitignore
 delete mode 100644 docker/cassandra/Dockerfile
 delete mode 100644 docker/cassandra/README.md
 delete mode 100755 docker/cassandra/docker-entrypoint.sh
 delete mode 100644 docker/solr/cloud/Dockerfile
 delete mode 100644 docker/solr/cloud/Readme.rst
 delete mode 100755 docker/solr/cloud/docker-entrypoint-initdb.d/0-init-home.sh
 delete mode 100755 docker/solr/cloud/docker-entrypoint-initdb.d/1-bootstrap-zk.sh
 delete mode 100644 docker/solr/cloud/tmp/solr.xml
 delete mode 100644 docker/solr/cloud/tmp/zoo.cfg
 delete mode 100644 docker/solr/singlenode/Dockerfile
 delete mode 100644 docker/solr/singlenode/Readme.rst
 delete mode 100755 docker/solr/singlenode/create-core.sh


[incubator-sdap-nexus] 01/04: updated helm chart for zookeeper

Posted by ea...@apache.org.

eamonford pushed a commit to branch bug_fixes
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git

commit bb3158553c19a911631a8d7bbac711ac51c6dc34
Author: Eamon Ford <ea...@jpl.nasa.gov>
AuthorDate: Thu Jul 16 18:54:07 2020 -0700

    updated helm chart for zookeeper
    
    use solr and zk helm charts
    
    change .Release.Namespace to .Release.Name
    
    add rabbitmq storageclass
    
    fix rbac
    
    add  max_concurrency
    
    add solr_host arg
    
    add solr port
    
    always deploy solr
    
    add solr-host option
    
    read cli args for cass and solr hosts
    
    pass cassandra host
    
    add support for cassandra username and password
    
    cassandra helm chart included
    
    fix arguments sent to spark driver, add logging in cassandraproxy
    
    pass factory method to nexuscalchandlers to create tile service in spark nodes
    
    fix namespace
    
    fix bad argument order
    
    fix cass url for granule ingester
    
    change solr-create-collection to a deployment
    
    make solr history default
    
    pr
    
    enable external solr/zk/cass hosts
    
    rabbitmq.enabled
    
    revert doms
    
    revert
    
    update images
    
    only deploy config operator if it is enabled
    
    remove http:// from solr hardcoded endpoint
    
    turn off configmap by default
---
 .gitignore                                       |   1 -
 analysis/setup.py                                |   3 +-
 analysis/webservice/algorithms_spark/__init__.py |   6 -
 analysis/webservice/config/web.ini               |   2 +-
 data-access/nexustiles/dao/CassandraProxy.py     |   3 +
 data-access/tests/config/datastores.ini          |   9 --
 helm/requirements.yaml                           |  11 +-
 helm/templates/_helpers.tpl                      |   9 +-
 helm/templates/cassandra.yml                     | 107 -----------------
 helm/templates/collection-manager.yml            |  10 +-
 helm/templates/config-operator-rbac.yml          |   4 +-
 helm/templates/config-operator.yml               |   3 +-
 helm/templates/granule-ingester.yml              |  15 ++-
 helm/templates/history-pvc.yml                   |   2 +
 helm/templates/init-cassandra-configmap.yml      |  13 ++
 helm/templates/solr-create-collection.yml        |  34 ++++++
 helm/templates/solr.yml                          | 129 --------------------
 helm/templates/webapp.yml                        |   7 +-
 helm/templates/zookeeper.yml                     | 144 -----------------------
 helm/values.yaml                                 |  86 +++++++++-----
 tools/doms/README.md                             |  66 -----------
 tools/doms/doms_reader.py                        | 144 -----------------------
 22 files changed, 154 insertions(+), 654 deletions(-)

diff --git a/.gitignore b/.gitignore
index 3e29626..4e4cf6e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,6 +2,5 @@
 *.code-workspace
 *.idea
 *.DS_Store
-analysis/webservice/algorithms/doms/domsconfig.ini
 data-access/nexustiles/config/datastores.ini
 venv/
diff --git a/analysis/setup.py b/analysis/setup.py
index 62a6891..9a449ce 100644
--- a/analysis/setup.py
+++ b/analysis/setup.py
@@ -50,8 +50,7 @@ setuptools.setup(
     #    'webservice.nexus_tornado.request.renderers'
     #],
     package_data={
-        'webservice': ['config/web.ini', 'config/algorithms.ini'],
-        'webservice.algorithms.doms': ['domsconfig.ini.default']
+        'webservice': ['config/web.ini', 'config/algorithms.ini']
     },
     data_files=[
         ('static', ['static/index.html'])
diff --git a/analysis/webservice/algorithms_spark/__init__.py b/analysis/webservice/algorithms_spark/__init__.py
index d6ed83f..a25c8d5 100644
--- a/analysis/webservice/algorithms_spark/__init__.py
+++ b/analysis/webservice/algorithms_spark/__init__.py
@@ -20,7 +20,6 @@ import ClimMapSpark
 import CorrMapSpark
 import DailyDifferenceAverageSpark
 import HofMoellerSpark
-import Matchup
 import MaximaMinimaSpark
 import NexusCalcSparkHandler
 import TimeAvgMapSpark
@@ -47,11 +46,6 @@ if module_exists("pyspark"):
         pass
 
     try:
-        import Matchup
-    except ImportError:
-        pass
-
-    try:
         import TimeAvgMapSpark
     except ImportError:
         pass
diff --git a/analysis/webservice/config/web.ini b/analysis/webservice/config/web.ini
index 2644ade..a1ecb2c 100644
--- a/analysis/webservice/config/web.ini
+++ b/analysis/webservice/config/web.ini
@@ -14,4 +14,4 @@ static_enabled=true
 static_dir=static
 
 [modules]
-module_dirs=webservice.algorithms,webservice.algorithms_spark,webservice.algorithms.doms
\ No newline at end of file
+module_dirs=webservice.algorithms,webservice.algorithms_spark
\ No newline at end of file
diff --git a/data-access/nexustiles/dao/CassandraProxy.py b/data-access/nexustiles/dao/CassandraProxy.py
index a8a4e6e..54a849b 100644
--- a/data-access/nexustiles/dao/CassandraProxy.py
+++ b/data-access/nexustiles/dao/CassandraProxy.py
@@ -161,6 +161,9 @@ class CassandraProxy(object):
         self.__cass_protocol_version = config.getint("cassandra", "protocol_version")
         self.__cass_dc_policy = config.get("cassandra", "dc_policy")
 
+        logger.info("Setting cassandra host to " + self.__cass_url)
+        logger.info("Setting cassandra username to " + self.__cass_username)
+
         try:
             self.__cass_port = config.getint("cassandra", "port")
         except NoOptionError:
diff --git a/data-access/tests/config/datastores.ini b/data-access/tests/config/datastores.ini
deleted file mode 100644
index 194760c..0000000
--- a/data-access/tests/config/datastores.ini
+++ /dev/null
@@ -1,9 +0,0 @@
-[cassandra]
-host=127.0.0.1
-keyspace=nexustiles
-local_datacenter=datacenter1
-protocol_version=3
-
-[solr]
-host=localhost:8983
-core=nexustiles
\ No newline at end of file
diff --git a/helm/requirements.yaml b/helm/requirements.yaml
index 7970f29..78cc52e 100644
--- a/helm/requirements.yaml
+++ b/helm/requirements.yaml
@@ -6,6 +6,13 @@ dependencies:
   - name: rabbitmq
     version: 7.1.0
     repository: https://charts.bitnami.com/bitnami
-    condition: ingestion.enabled
-  
+    condition: rabbitmq.enabled
+  - name: solr
+    version: 1.5.2
+    repository: http://storage.googleapis.com/kubernetes-charts-incubator
+    condition: solr.enabled
+  - name: cassandra
+    version: 5.5.3
+    repository: https://charts.bitnami.com/bitnami
+    condition: cassandra.enabled
 
diff --git a/helm/templates/_helpers.tpl b/helm/templates/_helpers.tpl
index b697c17..5944f33 100644
--- a/helm/templates/_helpers.tpl
+++ b/helm/templates/_helpers.tpl
@@ -4,7 +4,7 @@
 Name of the generated configmap containing the contents of the collections config file.
 */}}
 {{- define "nexus.collectionsConfig.configmapName" -}}
-collections-config
+{{ .Values.ingestion.collections.configMap | default "collections-config" }}
 {{- end -}}
 
 {{/*
@@ -45,3 +45,10 @@ The data volume mount which is used in both the Collection Manager and the Granu
   mountPath: {{ .Values.ingestion.granules.mountPath }}
 {{- end -}}
 
+{{- define "nexus.urls.solr" -}}
+{{ .Values.external.solrHostAndPort | default (print "http://" .Release.Name "-solr-svc:8983") }}
+{{- end -}}
+
+{{- define "nexus.urls.zookeeper" -}}
+{{ .Values.external.zookeeperHostAndPort | default (print .Release.Name "-zookeeper:2181") }}
+{{- end -}}
\ No newline at end of file
diff --git a/helm/templates/cassandra.yml b/helm/templates/cassandra.yml
deleted file mode 100644
index 6023e55..0000000
--- a/helm/templates/cassandra.yml
+++ /dev/null
@@ -1,107 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
-  name: sdap-cassandra
-spec:
-  clusterIP: None
-  ports:
-  - name: cql
-    port: 9042
-    targetPort: cql
-  selector:
-    app: sdap-cassandra
-
----
-
-apiVersion: apps/v1
-kind: StatefulSet
-metadata:
-  name: cassandra-set
-spec:
-  serviceName: sdap-cassandra
-  replicas: {{ .Values.cassandra.replicas }}
-  selector:
-    matchLabels:
-      app: sdap-cassandra
-  template:
-    metadata:
-      labels:
-        app: sdap-cassandra
-    spec:
-      terminationGracePeriodSeconds: 120
-      {{ if .Values.cassandra.tolerations }}
-      tolerations:
-{{ .Values.cassandra.tolerations | toYaml | indent 6 }}
-      {{ end }}
-      {{ if .Values.cassandra.nodeSelector }}
-      nodeSelector:
-{{ .Values.cassandra.nodeSelector | toYaml | indent 8 }}
-      {{ end }}
-      affinity:
-        podAntiAffinity:
-          # Prefer spreading over all hosts
-          preferredDuringSchedulingIgnoredDuringExecution:
-          - weight: 100
-            podAffinityTerm:
-              labelSelector:
-                  matchExpressions:
-                    - key: "app"
-                      operator: In
-                      values:
-                      - sdap-cassandra
-              topologyKey: "kubernetes.io/hostname"
-      containers:
-      - name: cassandra
-        image: nexusjpl/cassandra:1.0.0-rc1
-        imagePullPolicy: Always
-        ports:
-        - containerPort: 7000
-          name: intra-node
-        - containerPort: 7001
-          name: tls-intra-node
-        - containerPort: 7199
-          name: jmx
-        - containerPort: 9042
-          name: cql
-        resources:
-          requests:
-            cpu: {{ .Values.cassandra.requests.cpu }}
-            memory: {{ .Values.cassandra.requests.memory }}
-          limits:
-            cpu: {{ .Values.cassandra.limits.cpu }}
-            memory: {{ .Values.cassandra.limits.memory }}
-        securityContext:
-          capabilities:
-            add:
-              - IPC_LOCK
-        lifecycle:
-          preStop:
-            exec:
-              command:
-              - /bin/sh
-              - -c
-              - nodetool drain
-        env:
-          - name: MAX_HEAP_SIZE
-            value: 2G
-          - name: HEAP_NEWSIZE
-            value: 200M
-          - name: CASSANDRA_SEEDS
-            value: "cassandra-set-0.sdap-cassandra"
-          - name: POD_IP
-            valueFrom:
-              fieldRef:
-                fieldPath: status.podIP
-        volumeMounts:
-        - name: cassandra-data
-          mountPath: /var/lib/cassandra
-
-  volumeClaimTemplates:
-  - metadata:
-      name: cassandra-data
-    spec:
-      accessModes: [ "ReadWriteOnce" ]
-      storageClassName: {{ .Values.storageClass }}
-      resources:
-        requests:
-          storage: {{ .Values.cassandra.storage }}
diff --git a/helm/templates/collection-manager.yml b/helm/templates/collection-manager.yml
index 6708b13..e281526 100644
--- a/helm/templates/collection-manager.yml
+++ b/helm/templates/collection-manager.yml
@@ -19,7 +19,7 @@ spec:
     spec:
       containers:
         - image: {{ .Values.ingestion.collectionManager.image }}
-          imagePullPolicy: Always
+          imagePullPolicy: IfNotPresent 
           name: collection-manager
           env:
             - name: RABBITMQ_USERNAME
@@ -30,9 +30,9 @@ spec:
               value: {{ .Values.rabbitmq.fullnameOverride }}
             - name: COLLECTIONS_PATH
               value: {{ include "nexus.collectionsConfig.mountPath" . }}/collections.yml
-            {{- if $history.url }}
+            {{- if $history.solrEnabled }}
             - name: HISTORY_URL
-              value: {{ .Values.ingestion.history.url}}
+              value: {{ include "nexus.urls.solr" . }}
             {{- else }}
             - name: HISTORY_PATH
               value: {{ include "nexus.history.mountPath" . }}
@@ -46,7 +46,7 @@ spec:
               memory: {{ .Values.ingestion.collectionManager.memory }}
           volumeMounts:
 {{ include "nexus.ingestion.dataVolumeMount" . | indent 12 }}
-            {{- if not $history.url }}
+            {{- if not $history.solrEnabled }}
             - name: history-volume
               mountPath: {{ include "nexus.history.mountPath" . }}
             {{- end }}
@@ -57,7 +57,7 @@ spec:
         - name: collections-config-volume
           configMap:
             name: {{ include "nexus.collectionsConfig.configmapName" . }}
-        {{- if not $history.url }}
+        {{- if not $history.solrEnabled }}
         - name: history-volume
           persistentVolumeClaim:
             claimName: history-volume-claim
diff --git a/helm/templates/config-operator-rbac.yml b/helm/templates/config-operator-rbac.yml
index 54064d5..6626b0b 100644
--- a/helm/templates/config-operator-rbac.yml
+++ b/helm/templates/config-operator-rbac.yml
@@ -6,7 +6,7 @@ metadata:
 ---
 
 apiVersion: rbac.authorization.k8s.io/v1
-kind: RoleBinding
+kind: ClusterRoleBinding
 metadata:
   name: config-operator-role-binding
 roleRef:
@@ -16,4 +16,6 @@ roleRef:
 subjects:
   - kind: ServiceAccount
     name: config-operator
+    namespace: {{ .Release.Namespace }}
+
 
diff --git a/helm/templates/config-operator.yml b/helm/templates/config-operator.yml
index 3f56f44..298095e 100644
--- a/helm/templates/config-operator.yml
+++ b/helm/templates/config-operator.yml
@@ -1,4 +1,5 @@
 {{ if .Values.ingestion.enabled }}
+{{ if not .Values.ingestion.collections.configMap }}
 apiVersion: apps/v1
 kind: Deployment
 metadata:
@@ -21,4 +22,4 @@ spec:
           image: {{ .Values.ingestion.configOperator.image }}
           imagePullPolicy: Always
 {{ end }}
-
+{{ end }}
diff --git a/helm/templates/granule-ingester.yml b/helm/templates/granule-ingester.yml
index 2ce03b6..bb616ad 100644
--- a/helm/templates/granule-ingester.yml
+++ b/helm/templates/granule-ingester.yml
@@ -17,6 +17,7 @@ spec:
     spec:
       containers:
         - image: {{ .Values.ingestion.granuleIngester.image }}
+          imagePullPolicy: IfNotPresent
           name: granule-ingester
           env:
             - name: RABBITMQ_USERNAME
@@ -26,9 +27,17 @@ spec:
             - name: RABBITMQ_HOST
               value: {{ .Values.rabbitmq.fullnameOverride }}
             - name: CASSANDRA_CONTACT_POINTS
-              value: sdap-cassandra
-            - name: SOLR_HOST_AND_PORT
-              value: http://sdap-solr:8983
+              value: {{ .Release.Name }}-cassandra
+            - name: CASSANDRA_USERNAME
+              value: cassandra
+            - name: CASSANDRA_PASSWORD
+              value: cassandra
+            - name: ZK_HOST_AND_PORT
+              value: {{ include "nexus.urls.zookeeper" . }}
+            {{ if .Values.ingestion.granuleIngester.maxConcurrency }}
+            - name: MAX_CONCURRENCY
+              value: "{{ .Values.ingestion.granuleIngester.maxConcurrency }}"
+            {{ end }}
           resources:
             requests:
               cpu: {{ .Values.ingestion.granuleIngester.cpu }}
diff --git a/helm/templates/history-pvc.yml b/helm/templates/history-pvc.yml
index 3ecabe9..ed18f76 100644
--- a/helm/templates/history-pvc.yml
+++ b/helm/templates/history-pvc.yml
@@ -2,6 +2,8 @@ apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
   name: history-volume-claim
+  annotations:
+    helm.sh/resource-policy: "keep"
 spec:
   accessModes:
     - ReadWriteOnce
diff --git a/helm/templates/init-cassandra-configmap.yml b/helm/templates/init-cassandra-configmap.yml
new file mode 100644
index 0000000..3e7ed3c
--- /dev/null
+++ b/helm/templates/init-cassandra-configmap.yml
@@ -0,0 +1,13 @@
+apiVersion: v1
+data:
+  init.cql: |
+    CREATE KEYSPACE IF NOT EXISTS nexustiles WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
+
+    CREATE TABLE IF NOT EXISTS nexustiles.sea_surface_temp  (
+    tile_id    	uuid PRIMARY KEY,
+    tile_blob  	blob
+    );
+kind: ConfigMap
+metadata:
+  name: init-cassandra
+  namespace: {{ .Release.Namespace }}
diff --git a/helm/templates/solr-create-collection.yml b/helm/templates/solr-create-collection.yml
new file mode 100644
index 0000000..7ecb2e3
--- /dev/null
+++ b/helm/templates/solr-create-collection.yml
@@ -0,0 +1,34 @@
+{{ if .Values.solrInitEnabled }}
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: solr-create-collection
+spec:
+  selector:
+    matchLabels:
+      app: solr-create-collection # has to match .spec.template.metadata.labels
+  replicas: 1
+  template:
+    metadata:
+      labels:
+        app: solr-create-collection
+    spec:
+      containers:
+      - name: solr-create-collection
+        imagePullPolicy: Always
+        image: nexusjpl/solr-cloud-init:1.0.1
+        resources:
+          requests:
+            memory: "0.5Gi"
+            cpu: "0.25"
+        env:
+        - name: MINIMUM_NODES
+          value: "{{ .Values.solr.replicaCount }}"
+        - name: SDAP_SOLR_URL
+          value: {{ include "nexus.urls.solr" . }}/solr/
+        - name: SDAP_ZK_SOLR
+          value: {{ include "nexus.urls.zookeeper" . }}/solr
+        - name: CREATE_COLLECTION_PARAMS
+          value: "name=nexustiles&numShards=$(MINIMUM_NODES)&waitForFinalState=true"
+      restartPolicy: Always
+{{ end }}
\ No newline at end of file
diff --git a/helm/templates/solr.yml b/helm/templates/solr.yml
deleted file mode 100644
index c8d0f9b..0000000
--- a/helm/templates/solr.yml
+++ /dev/null
@@ -1,129 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
-  name: sdap-solr
-spec:
-  ports:
-  - port: 8983
-  clusterIP: None
-  selector:
-    app: sdap-solr
-
----
-
-apiVersion: apps/v1
-kind: StatefulSet
-metadata:
-  name: solr-set
-spec:
-  selector:
-    matchLabels:
-      app: sdap-solr # has to match .spec.template.metadata.labels
-  serviceName: "sdap-solr"
-  replicas:  {{.Values.solr.replicas }} # by default is 1
-  podManagementPolicy: Parallel
-  template:
-    metadata:
-      labels:
-        app: sdap-solr # has to match .spec.selector.matchLabels
-    spec:
-      terminationGracePeriodSeconds: 10
-      {{ if .Values.solr.tolerations }}
-      tolerations:
-{{ .Values.solr.tolerations | toYaml | indent 6 }}
-      {{ end }}
-      {{ if .Values.solr.nodeSelector }}
-      nodeSelector:
-{{ .Values.solr.nodeSelector | toYaml | indent 8 }}
-      {{ end }}
-      affinity:
-        podAntiAffinity:
-          # Prefer spreading over all hosts
-          preferredDuringSchedulingIgnoredDuringExecution:
-          - weight: 100
-            podAffinityTerm:
-              labelSelector:
-                  matchExpressions:
-                    - key: "app"
-                      operator: In
-                      values:
-                      - sdap-solr
-              topologyKey: "kubernetes.io/hostname"
-      securityContext:
-        runAsUser: 8983
-        fsGroup: 8983
-      containers:
-      - name: solr-create-collection
-        imagePullPolicy: Always
-        image: nexusjpl/solr-cloud-init:1.0.0-rc1
-        resources:
-          requests:
-            memory: "1Gi"
-            cpu: "0.25"
-        env:
-        - name: MINIMUM_NODES
-          value: "2" # MINIMUM_NODES should be the same as spec.replicas
-        - name: SOLR_HOST
-          valueFrom:
-              fieldRef:
-                fieldPath: status.podIP
-        - name: SDAP_SOLR_URL
-          value: http://$(SOLR_HOST):8983/solr/
-        - name: SDAP_ZK_SOLR
-          value: "zk-hs:2181/solr"
-        - name: CREATE_COLLECTION_PARAMS
-          value: "name=nexustiles&collection.configName=nexustiles&numShards=$(MINIMUM_NODES)&waitForFinalState=true"
-      - name: solr-cloud
-        imagePullPolicy: Always
-        image: nexusjpl/solr-cloud:1.0.0-rc1
-        resources:
-          requests:
-            memory: {{ .Values.solr.requests.memory }}
-            cpu: {{ .Values.solr.requests.cpu }}
-          limits:
-            memory: {{ .Values.solr.limits.memory }}
-            cpu: {{ .Values.solr.limits.cpu }}
-        env:
-        - name: SOLR_HEAP
-          value: {{ .Values.solr.heap }}
-        - name: SOLR_HOST
-          valueFrom:
-              fieldRef:
-                fieldPath: status.podIP
-        - name: SDAP_ZK_SERVICE_HOST
-          value: "zk-hs"
-        ports:
-        - containerPort: 8983
-          name: http
-        volumeMounts:
-        - name: solr-data
-          mountPath: /opt/solr/server/solr/
-        readinessProbe:
-          exec:
-            command:
-            - solr
-            - healthcheck
-            - -c
-            - nexustiles
-            - -z
-            - zk-hs:2181/solr
-          initialDelaySeconds: 10
-          timeoutSeconds: 5
-        livenessProbe:
-          exec:
-            command:
-            - solr
-            - assert
-            - -s
-            - http://localhost:8983/solr/
-          initialDelaySeconds: 10
-          timeoutSeconds: 5
-  volumeClaimTemplates:
-  - metadata:
-      name: solr-data
-    spec:
-      accessModes: [ "ReadWriteOnce" ]
-      storageClassName: {{ .Values.storageClass }}
-      resources:
-        requests:
-          storage: {{ .Values.solr.storage }}
diff --git a/helm/templates/webapp.yml b/helm/templates/webapp.yml
index d77496f..e4e2adf 100644
--- a/helm/templates/webapp.yml
+++ b/helm/templates/webapp.yml
@@ -9,8 +9,13 @@ spec:
   pythonVersion: "2"
   mode: cluster
   image: {{ .Values.webapp.distributed.image }}
-  imagePullPolicy: Always 
+  imagePullPolicy: IfNotPresent
   mainApplicationFile: local:///incubator-sdap-nexus/analysis/webservice/webapp.py
+  arguments:
+    - --cassandra-host={{ .Release.Name }}-cassandra
+    - --cassandra-username=cassandra
+    - --cassandra-password=cassandra
+    - --solr-host={{ include "nexus.urls.solr" . }}
   sparkVersion: "2.4.4"
   restartPolicy:
     type: OnFailure
diff --git a/helm/templates/zookeeper.yml b/helm/templates/zookeeper.yml
deleted file mode 100644
index bdc3925..0000000
--- a/helm/templates/zookeeper.yml
+++ /dev/null
@@ -1,144 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
-  name: zk-hs
-  labels:
-    app: zk
-spec:
-  ports:
-  - port: 2888
-    name: server
-  - port: 3888
-    name: leader-election
-  clusterIP: None
-  selector:
-    app: zk
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: zk-cs
-  labels:
-    app: zk
-spec:
-  ports:
-  - port: 2181
-    name: client
-  selector:
-    app: zk
----
-apiVersion: policy/v1beta1
-kind: PodDisruptionBudget
-metadata:
-  name: zk-pdb
-spec:
-  selector:
-    matchLabels:
-      app: zk
-  maxUnavailable: 1
----
-apiVersion: apps/v1
-kind: StatefulSet
-metadata:
-  name: zk
-spec:
-  selector:
-    matchLabels:
-      app: zk
-  serviceName: zk-hs
-  replicas: {{ .Values.zookeeper.replicas }}
-  updateStrategy:
-    type: RollingUpdate
-  podManagementPolicy: Parallel
-  template:
-    metadata:
-      labels:
-        app: zk
-    spec:
-      {{ if .Values.zookeeper.tolerations }}
-      tolerations:
-{{ .Values.zookeeper.tolerations | toYaml | indent 6 }}
-      {{ end }}
-      {{ if .Values.zookeeper.nodeSelector }}
-      nodeSelector:
-{{ .Values.zookeeper.nodeSelector | toYaml | indent 8 }}
-      {{ end }}
-      affinity:
-        podAntiAffinity:
-          preferredDuringSchedulingIgnoredDuringExecution:
-          - weight: 100
-            podAffinityTerm:
-              labelSelector:
-                  matchExpressions:
-                    - key: "app"
-                      operator: In
-                      values:
-                      - zk
-              topologyKey: "kubernetes.io/hostname"
-      containers:
-      - name: kubernetes-zookeeper
-        imagePullPolicy: Always
-        image: "k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10"
-        resources:
-          requests:
-            memory: {{ .Values.zookeeper.memory }}
-            cpu: {{ .Values.zookeeper.cpu }}
-        ports:
-        - containerPort: 2181
-          name: client
-        - containerPort: 2888
-          name: server
-        - containerPort: 3888
-          name: leader-election
-        command:
-        - sh
-        - -c
-        - "start-zookeeper \
-          --servers={{ .Values.zookeeper.replicas }} \
-          --data_dir=/var/lib/zookeeper/data \
-          --data_log_dir=/var/lib/zookeeper/data/log \
-          --conf_dir=/opt/zookeeper/conf \
-          --client_port=2181 \
-          --election_port=3888 \
-          --server_port=2888 \
-          --tick_time=2000 \
-          --init_limit=10 \
-          --sync_limit=5 \
-          --heap=512M \
-          --max_client_cnxns=60 \
-          --snap_retain_count=3 \
-          --purge_interval=12 \
-          --max_session_timeout=40000 \
-          --min_session_timeout=4000 \
-          --log_level=INFO"
-        readinessProbe:
-          exec:
-            command:
-            - sh
-            - -c
-            - "zookeeper-ready 2181"
-          initialDelaySeconds: 10
-          timeoutSeconds: 5
-        livenessProbe:
-          exec:
-            command:
-            - sh
-            - -c
-            - "zookeeper-ready 2181"
-          initialDelaySeconds: 10
-          timeoutSeconds: 5
-        volumeMounts:
-        - name: zkdatadir
-          mountPath: /var/lib/zookeeper
-      securityContext:
-        runAsUser: 1000
-        fsGroup: 1000
-  volumeClaimTemplates:
-  - metadata:
-      name: zkdatadir
-    spec:
-      accessModes: [ "ReadWriteOnce" ]
-      storageClassName: {{ .Values.storageClass }}
-      resources:
-        requests:
-          storage: {{ .Values.zookeeper.storage }}
diff --git a/helm/values.yaml b/helm/values.yaml
index c012e6e..4c7bca4 100644
--- a/helm/values.yaml
+++ b/helm/values.yaml
@@ -31,7 +31,7 @@ ingestion:
 
   granuleIngester:
     replicas: 2
-    image: nexusjpl/granule-ingester:0.0.1
+    image: nexusjpl/granule-ingester:0.0.3
 
     ## cpu refers to both request and limit
     cpu: 1
@@ -40,7 +40,7 @@ ingestion:
     memory: 1Gi
 
   collectionManager:
-    image: nexusjpl/collection-manager:0.0.2
+    image: nexusjpl/collection-manager:0.0.3
 
     ## cpu refers to both request and limit
     cpu: 0.5
@@ -78,53 +78,55 @@ ingestion:
 
     ## Load the Collections Config file from a local path
     ## This is a future option that is not yet supported!
-    # localDir: /Users/edford/Desktop/collections.yml
+    #configMap: collections-config
 
     ## Load the Collections Config file from a git repository
     ## Until localDir is supported, this configuration is mandatory
     git:
 
-
       ## This should be an https repository url of the form https://github.com/username/repo.git
       url:
 
       branch: master
 
-      ## token is not yet supported!
       # token: someToken
 
   ## Where to store ingestion history
   ## Defaults to a using a history directory, stored on a PVC using the storageClass defined in this file above
   history:
     ## Store ingestion history in a solr database instead of a filesystem directory
-    # url: http://history-solr
+     solrEnabled: true
 
-cassandra:
-  replicas: 2
-  storage: 13Gi
-  requests:
-    cpu: 1
-    memory: 3Gi
-  limits:
-    cpu: 1
-    memory: 3Gi
+external:
+  solrHostAndPort:   
+  zookeeperHostAndPort:
 
-solr:
-  replicas: 2
-  storage: 10Gi
-  heap: 4g
-  requests:
-    memory: 5Gi
-    cpu: 1
-  limits:
-    memory: 5Gi
-    cpu: 1
+solrInitEnabled: true
 
-zookeeper:
-  replicas: 3
-  memory: 1Gi
-  cpu: 0.5
-  storage: 8Gi
+solr:
+  enabled: true
+  replicaCount: 3
+  volumeClaimTemplates:
+    storageClassName: hostpath
+    storageSize: 10Gi
+  resources:
+    requests:
+      memory: 2Gi
+      cpu: 1
+    limits:
+      memory: 2Gi
+      cpu: 1
+  zookeeper:
+    replicaCount: 3
+    persistence:
+      storageClass: hostpath
+    resources:
+      limits:
+        memory: 1Gi
+        cpu: 0.5
+      requests:
+        memory: 1Gi
+        cpu: 0.5
 
 ingressEnabled: false
 
@@ -150,10 +152,32 @@ nginx-ingress:
 rabbitmq:
   ## fullnameOverride sets the name of the RabbitMQ service
   ## with which the ingestion components will communicate.
+  enabled: true
+  persistence:
+    storageClass: hostpath
   fullnameOverride: rabbitmq
   replicaCount: 1
   auth:
     username: guest
     password: guest
   ingress:
-    enabled: true
\ No newline at end of file
+    enabled: true
+
+cassandra:
+  enabled: true
+  initDBConfigMap: init-cassandra
+  dbUser:
+    user: cassandra
+    password: cassandra
+  cluster:
+    replicaCount: 1
+  persistence:
+    storageClass: hostpath
+    size: 8Gi
+  resources:
+    requests:
+      cpu: 1
+      memory: 8Gi
+    limits:
+      cpu: 1
+      memory: 8Gi
diff --git a/tools/doms/README.md b/tools/doms/README.md
deleted file mode 100644
index c49fa4a..0000000
--- a/tools/doms/README.md
+++ /dev/null
@@ -1,66 +0,0 @@
-# doms_reader.py
-The functions in doms_reader.py read a DOMS netCDF file into memory, assemble a list of matches of satellite and in situ data, and optionally output the matches to a CSV file. Each matched pair contains one satellite data record and one in situ data record.
-
-The DOMS netCDF files hold satellite data and in situ data in different groups (`SatelliteData` and `InsituData`). The `matchIDs` netCDF variable contains pairs of IDs (matches) which reference a satellite data record and an in situ data record in their respective groups. These records have a many-to-many relationship; one satellite record may match to many in situ records, and one in situ record may match to many satellite records. The `assemble_matches` function assembles the individua [...]
-
-## Requirements
-This tool was developed and tested with Python 2.7.5 and 3.7.0a0.
-Imported packages:
-* argparse
-* netcdf4
-* sys
-* datetime
-* csv
-* collections
-* logging
-    
-
-## Functions
-### Function: `assemble_matches(filename)`
-Read a DOMS netCDF file into memory and return a list of matches from the file.
-
-#### Parameters 
-- `filename` (str): the DOMS netCDF file name.
-    
-#### Returns
-- `matches` (list): List of matches. 
-
-Each list element in `matches` is a dictionary organized as follows:
-    For match `m`, netCDF group `GROUP` ('SatelliteData' or 'InsituData'), and netCDF group variable `VARIABLE`:
-
-`matches[m][GROUP]['matchID']`: netCDF `MatchedRecords` dimension ID for the match
-`matches[m][GROUP]['GROUPID']`: GROUP netCDF `dim` dimension ID for the record
-`matches[m][GROUP][VARIABLE]`: variable value 
-
-For example, to access the timestamps of the satellite data and the in situ data of the first match in the list, along with the `MatchedRecords` dimension ID and the groups' `dim` dimension ID:
-```python
-matches[0]['SatelliteData']['time']
-matches[0]['InsituData']['time']
-matches[0]['SatelliteData']['matchID']
-matches[0]['SatelliteData']['SatelliteDataID']
-matches[0]['InsituData']['InsituDataID']
-```
-
-        
-### Function: `matches_to_csv(matches, csvfile)`
-Write the DOMS matches to a CSV file. Include a header of column names which are based on the group and variable names from the netCDF file.
-    
-#### Parameters:
-- `matches` (list): the list of dictionaries containing the DOMS matches as returned from the `assemble_matches` function.
-- `csvfile` (str): the name of the CSV output file.
-
-## Usage
-For example, to read some DOMS netCDF file called `doms_file.nc`:
-### Command line
-The main function for `doms_reader.py` takes one `filename` parameter (`doms_file.nc` argument in this example) for the DOMS netCDF file to read, calls the `assemble_matches` function, then calls the `matches_to_csv` function to write the matches to a CSV file `doms_matches.csv`.
-```
-python doms_reader.py doms_file.nc
-```
-```
-python3 doms_reader.py doms_file.nc
-```
-### Importing `assemble_matches`
-```python
-from doms_reader import assemble_matches
-matches = assemble_matches('doms_file.nc')
-```
diff --git a/tools/doms/doms_reader.py b/tools/doms/doms_reader.py
deleted file mode 100644
index c8229c4..0000000
--- a/tools/doms/doms_reader.py
+++ /dev/null
@@ -1,144 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import argparse
-from netCDF4 import Dataset, num2date
-import sys
-import datetime
-import csv
-from collections import OrderedDict
-import logging
-
-LOGGER = logging.getLogger("doms_reader")
-
-def assemble_matches(filename):
-    """
-    Read a DOMS netCDF file and return a list of matches.
-    
-    Parameters
-    ----------
-    filename : str
-        The DOMS netCDF file name.
-    
-    Returns
-    -------
-    matches : list
-        List of matches. Each list element is a dictionary.
-        For match m, netCDF group GROUP (SatelliteData or InsituData), and
-        group variable VARIABLE:
-        matches[m][GROUP]['matchID']: MatchedRecords dimension ID for the match
-        matches[m][GROUP]['GROUPID']: GROUP dim dimension ID for the record
-        matches[m][GROUP][VARIABLE]: variable value 
-    """
-    
-    try:
-        # Open the netCDF file
-        with Dataset(filename, 'r') as doms_nc:
-            # Check that the number of groups is consistent w/ the MatchedGroups
-            # dimension
-            assert len(doms_nc.groups) == doms_nc.dimensions['MatchedGroups'].size,\
-                ("Number of groups isn't the same as MatchedGroups dimension.")
-            
-            matches = []
-            matched_records = doms_nc.dimensions['MatchedRecords'].size
-            
-            # Loop through the match IDs to assemble matches
-            for match in range(0, matched_records):
-                match_dict = OrderedDict()
-                # Grab the data from each platform (group) in the match
-                for group_num, group in enumerate(doms_nc.groups):
-                    match_dict[group] = OrderedDict()
-                    match_dict[group]['matchID'] = match
-                    ID = doms_nc.variables['matchIDs'][match][group_num]
-                    match_dict[group][group + 'ID'] = ID
-                    for var in doms_nc.groups[group].variables.keys():
-                        match_dict[group][var] = doms_nc.groups[group][var][ID]
-                    
-                    # Create a UTC datetime field from timestamp
-                    dt = num2date(match_dict[group]['time'],
-                                  doms_nc.groups[group]['time'].units)
-                    match_dict[group]['datetime'] = dt
-                LOGGER.info(match_dict)
-                matches.append(match_dict)
-            
-            return matches
-    except (OSError, IOError) as err:
-        LOGGER.exception("Error reading netCDF file " + filename)
-        raise err
-    
-def matches_to_csv(matches, csvfile):
-    """
-    Write the DOMS matches to a CSV file. Include a header of column names
-    which are based on the group and variable names from the netCDF file.
-    
-    Parameters
-    ----------
-    matches : list
-        The list of dictionaries containing the DOMS matches as returned from
-        assemble_matches.      
-    csvfile : str
-        The name of the CSV output file.
-    """
-    # Create a header for the CSV. Column names are GROUP_VARIABLE or
-    # GROUP_GROUPID.
-    header = []
-    for key, value in matches[0].items():
-        for otherkey in value.keys():
-            header.append(key + "_" + otherkey)
-    
-    try:
-        # Write the CSV file
-        with open(csvfile, 'w') as output_file:
-            csv_writer = csv.writer(output_file)
-            csv_writer.writerow(header)
-            for match in matches:
-                row = []
-                for group, data in match.items():
-                    for value in data.values():
-                        row.append(value)
-                csv_writer.writerow(row)
-    except (OSError, IOError) as err:
-        LOGGER.exception("Error writing CSV file " + csvfile)
-        raise err
-
-if __name__ == '__main__':
-    """
-    Execution:
-        python doms_reader.py filename
-        OR
-        python3 doms_reader.py filename
-    """
-    logging.basicConfig(format='%(asctime)s %(levelname)-8s %(message)s',
-                    level=logging.INFO,
-                    datefmt='%Y-%m-%d %H:%M:%S')
-
-    p = argparse.ArgumentParser()
-    p.add_argument('filename', help='DOMS netCDF file to read')
-    args = p.parse_args()
-
-    doms_matches = assemble_matches(args.filename)
-
-    matches_to_csv(doms_matches, 'doms_matches.csv')
-    
-    
-    
-    
-    
-    
-    
-    
-
-    
-    
\ No newline at end of file
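
Note on the README removed above: it describes `matchIDs` as pairs of IDs that each point at one satellite record and one in situ record, with a many-to-many relationship between the two groups. A minimal sketch of that pairing follows, assuming plain Python lists in place of the netCDF groups. The record contents, the `match_ids` values and the `assemble_pairs` helper are hypothetical and are not taken from doms_reader.py.

```python
from collections import Counter

# Hypothetical records standing in for the SatelliteData and InsituData groups.
satellite_records = [{'time': 100.0}, {'time': 200.0}]
insitu_records = [{'time': 101.0}, {'time': 102.0}, {'time': 199.0}]

# Hypothetical matchIDs content: each row is (satellite index, in situ index).
match_ids = [(0, 0), (0, 1), (1, 2), (0, 2)]


def assemble_pairs(match_ids, satellite_records, insitu_records):
    """Resolve each ID pair into a (satellite record, in situ record) tuple."""
    return [(satellite_records[s], insitu_records[i]) for s, i in match_ids]


pairs = assemble_pairs(match_ids, satellite_records, insitu_records)

# Count how often each record is referenced to show the many-to-many relation.
satellite_use = Counter(s for s, _ in match_ids)
insitu_use = Counter(i for _, i in match_ids)

print(len(pairs), dict(satellite_use), dict(insitu_use))
```

Satellite record 0 takes part in three matches and in situ record 2 in two, so neither side can be treated as a unique key on its own.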


[incubator-sdap-nexus] 04/04: use new nexusjpl/solr image, update solr-create-collection image tag

Posted by ea...@apache.org.

eamonford pushed a commit to branch bug_fixes
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git

commit 48da69484b8326e81765ac4f4b70526f86f979ca
Author: Eamon Ford <ea...@gmail.com>
AuthorDate: Fri Sep 4 15:40:49 2020 -0700

    use new nexusjpl/solr image, update solr-create-collection image tag
---
 helm/templates/solr-create-collection.yml | 2 +-
 helm/values.yaml                          | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/helm/templates/solr-create-collection.yml b/helm/templates/solr-create-collection.yml
index 7ecb2e3..d95ef82 100644
--- a/helm/templates/solr-create-collection.yml
+++ b/helm/templates/solr-create-collection.yml
@@ -16,7 +16,7 @@ spec:
       containers:
       - name: solr-create-collection
         imagePullPolicy: Always
-        image: nexusjpl/solr-cloud-init:1.0.1
+        image: nexusjpl/solr-cloud-init:1.0.2
         resources:
           requests:
             memory: "0.5Gi"
diff --git a/helm/values.yaml b/helm/values.yaml
index 9d1280d..b2e1a91 100644
--- a/helm/values.yaml
+++ b/helm/values.yaml
@@ -105,6 +105,9 @@ solrInitEnabled: true
 
 solr:
   enabled: true
+  image:
+    repository: nexusjpl/solr
+    tag: 8.4.0
   replicaCount: 3
   volumeClaimTemplates:
     storageClassName: hostpath


[incubator-sdap-nexus] 02/04: revert doms

Posted by ea...@apache.org.

eamonford pushed a commit to branch bug_fixes
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git

commit d562fa80ca66673e58eb86d9fb7429d08337ab0a
Author: Eamon Ford <ea...@gmail.com>
AuthorDate: Mon Aug 10 12:02:32 2020 -0700

    revert doms
---
 .gitignore                                       |   1 +
 analysis/setup.py                                |   3 +-
 analysis/webservice/algorithms_spark/__init__.py |   6 +
 analysis/webservice/config/web.ini               |   2 +-
 data-access/nexustiles/dao/CassandraProxy.py     |   3 -
 data-access/tests/config/datastores.ini          |   9 ++
 tools/doms/README.md                             |  66 +++++++++++
 tools/doms/doms_reader.py                        | 144 +++++++++++++++++++++++
 8 files changed, 229 insertions(+), 5 deletions(-)

diff --git a/.gitignore b/.gitignore
index 4e4cf6e..3e29626 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,5 +2,6 @@
 *.code-workspace
 *.idea
 *.DS_Store
+analysis/webservice/algorithms/doms/domsconfig.ini
 data-access/nexustiles/config/datastores.ini
 venv/
diff --git a/analysis/setup.py b/analysis/setup.py
index 9a449ce..62a6891 100644
--- a/analysis/setup.py
+++ b/analysis/setup.py
@@ -50,7 +50,8 @@ setuptools.setup(
     #    'webservice.nexus_tornado.request.renderers'
     #],
     package_data={
-        'webservice': ['config/web.ini', 'config/algorithms.ini']
+        'webservice': ['config/web.ini', 'config/algorithms.ini'],
+        'webservice.algorithms.doms': ['domsconfig.ini.default']
     },
     data_files=[
         ('static', ['static/index.html'])
diff --git a/analysis/webservice/algorithms_spark/__init__.py b/analysis/webservice/algorithms_spark/__init__.py
index a25c8d5..d6ed83f 100644
--- a/analysis/webservice/algorithms_spark/__init__.py
+++ b/analysis/webservice/algorithms_spark/__init__.py
@@ -20,6 +20,7 @@ import ClimMapSpark
 import CorrMapSpark
 import DailyDifferenceAverageSpark
 import HofMoellerSpark
+import Matchup
 import MaximaMinimaSpark
 import NexusCalcSparkHandler
 import TimeAvgMapSpark
@@ -46,6 +47,11 @@ if module_exists("pyspark"):
         pass
 
     try:
+        import Matchup
+    except ImportError:
+        pass
+
+    try:
         import TimeAvgMapSpark
     except ImportError:
         pass
diff --git a/analysis/webservice/config/web.ini b/analysis/webservice/config/web.ini
index a1ecb2c..2644ade 100644
--- a/analysis/webservice/config/web.ini
+++ b/analysis/webservice/config/web.ini
@@ -14,4 +14,4 @@ static_enabled=true
 static_dir=static
 
 [modules]
-module_dirs=webservice.algorithms,webservice.algorithms_spark
\ No newline at end of file
+module_dirs=webservice.algorithms,webservice.algorithms_spark,webservice.algorithms.doms
\ No newline at end of file
diff --git a/data-access/nexustiles/dao/CassandraProxy.py b/data-access/nexustiles/dao/CassandraProxy.py
index 54a849b..a8a4e6e 100644
--- a/data-access/nexustiles/dao/CassandraProxy.py
+++ b/data-access/nexustiles/dao/CassandraProxy.py
@@ -161,9 +161,6 @@ class CassandraProxy(object):
         self.__cass_protocol_version = config.getint("cassandra", "protocol_version")
         self.__cass_dc_policy = config.get("cassandra", "dc_policy")
 
-        logger.info("Setting cassandra host to " + self.__cass_url)
-        logger.info("Setting cassandra username to " + self.__cass_username)
-
         try:
             self.__cass_port = config.getint("cassandra", "port")
         except NoOptionError:
diff --git a/data-access/tests/config/datastores.ini b/data-access/tests/config/datastores.ini
new file mode 100644
index 0000000..194760c
--- /dev/null
+++ b/data-access/tests/config/datastores.ini
@@ -0,0 +1,9 @@
+[cassandra]
+host=127.0.0.1
+keyspace=nexustiles
+local_datacenter=datacenter1
+protocol_version=3
+
+[solr]
+host=localhost:8983
+core=nexustiles
\ No newline at end of file
diff --git a/tools/doms/README.md b/tools/doms/README.md
new file mode 100644
index 0000000..c49fa4a
--- /dev/null
+++ b/tools/doms/README.md
@@ -0,0 +1,66 @@
+# doms_reader.py
+The functions in doms_reader.py read a DOMS netCDF file into memory, assemble a list of matches of satellite and in situ data, and optionally output the matches to a CSV file. Each matched pair contains one satellite data record and one in situ data record.
+
+The DOMS netCDF files hold satellite data and in situ data in different groups (`SatelliteData` and `InsituData`). The `matchIDs` netCDF variable contains pairs of IDs (matches) which reference a satellite data record and an in situ data record in their respective groups. These records have a many-to-many relationship; one satellite record may match to many in situ records, and one in situ record may match to many satellite records. The `assemble_matches` function assembles the individua [...]
+
+## Requirements
+This tool was developed and tested with Python 2.7.5 and 3.7.0a0.
+Imported packages:
+* argparse
+* netcdf4
+* sys
+* datetime
+* csv
+* collections
+* logging
+    
+
+## Functions
+### Function: `assemble_matches(filename)`
+Read a DOMS netCDF file into memory and return a list of matches from the file.
+
+#### Parameters 
+- `filename` (str): the DOMS netCDF file name.
+    
+#### Returns
+- `matches` (list): List of matches. 
+
+Each list element in `matches` is a dictionary organized as follows:
+    For match `m`, netCDF group `GROUP` ('SatelliteData' or 'InsituData'), and netCDF group variable `VARIABLE`:
+
+`matches[m][GROUP]['matchID']`: netCDF `MatchedRecords` dimension ID for the match
+`matches[m][GROUP]['GROUPID']`: GROUP netCDF `dim` dimension ID for the record
+`matches[m][GROUP][VARIABLE]`: variable value 
+
+For example, to access the timestamps of the satellite data and the in situ data of the first match in the list, along with the `MatchedRecords` dimension ID and the groups' `dim` dimension ID:
+```python
+matches[0]['SatelliteData']['time']
+matches[0]['InsituData']['time']
+matches[0]['SatelliteData']['matchID']
+matches[0]['SatelliteData']['SatelliteDataID']
+matches[0]['InsituData']['InsituDataID']
+```
+
+        
+### Function: `matches_to_csv(matches, csvfile)`
+Write the DOMS matches to a CSV file. Include a header of column names which are based on the group and variable names from the netCDF file.
+    
+#### Parameters:
+- `matches` (list): the list of dictionaries containing the DOMS matches as returned from the `assemble_matches` function.
+- `csvfile` (str): the name of the CSV output file.
+
+## Usage
+For example, to read some DOMS netCDF file called `doms_file.nc`:
+### Command line
+The main function for `doms_reader.py` takes one `filename` parameter (`doms_file.nc` argument in this example) for the DOMS netCDF file to read, calls the `assemble_matches` function, then calls the `matches_to_csv` function to write the matches to a CSV file `doms_matches.csv`.
+```
+python doms_reader.py doms_file.nc
+```
+```
+python3 doms_reader.py doms_file.nc
+```
+### Importing `assemble_matches`
+```python
+from doms_reader import assemble_matches
+matches = assemble_matches('doms_file.nc')
+```
diff --git a/tools/doms/doms_reader.py b/tools/doms/doms_reader.py
new file mode 100644
index 0000000..c8229c4
--- /dev/null
+++ b/tools/doms/doms_reader.py
@@ -0,0 +1,144 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+from netCDF4 import Dataset, num2date
+import sys
+import datetime
+import csv
+from collections import OrderedDict
+import logging
+
+LOGGER = logging.getLogger("doms_reader")
+
+def assemble_matches(filename):
+    """
+    Read a DOMS netCDF file and return a list of matches.
+    
+    Parameters
+    ----------
+    filename : str
+        The DOMS netCDF file name.
+    
+    Returns
+    -------
+    matches : list
+        List of matches. Each list element is a dictionary.
+        For match m, netCDF group GROUP (SatelliteData or InsituData), and
+        group variable VARIABLE:
+        matches[m][GROUP]['matchID']: MatchedRecords dimension ID for the match
+        matches[m][GROUP]['GROUPID']: GROUP dim dimension ID for the record
+        matches[m][GROUP][VARIABLE]: variable value 
+    """
+    
+    try:
+        # Open the netCDF file
+        with Dataset(filename, 'r') as doms_nc:
+            # Check that the number of groups is consistent w/ the MatchedGroups
+            # dimension
+            assert len(doms_nc.groups) == doms_nc.dimensions['MatchedGroups'].size,\
+                ("Number of groups isn't the same as MatchedGroups dimension.")
+            
+            matches = []
+            matched_records = doms_nc.dimensions['MatchedRecords'].size
+            
+            # Loop through the match IDs to assemble matches
+            for match in range(0, matched_records):
+                match_dict = OrderedDict()
+                # Grab the data from each platform (group) in the match
+                for group_num, group in enumerate(doms_nc.groups):
+                    match_dict[group] = OrderedDict()
+                    match_dict[group]['matchID'] = match
+                    ID = doms_nc.variables['matchIDs'][match][group_num]
+                    match_dict[group][group + 'ID'] = ID
+                    for var in doms_nc.groups[group].variables.keys():
+                        match_dict[group][var] = doms_nc.groups[group][var][ID]
+                    
+                    # Create a UTC datetime field from timestamp
+                    dt = num2date(match_dict[group]['time'],
+                                  doms_nc.groups[group]['time'].units)
+                    match_dict[group]['datetime'] = dt
+                LOGGER.info(match_dict)
+                matches.append(match_dict)
+            
+            return matches
+    except (OSError, IOError) as err:
+        LOGGER.exception("Error reading netCDF file " + filename)
+        raise err
+    
+def matches_to_csv(matches, csvfile):
+    """
+    Write the DOMS matches to a CSV file. Include a header of column names
+    which are based on the group and variable names from the netCDF file.
+    
+    Parameters
+    ----------
+    matches : list
+        The list of dictionaries containing the DOMS matches as returned from
+        assemble_matches.      
+    csvfile : str
+        The name of the CSV output file.
+    """
+    # Create a header for the CSV. Column names are GROUP_VARIABLE or
+    # GROUP_GROUPID.
+    header = []
+    for key, value in matches[0].items():
+        for otherkey in value.keys():
+            header.append(key + "_" + otherkey)
+    
+    try:
+        # Write the CSV file
+        with open(csvfile, 'w') as output_file:
+            csv_writer = csv.writer(output_file)
+            csv_writer.writerow(header)
+            for match in matches:
+                row = []
+                for group, data in match.items():
+                    for value in data.values():
+                        row.append(value)
+                csv_writer.writerow(row)
+    except (OSError, IOError) as err:
+        LOGGER.exception("Error writing CSV file " + csvfile)
+        raise err
+
+if __name__ == '__main__':
+    """
+    Execution:
+        python doms_reader.py filename
+        OR
+        python3 doms_reader.py filename
+    """
+    logging.basicConfig(format='%(asctime)s %(levelname)-8s %(message)s',
+                    level=logging.INFO,
+                    datefmt='%Y-%m-%d %H:%M:%S')
+
+    p = argparse.ArgumentParser()
+    p.add_argument('filename', help='DOMS netCDF file to read')
+    args = p.parse_args()
+
+    doms_matches = assemble_matches(args.filename)
+
+    matches_to_csv(doms_matches, 'doms_matches.csv')
+    
+    
+    
+    
+    
+    
+    
+    
+
+    
+    
\ No newline at end of file
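
Note on the files restored above: the README documents the `matches[m][GROUP][VARIABLE]` layout, and `matches_to_csv` builds column names of the form GROUP_VARIABLE from it. The sketch below illustrates that flattening without a netCDF file. The sample values and the `make_match` and `flatten` helpers are assumptions made for illustration and are not part of doms_reader.py.

```python
import csv
import io
from collections import OrderedDict


def make_match(match_id, sat_id, insitu_id):
    """Build one hypothetical match in the documented nested layout."""
    match = OrderedDict()
    match['SatelliteData'] = OrderedDict(
        [('matchID', match_id), ('SatelliteDataID', sat_id), ('time', 1000.0)])
    match['InsituData'] = OrderedDict(
        [('matchID', match_id), ('InsituDataID', insitu_id), ('time', 1002.5)])
    return match


def flatten(matches):
    """Return (header, rows) with GROUP_VARIABLE column names."""
    header = [group + "_" + key
              for group, data in matches[0].items()
              for key in data.keys()]
    rows = [[value for data in match.values() for value in data.values()]
            for match in matches]
    return header, rows


matches = [make_match(0, 4, 7), make_match(1, 4, 9)]
header, rows = flatten(matches)

# Write to an in-memory buffer so the sketch leaves no file behind.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
print(buffer.getvalue())
```

The header is taken from the first match only, as in the original function, so this layout assumes every match carries the same groups and variables in the same order.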


[incubator-sdap-nexus] 03/04: upgrade images

Posted by ea...@apache.org.

eamonford pushed a commit to branch bug_fixes
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git

commit 2679e641b5218ed8729c1c34b02e2fac44d2955e
Author: Eamon Ford <ea...@gmail.com>
AuthorDate: Tue Aug 11 11:59:57 2020 -0700

    upgrade images
---
 helm/values.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/helm/values.yaml b/helm/values.yaml
index 4c7bca4..9d1280d 100644
--- a/helm/values.yaml
+++ b/helm/values.yaml
@@ -31,7 +31,7 @@ ingestion:
 
   granuleIngester:
     replicas: 2
-    image: nexusjpl/granule-ingester:0.0.3
+    image: nexusjpl/granule-ingester:0.0.4
 
     ## cpu refers to both request and limit
     cpu: 1
@@ -40,7 +40,7 @@ ingestion:
     memory: 1Gi
 
   collectionManager:
-    image: nexusjpl/collection-manager:0.0.3
+    image: nexusjpl/collection-manager:0.0.4
 
     ## cpu refers to both request and limit
     cpu: 0.5